Let's Run Jinyeah

[Python] Read/Write csv file (csv, Pandas)

jinyeah 2022. 5. 25. 15:36

What is the difference between CSV (.csv) and Excel (.xls)?

1. CSV

  • a simple plain text file format that uses a specific structure to arrange tabular data
  • a newline terminates each row and begins the next one
  • within a row, columns are separated by a comma
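To make this structure concrete, here is a minimal sketch that parses a small CSV string by hand. The sample text and variable names are made up for illustration, and the naive split only works when no field contains a comma, a quote or a newline.

```python
# Hypothetical sample data, not taken from a real file
csv_text = "name,branch,year\nNikhil,COE,2\nSanchit,COE,2\n"

# Each line is one row; each comma separates two columns
rows = [line.split(",") for line in csv_text.splitlines()]

for row in rows:
    print(row)
```

For real files the csv module is the safer choice, because it also handles quoted fields.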

2. Excel

  • spreadsheet software included in the Microsoft Office suite
  • an .xls file is a binary file that holds information about all the worksheets in a workbook

Read & Write csv file

[csv]

  • built-in module
  • doesn't provide the scientific data manipulation tools that Pandas does
  • write
    • use the file object's write methods: each line must end with "\n"
      • write(): write a single string
      • writelines(): write a sequence of strings (tuple, list); no newlines are added
    • use csv.writer()
      • writerow(): write one row (a sequence of fields) to the file as a line
      • writerows(): write multiple rows at once
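import csv

# write
info_list = [['Nikhil', 'COE', '2', '9.0'],
             ['Sanchit', 'COE', '2', '9.1'],
             ['Aditya', 'IT', '2', '9.3'],
             ['Sagar', 'SE', '1', '9.5'],
             ['Prateek', 'MCE', '3', '7.8'],
             ['Sahil', 'EP', '2', '9.1']]

output_path = "./<result_file>.csv"

# Option 1: use csv.writer()
# newline='' lets the csv module control line endings (avoids blank lines on Windows)
with open(output_path, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerows(info_list)

# Option 2: use write() and join the fields manually
# (overwrites the same file; safe only if no field contains a comma or quote)
with open(output_path, 'w') as csvfile:
    for row in info_list:
        csvfile.write(",".join(row) + "\n")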
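The difference between the two approaches shows up as soon as a field contains a comma. The sketch below is an illustration under stated assumptions: it writes to in-memory buffers instead of a file, and the name "Lee, Sanchit" is invented to force the problem.

```python
import csv
import io

rows = [["Nikhil", "COE", "2", "9.0"],
        ["Lee, Sanchit", "COE", "2", "9.1"]]  # hypothetical name containing a comma

# Manual approach: write() takes one string, writelines() takes an iterable of strings
manual = io.StringIO()
manual.write(",".join(rows[0]) + "\n")
manual.writelines(",".join(r) + "\n" for r in rows[1:])

# csv.writer: writerow() writes one row, writerows() writes several
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(rows[0])
writer.writerows(rows[1:])

print(manual.getvalue())  # the second line now looks like five columns
print(buffer.getvalue())  # the comma-containing field is wrapped in quotes
```

csv.writer quotes the field that contains the delimiter, so a reader can still recover four columns. The manual join produces a line that parses as five.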
  • read

dataset_file = "./<csv_file>.csv"
with open(dataset_file, newline='') as csv_file:
    # delimiter: character that separates fields in a line
    csv_reader = csv.reader(csv_file, delimiter=",")
    list_of_rows = list(csv_reader)
    for row in list_of_rows:
        print(row)
 
"""
output: 
['Nikhil', 'COE', '2', '9.0']
['Sanchit', 'COE', '2', '9.1']
['Aditya', 'IT', '2', '9.3']
['Sagar', 'SE', '1', '9.5']
['Prateek', 'MCE', '3', '7.8']
['Sahil', 'EP', '2', '9.1']
"""
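As the output shows, csv.reader returns every field as a string, including the numbers. A minimal sketch of converting them by hand follows. The in-memory buffer stands in for a file, and the names year and score are only my guess at what the last two columns mean.

```python
import csv
import io

# In-memory stand-in for a csv file, using two rows from the data above
csv_file = io.StringIO("Nikhil,COE,2,9.0\nSanchit,COE,2,9.1\n")

records = []
for name, branch, year, score in csv.reader(csv_file):
    # csv.reader yields str for every field; convert explicitly where needed
    records.append((name, branch, int(year), float(score)))

print(records)
```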

[pandas]

  • third-party library that must be installed separately (e.g. pip install pandas)
  • loads a csv file into a DataFrame, the structure pandas uses to manipulate data
  • provides various scientific data manipulation tools
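For comparison with the csv examples, here is a minimal pandas sketch of reading, manipulating and writing CSV data. It assumes pandas is installed. The header row (name, branch, year, score) is added for illustration and is not part of the original data, and an in-memory buffer is used in place of a file path.

```python
import io

import pandas as pd

# In-memory stand-in for a csv file; the column names are assumed
csv_text = (
    "name,branch,year,score\n"
    "Nikhil,COE,2,9.0\n"
    "Sanchit,COE,2,9.1\n"
    "Aditya,IT,2,9.3\n"
)

# read: parse the CSV into a DataFrame (numeric column types are inferred)
df = pd.read_csv(io.StringIO(csv_text))

# manipulate: filter rows and compute a statistic
high = df[df["score"] > 9.0]
mean_score = df["score"].mean()

# write: to_csv returns a string when no path is given; index=False omits the row index
out = high.to_csv(index=False)
print(out)
```

With a real file, pd.read_csv and DataFrame.to_csv accept a path in place of the buffer.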

< Summary >

In a data warehouse, Excel is preferable when a detailed, standardized schema specification is needed.

If you only need to read a csv file, use csv (pandas would add a dependency to the project).

If you handle large data and need various kinds of data manipulation, use pandas.

 

 

reference

https://www.guru99.com/excel-vs-csv.html

https://stackoverflow.com/questions/12377473/write-versus-writelines-and-concatenated-strings/12377575#12377575

https://www.delftstack.com/ko/howto/python/how-to-read-csv-to-list-in-python/

 
