CSV (Comma Separated Values) files are plain text files that contain data items separated by commas. Each line contains one record, and each record contains one or more fields, separated by commas. CSV files are typically used to store tabular data in plain text form.
The following example shows how 5 columns data will be stored in a CSV file. You can paste the data in a file and save it as 'demo.csv' for future showcases.
public_name, first_name, last_name, gender, age
Bob, Bobbie, Smith, male, 42
Suzie, Susanne, Taylor, female, 34
Greg, Gregory, Johnson, male, 24
Ruthie, Ruth, Miller, female, 45
Brad, Bradley, Jones, male, 56
Monica, Monique, Martin, female, 38
Carl, Carlos, Williams, male, 32
Lisa, Lisette, Richardson, female, 29
John, Johnny, Brown, male, 52
Stephanie, Stephanie, Anderson, female, 41
Separating the items with a comma is just a convention, you can also use any other separators such as a hyphen, space, tab, etc.
The csv Module
As the name tries to suggest, the csv module is used for working with data stored in the (CSV) format. The module provides the necessary tools to easily parse, create, modify, and export data stored in the CSV format. It is freely available in the standard library, we therefore just need to import it without performing any extra installations.
import csv
dir(csv)
//['Dialect', 'DictReader', 'DictWriter', 'Error', 'QUOTE_ALL', 'QUOTE_MINIMAL', 'QUOTE_NONE',
//'QUOTE_NONNUMERIC', 'Sniffer', 'StringIO', '_Dialect', '__all__', '__builtins__', '__cached__',
//'__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '__version__',
//'excel', 'excel_tab', 'field_size_limit', 'get_dialect', 'list_dialects', 're', 'reader',
//'register_dialect', 'unix_dialect', 'unregister_dialect', 'writer']
The builtin dir() function returns a list of the names and attributes defined by a module, class, or object. We used it above to list all the functions and attributes in the csv
module. As shown above, the module provides a set of useful functions to read, write, manipulate, and export data stored in the CSV format. We will look at a handful and the most essential of these functions.
csv.reader()
The csv.reader()
function is used to read in data from CSV files. The function reads in each line from the CSV file as a list of strings.
Syntax:
csv.reader(file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
The file
parameter specifies a file object with the csv data, we use the open() function with the read mode ( "r" ) to create the file object . The optional delimiter
parameter allows you to specify a character used to separate fields in the CSV file, the default is a comma. The quotechar parameter allows you to specify the quote character used to enclose fields in the CSV file. Finally, the quoting parameter allows you to specify whether or not quotes are stripped from fields.
The function returns a generator object whose each item is a list containing strings of the elements in a given line of the csv file.
Example, reading data from our 'demo.csv' file:
import csv
with open('demo.csv', 'r') as file:
data = csv.reader(file, quotechar='"')
for line in data:
print(line)
//['public_name', ' first_name', ' last_name', ' gender', ' age']
//['Bob', ' Bobbie', ' Smith', ' male', ' 42']
//['Suzie', ' Susanne', ' Taylor', ' female', ' 34']
//['Greg', ' Gregory', ' Johnson', ' male', ' 24']
//['Ruthie', ' Ruth', ' Miller', ' female', ' 45']
//['Brad', ' Bradley', ' Jones', ' male', ' 56']
//['Monica', ' Monique', ' Martin', ' female', ' 38']
//['Carl', ' Carlos', ' Williams', ' male', ' 32']
//['Lisa', ' Lisette', ' Richardson', ' female', ' 29']
//['John', ' Johnny', ' Brown', ' male', ' 52']
//['Stephanie', ' Stephanie', ' Anderson', ' female', ' 41']
We might want to perform some extra operations and manipulations to the data returned by the csv.reader()
function. For example in our 'demo.csv' file we would need to remove the leading space in the comma separated values, in this case we can use the string.strip() method. We might also want to change the age value from string to int
for each person/record.
import csv
with open('demo.csv', 'r') as file:
data = csv.reader(file)
header = next(data)
print(header)
print()
for line in data:
line = [l.strip() for l in line]
age = line[-1]
line[-1] = int(age)
print(line)
//['public_name', ' first_name', ' last_name', ' gender', ' age']
//
//['Bob', 'Bobbie', 'Smith', 'male', 42]
//['Suzie', 'Susanne', 'Taylor', 'female', 34]
//['Greg', 'Gregory', 'Johnson', 'male', 24]
//['Ruthie', 'Ruth', 'Miller', 'female', 45]
//['Brad', 'Bradley', 'Jones', 'male', 56]
//['Monica', 'Monique', 'Martin', 'female', 38]
//['Carl', 'Carlos', 'Williams', 'male', 32]
//['Lisa', 'Lisette', 'Richardson', 'female', 29]
//['John', 'Johnny', 'Brown', 'male', 52]
//['Stephanie', 'Stephanie', 'Anderson', 'female', 41]
csv.writer()
The csv.writer()
is used to write data from an iterable such as a list, a dict, etc to a csv file.
Syntax:
csv_writer = csv.writer(output_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
The required file
argument specifies the csv file in which we want to write, the file should be opened in write mode or append mode using the open()
function beforehand. The delimiter specifies the seperator between items in each record/row, the default is a comma.
The function returns a writer object.
The function's basic usage is as follows
import csv
# Open the CSV file in write mode
with open('filename.csv', 'w') as csvfile:
# Create a writer object
csv_writer = csv.writer(csvfile)
# Write rows to the CSV file
csv_writer.writerow(['Column 1', 'Column 2', 'Column 3']) # Write a single row
csv_writer.writerows([[1, 2, 3], [4, 5, 6], [7, 8, 9]]) # Write multiple rows
The writerow()
method of the writer
object is used to write a single row. The writerows
method, on the other hand, takes a two dimensional sequence and writes each of the inner sequence as a single row.
Example:
import csv
header = ["Country", " Capital", " Leader"]
data = [('USA', ' Washington', ' Joe Biden'), ('China', ' Beijing', ' Xi Jinping'), ('Ukraine', ' Kyiv', ' Volodymyr Zelenskyy'), ('India', ' New Delhi', ' Narendra Modi'), ('Japan', ' Tokyo', ' Fumio Kishida'), ('Italy', ' Rome', ' Sergio Mattarella'), ('France', ' Paris', ' Emmanuel Macron'), ('Philippines', ' Manilla', ' Bongbong Marcos'), ('Rwanda', 'Kigali', ' Paul Kagame')]
with open("demo.csv", 'w', newline = '') as file:
writer = csv.writer(file)
writer.writerow(header)
writer.writerows(data)
//26
csv.DictReader()
The csv.DictReader()
function is used to read CSV files as dictionaries. It uses the first row of the CSV file as keys and reads the data of the subsequent rows as the matching values. This makes it easy to extract or update specific values from the CSV files. It also helps in maintaining data consistency and integrity, since the data is already organized in a key value pair format.
Example:
import csv
with open('demo.csv', 'r') as file:
data = csv.DictReader(file)
for line in data:
print(line)
//{'Country': 'USA', ' Capital': ' Washington', ' Leader': ' Joe Biden'}
//{'Country': 'China', ' Capital': ' Beijing', ' Leader': ' Xi Jinping'}
//{'Country': 'Ukraine', ' Capital': ' Kyiv', ' Leader': ' Volodymyr Zelenskyy'}
//{'Country': 'India', ' Capital': ' New Delhi', ' Leader': ' Narendra Modi'}
//{'Country': 'Japan', ' Capital': ' Tokyo', ' Leader': ' Fumio Kishida'}
//{'Country': 'Italy', ' Capital': ' Rome', ' Leader': ' Sergio Mattarella'}
//{'Country': 'France', ' Capital': ' Paris', ' Leader': ' Emmanuel Macron'}
//{'Country': 'Philippines', ' Capital': ' Manilla', ' Leader': ' Bongbong Marcos'}
//{'Country': 'Rwanda', ' Capital': 'Kigali', ' Leader': ' Paul Kagame'}
csv.DictWriter()
The Python csv.DictWriter class is used to write dictionaries to a CSV file. It works similar to the csv.writer class except that it maps Python dictionaries into CSV rows. The csv.DictWriter instance can be used to write rows to the CSV file with the writerow() method, which takes a dictionary as an argument. Multiple rows in the CSV file can be written by looping over a sequence of dictionaries with the writerows() method.
Syntax:
writer = csv.DictWriter(file, field_names)
The familiar file
argument specifies the csv file object in which we want to write. The field_names
specifies names which will be used in writing the header row and match the dictionary keys when writing each row of data. In the following example, we append more values to our 'demo.csv' file using the DictWriter
function.
fields = ['Country', ' Capital', ' Leader']
D = {'Country':'Russia', ' Capital':' Moscow', ' Leader':' Vladimir Putin'}
more = [{'Country':'Qatar', ' Capital':' Doha', ' Leader':' Tamim bin Hamad Al Thani'}, {'Country':'Brazil', ' Capital':' Brasilia', ' Leader':' Luiz Inacio Lula da Silva'}, {'Country':'Finland', ' Capital':' Helsinki', ' Leader':' Sauli Niinisto'}]
with open('demo.csv', 'a', newline = '') as file:
writer = csv.DictWriter(file, fields)
writer.writerow(D)
writer.writerows(more)
Common uses of CSV files.
- Databases and Spreadsheets: CSV files can be used as a data source for many databases and spreadsheets, allowing users to easily store, manipulate, and share data.
- Data Visualization: CSV files can be used to create charts and graphs, allowing for an easy way to visualize data.
- Logging: CSV files can be used to log data for many applications and systems, as the format allows for a quick and easy way to store and access information.
- Batch Processing: CSV files can be used to automate large data sets and batch process them quickly.
- Data Exchange: CSV files are often used to transfer data between different applications and systems, as most applications can read and write CSV files.