Day 15: Working with CSV Files

Introduction to CSV (Comma Separated Values) files

CSV (Comma Separated Values) files are a popular plain-text format for storing and exchanging tabular data. As the name suggests, each line of the file represents one row, and the fields within a row are separated by commas.

CSV files are commonly used in many fields, including business, finance, and science. They are often used for storing and exchanging large amounts of data, such as customer lists, financial records, and scientific measurements.

In Python, you can read and write CSV files using the built-in csv module. This module provides a number of functions for working with CSV files, including functions for reading CSV files, writing CSV files, and parsing CSV data into Python objects.

Reading and writing CSV files using the csv module

The two basic tools in the csv module are csv.reader(), which turns the lines of a file into rows of Python strings, and csv.writer(), which turns Python sequences into CSV-formatted lines.

Here’s an example of how to use the csv module to read data from a CSV file:

import csv

with open('example.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)

    for row in reader:
        print(row)

In this example, we use the open() function to open a CSV file in read mode and pass the resulting file object to csv.reader() to create a reader object. Passing newline='' is recommended by the csv module documentation so that line breaks inside quoted fields are interpreted correctly. We can then iterate over the rows with a for loop; each row is a list of strings.
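To make the shape of the rows concrete, here is a minimal sketch that reads CSV text from memory instead of from disk. The sample data and the names sample, header, and records are made up for illustration; io.StringIO simply stands in for an open file.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for an open file.
sample = 'Name,City\n"Smith, Alice",Paris\nBob,Berlin\n'

rows = list(csv.reader(io.StringIO(sample)))

# The first row is usually the header; the rest are data rows.
header, *records = rows

print(header)
for record in records:
    print(record)
```

Note that the quoted field "Smith, Alice" comes back as a single value even though it contains a comma, which is the main reason to use the csv module rather than splitting each line on commas yourself.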

To write data to a CSV file using the csv module, you can use the csv.writer() function. Here’s an example:

import csv

with open('example.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)

    writer.writerow(['Name', 'Age', 'Gender'])
    writer.writerow(['Alice', '25', 'Female'])
    writer.writerow(['Bob', '30', 'Male'])

In this example, we use the open() function to open a CSV file in write mode and pass the resulting file object to csv.writer() to create a writer object. We can then use the writerow() method to write rows of data to the file. The newline='' argument stops Python from translating line endings itself; without it, the '\r\n' that the writer emits by default gains an extra '\r' on Windows, which shows up as a blank line between rows.

The csv module provides many more functions and options for working with CSV files, including support for custom delimiters, quoting rules, and more. For more information, see the official Python documentation on the csv module.
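As a rough illustration of those options, the sketch below writes semicolon-separated data with every field quoted, then reads it back. It uses an in-memory buffer rather than a real file, and the sample values are invented for the example.

```python
import csv
import io

# Write to an in-memory buffer; newline='' leaves line endings untouched.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter=';', quoting=csv.QUOTE_ALL)
writer.writerow(['Name', 'Age'])
writer.writerow(['Alice', 25])

text = buffer.getvalue()
print(text)

# The reader must be told about the same delimiter to parse the data back.
parsed = list(csv.reader(io.StringIO(text, newline=''), delimiter=';'))
print(parsed)
```

The number 25 is read back as the string '25': the csv module does not remember the original Python types.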

Handling errors with try-except blocks

File operations can fail for reasons outside your program's control, such as a missing file or malformed data, so error handling is a key part of reading and writing CSV files. Here is an example of how to use a try-except block to handle errors when reading a CSV file:

import csv

try:
    with open('my_csv_file.csv', 'r', newline='') as file:
        reader = csv.reader(file)
        for row in reader:
            print(row)
except FileNotFoundError:
    print("File not found.")
except csv.Error as e:
    print(f"CSV error: {e}")

In this example, the code attempts to open a CSV file and read its contents. If the file does not exist, a FileNotFoundError exception is raised and caught by the first except block, which prints an error message. If the csv module cannot parse the data (for example, because a field exceeds the module's field size limit), a csv.Error exception is raised and caught by the second except block, which also prints an error message.

By using try-except blocks, you can gracefully handle errors that might occur during file handling operations.
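One way to reuse this pattern is to wrap it in a small function. The sketch below is only one possible design: the function name read_rows and the choice to return an empty list on failure are assumptions made for this example, not part of the csv module.

```python
import csv

def read_rows(path):
    """Return all rows of the CSV file at path, or an empty list on failure."""
    try:
        with open(path, 'r', newline='') as file:
            return list(csv.reader(file))
    except FileNotFoundError:
        print(f"File not found: {path}")
    except csv.Error as e:
        print(f"CSV error in {path}: {e}")
    # Reached only if one of the except blocks handled an error.
    return []

# A file name that is assumed not to exist, to exercise the error path.
rows = read_rows('no_such_file_for_this_example.csv')
print(rows)
```

Returning an empty list keeps the calling code simple, but it also makes a failed read look the same as an empty file. If that distinction matters in your program, letting the exception propagate may be the better choice.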

Parsing CSV files and processing data

When working with CSV files, you usually need to parse the data in order to extract the information you need. The csv.reader object does this parsing for you: it yields each row of the file as a list of strings, one string per column.

Here’s an example of how to read a CSV file and parse the data:

import csv

with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        print(row)

In this example, the csv.reader object is used to iterate through each row of the CSV file and print it to the console.

To process the data, you can access individual columns by their index. For example, to print only the first and last columns of each row, you can modify the code as follows:

import csv

with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row in reader:
        if row:  # skip blank lines, which csv.reader returns as empty lists
            print(row[0], row[-1])

This will print the first and last columns of each row to the console.

You can also use the csv.DictReader object to read CSV files and create a dictionary for each row. By default, the keys are the column names taken from the first row of the file, and the values are the cell values. Here’s an example:

import csv

with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row['Name'], row['Age'])

In this example, the keys are the column names ‘Name’ and ‘Age’, and the values are the cell values for each row. You can access the values for other columns by their names as well.
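To see what DictReader actually produces, the following sketch reads a small in-memory sample. The sample data and the variable names columns and records are invented for this illustration; io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical sample data with a header row.
sample = "Name,Age,Gender\nAlice,25,Female\nBob,30,Male\n"

reader = csv.DictReader(io.StringIO(sample))

# fieldnames holds the column names read from the header row.
columns = reader.fieldnames
print(columns)

# Each row maps column names to cell values; all values are strings.
records = [dict(row) for row in reader]
for record in records:
    print(record)
```

Because the header row is consumed to build the keys, it does not appear among the data rows.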

Once you have parsed the data, you can perform various operations on it, such as sorting, filtering, or calculating aggregates. For example, you can sort the data by a particular column by using the sorted() function and specifying a key function that returns the value for the column. Here’s an example:

import csv

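with open('data.csv', 'r', newline='') as file:
    reader = csv.DictReader(file)
    # csv values are strings, so convert to int to sort numerically
    data = sorted(reader, key=lambda row: int(row['Age']))
    for row in data:
        print(row['Name'], row['Age'])

This will print the data sorted by the ‘Age’ column. The int() conversion matters because every value read from a CSV file is a string, and sorting strings would place '100' before '25'.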
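Filtering and aggregating follow the same pattern as sorting: convert the strings to numbers first, then use ordinary Python. The sketch below uses made-up sample data and an arbitrary threshold of 18 purely for illustration.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for an open file.
sample = "Name,Age\nAlice,25\nBob,30\nCarol,9\n"
rows = list(csv.DictReader(io.StringIO(sample)))

# Filtering: keep only the rows whose Age is at least 18.
adults = [row for row in rows if int(row['Age']) >= 18]

# Aggregating: compute the average of the Age column.
ages = [int(row['Age']) for row in rows]
average_age = sum(ages) / len(ages)

print([row['Name'] for row in adults])
print(f"Average age: {average_age:.1f}")
```

This sketch assumes every Age cell holds a valid integer and that the file has at least one data row; real data may need extra checks for empty or malformed values.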

Overall, parsing CSV files is a common task in data processing, and Python provides a powerful and easy-to-use set of tools for working with CSV data.

Exercise:

Create a Python program that reads a CSV file, calculates the average of a numeric column, and writes the result to a new file.