Working with CSV Files in Python

Loading

CSV (Comma-Separated Values) is a widely used file format for storing tabular data. Python provides the csv module and other libraries like pandas for efficient CSV handling.

Why use CSV?

  • Lightweight and easy to process.
  • Compatible with Excel, databases, and data analysis tools.
  • Stores structured data in a readable format.

1. Reading CSV Files in Python

Method 1: Using the csv.reader()

import csv

with open("data.csv", "r") as file:
reader = csv.reader(file) # Create CSV reader object
for row in reader: # Iterate through rows
print(row) # Output each row as a list

By default, csv.reader() splits rows using commas (',').


Method 2: Reading CSV as a Dictionary (csv.DictReader)

This allows access to columns by column name instead of index.

import csv

with open("data.csv", "r") as file:
reader = csv.DictReader(file) # Read rows as dictionaries
for row in reader:
print(row["Name"], row["Age"]) # Access columns by name

Best for structured CSV files with headers.


2. Writing to CSV Files

Method 1: Using csv.writer() (List Format)

import csv

data = [
["Name", "Age", "City"],
["Alice", 30, "New York"],
["Bob", 25, "London"]
]

with open("output.csv", "w", newline="") as file:
writer = csv.writer(file) # Create CSV writer object
writer.writerows(data) # Write multiple rows

writerows() writes multiple rows at once.
newline="" prevents blank lines in Windows.


Method 2: Writing CSV as Dictionary (csv.DictWriter)

import csv

data = [
{"Name": "Alice", "Age": 30, "City": "New York"},
{"Name": "Bob", "Age": 25, "City": "London"}
]

with open("output.csv", "w", newline="") as file:
fieldnames = ["Name", "Age", "City"]
writer = csv.DictWriter(file, fieldnames=fieldnames)

writer.writeheader() # Write column headers
writer.writerows(data) # Write multiple dictionary rows

Best when working with structured dictionaries.
Ensures headers match column names.


3. Appending Data to a CSV File (a mode)

Use append mode ("a") to add new rows without overwriting existing data.

import csv

new_data = [["Charlie", 28, "Berlin"]]

with open("output.csv", "a", newline="") as file:
writer = csv.writer(file)
writer.writerows(new_data)

Useful when updating logs, reports, or large datasets.


4. Handling Different CSV Delimiters (Tab, Semicolon, etc.)

CSV files may use other delimiters like tabs (\t) or semicolons (;).

Reading CSV with Custom Delimiter

import csv

with open("data.csv", "r") as file:
reader = csv.reader(file, delimiter=";") # Specify delimiter
for row in reader:
print(row)

Writing CSV with Custom Delimiter

import csv

data = [["Name", "Age"], ["Alice", 30], ["Bob", 25]]

with open("output.csv", "w", newline="") as file:
writer = csv.writer(file, delimiter=";") # Use semicolon
writer.writerows(data)

Ensure delimiter matches the CSV format used in other tools (e.g., Excel, databases).


5. Reading Large CSV Files Efficiently

For large datasets, reading everything at once can consume memory. Instead, process line by line.

import csv

with open("large_file.csv", "r") as file:
reader = csv.reader(file)
next(reader) # Skip header row
for row in reader:
print(row) # Process each row

Use next(reader) to skip headers when needed.
Avoid loading all data into memory for large files.


6. Using Pandas for CSV Files (Recommended for Data Analysis)

pandas is a powerful library for handling structured data.

Reading CSV with Pandas

import pandas as pd

df = pd.read_csv("data.csv") # Load CSV into DataFrame
print(df.head()) # Display first 5 rows

Writing CSV with Pandas

pythonCopyEditdf.to_csv("output.csv", index=False)  # Save DataFrame to CSV

Pandas handles missing values, column names, and large datasets efficiently.


7. Checking If a CSV File Exists

Before reading or writing, check if the file exists.

import os

if os.path.exists("data.csv"):
print("File exists")
else:
print("File does not exist")

8. Exception Handling in CSV Operations

Use try-except to handle missing files or incorrect formats.

import csv

try:
with open("data.csv", "r") as file:
reader = csv.reader(file)
for row in reader:
print(row)
except FileNotFoundError:
print("Error: File not found!")
except csv.Error:
print("Error: Problem with CSV formatting!")

Ensures smooth execution even when files are missing or corrupted.


9. Summary Table

OperationFunctionExample
Read CSV (List)csv.reader()for row in reader:
Read CSV (Dict)csv.DictReader()row["Name"]
Write CSV (List)csv.writer()writer.writerows(data)
Write CSV (Dict)csv.DictWriter()writer.writerows(data)
Append to CSVcsv.writer()open("file.csv", "a")
Custom Delimiterdelimiter=";"csv.reader(file, delimiter=";")
Efficient Readingfor row in reader:Avoids memory overload
Using Pandaspd.read_csv()df.to_csv("file.csv")

Leave a Reply

Your email address will not be published. Required fields are marked *