Using the Pickle Module in Python

The pickle module in Python is used to serialize (convert objects to a byte stream) and deserialize (convert a byte stream back into objects) Python objects. This is useful for saving complex data structures, such as lists, dictionaries, or custom objects, to files and loading them back later.

Why Use Pickle?

  • Saves Python objects persistently in a binary format.
  • Can load faster than parsing text-based formats (e.g., JSON), especially for large or nested Python objects, though the difference depends on the data.
  • Works with complex data structures (lists, dictionaries, objects).
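
As a rough illustration of the last point, the sketch below round-trips a structure containing a set, a tuple, and bytes, which the standard json module cannot encode directly. The sample data is made up for this example.

```python
import json
import pickle

# Made-up sample data mixing types that JSON has no direct encoding for
record = {"tags": {"a", "b"}, "point": (1, 2), "raw": b"\x00\x01"}

# json.dumps() raises TypeError for the set and bytes values
try:
    json.dumps(record)
    json_ok = True
except TypeError:
    json_ok = False

# pickle preserves the original Python types
restored = pickle.loads(pickle.dumps(record))

print("JSON could encode it:", json_ok)
print("Pickle round trip equal:", restored == record)
print("Tuple stayed a tuple:", type(restored["point"]) is tuple)
```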

Limitations:

  • Not human-readable (binary format).
  • Not cross-language compatible (specific to Python).
  • Security risk: Loading untrusted pickle files can execute malicious code.

📌 Importing Pickle Module:

import pickle

1. Serializing (Pickling) Python Objects

Pickling converts Python objects into a byte stream, which can be stored in a file.

Example: Saving a Dictionary

import pickle

# Sample dictionary
data = {"name": "Alice", "age": 25, "city": "New York"}

# Open file in binary write mode
with open("data.pkl", "wb") as file:
    pickle.dump(data, file)

print("Data successfully saved!")

Key Points:

  • open("filename.pkl", "wb") → Opens a file in write-binary mode.
  • pickle.dump(object, file) → Serializes and saves the object to a file.

2. Deserializing (Unpickling) Python Objects

Unpickling converts the byte stream back into a Python object.

Example: Loading the Pickled Dictionary

import pickle

# Open file in binary read mode
with open("data.pkl", "rb") as file:
    loaded_data = pickle.load(file)

print("Loaded Data:", loaded_data)

Key Points:

  • open("filename.pkl", "rb") → Opens a file in read-binary mode.
  • pickle.load(file) → Deserializes and loads the object.

Output:

Loaded Data: {'name': 'Alice', 'age': 25, 'city': 'New York'}
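
The save and load steps can also be wrapped in two small helpers. This is only a sketch: save_pickle and load_pickle are hypothetical names chosen for this example, and a temporary directory is used so nothing is left on disk.

```python
import pickle
import tempfile
from pathlib import Path

def save_pickle(obj, path):
    """Serialize obj to the file at path."""
    with open(path, "wb") as file:
        pickle.dump(obj, file)

def load_pickle(path):
    """Deserialize and return the object stored at path (trusted files only)."""
    with open(path, "rb") as file:
        return pickle.load(file)

data = {"name": "Alice", "age": 25, "city": "New York"}

# Work in a temporary directory so the example leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.pkl"
    save_pickle(data, target)
    loaded = load_pickle(target)

print(loaded == data)  # the contents are equal
print(loaded is data)  # but the loaded dictionary is a separate object
```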

3. Pickling Multiple Objects

Call pickle.dump() once per object to store several objects in a single file. They must be loaded back in the same order they were saved.

import pickle

# Define multiple objects
names = ["Alice", "Bob", "Charlie"]
scores = {"Alice": 90, "Bob": 85, "Charlie": 92}

# Open file in binary write mode
with open("multiple.pkl", "wb") as file:
    pickle.dump(names, file)
    pickle.dump(scores, file)

print("Multiple objects saved successfully!")

Loading Multiple Objects

import pickle

# Open file in binary read mode
with open("multiple.pkl", "rb") as file:
    names_loaded = pickle.load(file)
    scores_loaded = pickle.load(file)

print("Names:", names_loaded)
print("Scores:", scores_loaded)

Output:

Names: ['Alice', 'Bob', 'Charlie']
Scores: {'Alice': 90, 'Bob': 85, 'Charlie': 92}
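
When the number of stored objects is not known in advance, one possible approach is to keep calling pickle.load() until it raises EOFError. The helper name load_all is an assumption made for this sketch, and an in-memory buffer stands in for a real file.

```python
import io
import pickle

def load_all(file):
    """Yield every pickled object in file until the end is reached."""
    while True:
        try:
            yield pickle.load(file)
        except EOFError:
            return  # no more pickles in the stream

# An in-memory buffer stands in for a file opened in binary mode
buffer = io.BytesIO()
for obj in (["Alice", "Bob"], {"Alice": 90}, 42):
    pickle.dump(obj, buffer)

buffer.seek(0)  # rewind before reading
objects = list(load_all(buffer))
print(objects)
```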

4. Pickling Custom Objects

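Pickling can be used to save and load objects of user-defined classes. Pickle stores the object's attributes together with the name of its class, not the class code itself, so the class definition must be importable in the program that loads the file.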

Example: Saving and Loading a Class Object

import pickle

# Define a class
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person(name={self.name}, age={self.age})"

# Create an object
person1 = Person("Alice", 25)

# Save object
with open("person.pkl", "wb") as file:
    pickle.dump(person1, file)

# Load object
with open("person.pkl", "rb") as file:
    loaded_person = pickle.load(file)

print("Loaded Person:", loaded_person)

Output:

Loaded Person: Person(name=Alice, age=25)
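
One detail worth knowing: for an ordinary class with default pickling behaviour, unpickling restores the saved attributes without calling __init__ again. The sketch below illustrates this with a hypothetical Tracker class that counts its own constructor calls.

```python
import pickle

class Tracker:
    """Hypothetical class used only for this illustration."""
    init_calls = 0  # class attribute: counts how often __init__ runs

    def __init__(self, start):
        Tracker.init_calls += 1
        self.value = start

original = Tracker(10)
original.value += 5  # change the state after construction

# Round trip through bytes; the instance attributes are restored directly
restored = pickle.loads(pickle.dumps(original))

print("Restored value:", restored.value)
print("Constructor calls:", Tracker.init_calls)
```

The restored object carries the state it had at pickling time, so any setup that only happens inside __init__ (opening connections, registering callbacks) is not repeated on load.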

5. Using pickle.dumps() and pickle.loads()

Instead of saving to a file, we can convert objects to bytes in memory.

Example: Converting Objects to Bytes

import pickle

# Sample dictionary
data = {"fruit": "apple", "color": "red"}

# Serialize to bytes
byte_data = pickle.dumps(data)
print("Serialized Data:", byte_data)

# Deserialize from bytes
restored_data = pickle.loads(byte_data)
print("Restored Data:", restored_data)

Key Methods:

  • pickle.dumps(obj) → Converts object to a byte stream.
  • pickle.loads(byte_data) → Converts byte stream back to object.
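
One use of the in-memory functions is making an independent copy of a nested structure. The following is a minimal sketch with made-up data; for plain copying, copy.deepcopy() is the more usual tool.

```python
import pickle

config = {"servers": ["a", "b"], "limits": {"cpu": 2}}

# Serialize and immediately deserialize to get a fully separate copy
clone = pickle.loads(pickle.dumps(config))

# Changes to the copy do not affect the original
clone["servers"].append("c")
clone["limits"]["cpu"] = 8

print("Original:", config)
print("Clone:", clone)
```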

6. Handling Pickle Errors

Handle exceptions when working with pickle files.

import pickle

try:
    with open("nonexistent.pkl", "rb") as file:
        data = pickle.load(file)
except FileNotFoundError:
    print("Error: File not found!")
except EOFError:
    print("Error: File is empty or truncated!")
except pickle.UnpicklingError:
    print("Error: Corrupt or invalid pickle file!")

Common Errors:

  • FileNotFoundError → File does not exist.
  • EOFError → File is empty or ends before the pickle is complete.
  • pickle.UnpicklingError → Corrupt or non-pickle file.
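
The same error handling can be folded into a helper that returns a fallback value instead of printing. This is a sketch only: load_or_default is a hypothetical name, and it also catches EOFError, which pickle.load() raises for an empty file.

```python
import pickle
import tempfile
from pathlib import Path

def load_or_default(path, default=None):
    """Return the unpickled object at path, or default if it cannot be read."""
    try:
        with open(path, "rb") as file:
            return pickle.load(file)
    except FileNotFoundError:
        return default  # file does not exist
    except EOFError:
        return default  # file is empty or truncated
    except pickle.UnpicklingError:
        return default  # file is not a valid pickle

with tempfile.TemporaryDirectory() as tmp:
    missing = Path(tmp) / "missing.pkl"   # never created
    empty = Path(tmp) / "empty.pkl"
    empty.write_bytes(b"")                # zero-length file
    valid = Path(tmp) / "valid.pkl"
    valid.write_bytes(pickle.dumps([1, 2, 3]))

    from_missing = load_or_default(missing, default={})
    from_empty = load_or_default(empty, default={})
    from_valid = load_or_default(valid, default={})

print(from_missing, from_empty, from_valid)
```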

7. Pickling with Compression (gzip or bz2)

To save space, we can compress pickle files.

Example: Compressing Pickle Data with gzip

import pickle
import gzip

# Data to pickle
data = {"framework": "Django", "language": "Python"}

# Save with compression
with gzip.open("compressed.pkl.gz", "wb") as file:
    pickle.dump(data, file)

# Load compressed pickle
with gzip.open("compressed.pkl.gz", "rb") as file:
    decompressed_data = pickle.load(file)

print("Decompressed Data:", decompressed_data)

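gzip.open() returns a file object, so pickle.dump() and pickle.load() work with it unchanged. The bz2 module offers the same interface through bz2.open().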
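
To get a feel for how much space compression saves, the sizes can be compared in memory. This is a sketch with made-up, repetitive sample data; actual savings depend heavily on the content.

```python
import bz2
import gzip
import pickle

# Made-up, repetitive sample data
rows = [{"id": i, "status": "active"} for i in range(1000)]

raw = pickle.dumps(rows)
gz = gzip.compress(raw)
bz = bz2.compress(raw)

print("raw bytes: ", len(raw))
print("gzip bytes:", len(gz))
print("bz2 bytes: ", len(bz))

# Decompress first, then unpickle
restored = pickle.loads(gzip.decompress(gz))
print("Round trip equal:", restored == rows)
```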


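8. Choosing a Pickle Protocol

Python provides several pickle protocols (versions of the binary format). Newer protocols are generally more compact and faster, but older Python versions cannot read them. The protocol has no effect on security: unpickling untrusted data is unsafe under every protocol.

import pickle

# Sample data
data = {"status": "active", "users": 1000}

# Use the default protocol of the running Python version
with open("secure.pkl", "wb") as file:
    pickle.dump(data, file, protocol=pickle.DEFAULT_PROTOCOL)

pickle.DEFAULT_PROTOCOL is what pickle.dump() uses when no protocol is given, and it favors compatibility. pickle.HIGHEST_PROTOCOL selects the newest format the running interpreter supports.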
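
The protocol numbers available on a given interpreter can be inspected directly. The sketch below serializes the same sample dictionary with every supported protocol and records the size of each result; the numbers printed will vary by Python version.

```python
import pickle

data = {"status": "active", "users": 1000}

print("default protocol:", pickle.DEFAULT_PROTOCOL)
print("highest protocol:", pickle.HIGHEST_PROTOCOL)

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=protocol)
    # pickle.loads() detects the protocol automatically
    assert pickle.loads(blob) == data
    sizes[protocol] = len(blob)

print("bytes per protocol:", sizes)
```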


9. Avoiding Security Risks

Unpickling can execute arbitrary code, which makes pickle unsafe for data from untrusted sources.

Unsafe Example:

import pickle

# Malicious payload
malicious_code = b"cos\nsystem\n(S'echo Hacked!'\ntR."

# Executing untrusted pickle data
pickle.loads(malicious_code) # This will execute a system command!

Prevent Risks:

  • Never unpickle untrusted data.
  • Use alternative formats like JSON for text-based data.
  • Consider restricted unpickling: subclass pickle.Unpickler and override find_class() so that only an explicit allow-list of classes can be loaded.
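
Restricted unpickling can be sketched as follows, modelled on the pattern shown in the Python documentation. The allow-list and the names RestrictedUnpickler and restricted_loads are assumptions for this example. The approach narrows what a pickle may reference; it is not a guarantee of safety.

```python
import builtins
import io
import pickle

# Hypothetical allow-list for this sketch; adjust to the types you expect
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Permit only a few harmless builtins; refuse everything else
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers and allow-listed builtins load normally
print(restricted_loads(pickle.dumps([1, 2, range(3)])))

# A payload that references os.system is rejected before anything runs
try:
    restricted_loads(b"cos\nsystem\n(S'echo Hacked!'\ntR.")
    blocked = False
except pickle.UnpicklingError as error:
    blocked = True
    print("Blocked:", error)
```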

10. Summary Table

Operation             | Method                           | Example
----------------------|----------------------------------|-----------------------------------------------------------
Save to File          | pickle.dump(obj, file)           | pickle.dump(data, file)
Load from File        | pickle.load(file)                | data = pickle.load(file)
Convert to Bytes      | pickle.dumps(obj)                | byte_data = pickle.dumps(data)
Load from Bytes       | pickle.loads(bytes)              | data = pickle.loads(byte_data)
Save Multiple Objects | pickle.dump() multiple times     | pickle.dump(obj1, file); pickle.dump(obj2, file)
Compress Pickle File  | gzip.open("file.pkl.gz", "wb")   | pickle.dump(data, file)
Set Protocol          | protocol=pickle.DEFAULT_PROTOCOL | pickle.dump(data, file, protocol=pickle.DEFAULT_PROTOCOL)
