The pickle module in Python is used to serialize (convert objects to a byte stream) and deserialize (convert a byte stream back to objects) Python objects. This is useful for saving and loading complex data structures, such as lists, dictionaries, or custom objects, to and from files.
Why Use Pickle?
- Persists Python objects to disk in a binary format.
- Often loads faster than text-based formats (e.g., JSON) for complex Python objects, though the difference depends on the data and the protocol used.
- Works with complex data structures (lists, dictionaries, objects).
Limitations:
- Not human-readable (binary format).
- Not cross-language compatible (specific to Python).
- Security risk: Loading untrusted pickle files can execute malicious code.
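The trade-offs above can be seen in a small sketch that round-trips one value through both pickle and json. The sample value `record` is made up for illustration, and the in-memory functions pickle.dumps() and pickle.loads() are used only to keep the example short.

```python
import json
import pickle

# Hypothetical sample value containing a tuple and a set.
record = {"point": (1, 2), "tags": {"a", "b"}}

# pickle restores the original Python types.
restored = pickle.loads(pickle.dumps(record))
print(restored == record)        # True
print(type(restored["point"]))   # <class 'tuple'>

# json has no set type, so serializing the same value raises TypeError.
try:
    json.dumps(record)
    json_ok = True
except TypeError as exc:
    json_ok = False
    print("json could not serialize it:", exc)
```

The same property that makes pickle convenient here (it rebuilds arbitrary Python objects) is also the source of its security risk.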
📌 Importing Pickle Module:
import pickle
1. Serializing (Pickling) Python Objects
Pickling converts Python objects into a byte stream, which can be stored in a file.
Example: Saving a Dictionary
import pickle
# Sample dictionary
data = {"name": "Alice", "age": 25, "city": "New York"}
# Open file in binary write mode
with open("data.pkl", "wb") as file:
    pickle.dump(data, file)
print("Data successfully saved!")
Key Points:
- open("filename.pkl", "wb") → Opens a file in write-binary mode.
- pickle.dump(object, file) → Serializes and saves the object to a file.
2. Deserializing (Unpickling) Python Objects
Unpickling converts the byte stream back into a Python object.
Example: Loading the Pickled Dictionary
import pickle
# Open file in binary read mode
with open("data.pkl", "rb") as file:
    loaded_data = pickle.load(file)
print("Loaded Data:", loaded_data)
Key Points:
- open("filename.pkl", "rb") → Opens a file in read-binary mode.
- pickle.load(file) → Deserializes and loads the object.
Output:
Loaded Data: {'name': 'Alice', 'age': 25, 'city': 'New York'}
3. Pickling Multiple Objects
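Multiple objects can be saved in a single file by calling pickle.dump() once per object. They are read back in the same order they were written.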
import pickle
# Define multiple objects
names = ["Alice", "Bob", "Charlie"]
scores = {"Alice": 90, "Bob": 85, "Charlie": 92}
# Open file in binary write mode
with open("multiple.pkl", "wb") as file:
    pickle.dump(names, file)
    pickle.dump(scores, file)
print("Multiple objects saved successfully!")
Loading Multiple Objects
import pickle
# Open file in binary read mode
with open("multiple.pkl", "rb") as file:
    names_loaded = pickle.load(file)
    scores_loaded = pickle.load(file)
print("Names:", names_loaded)
print("Scores:", scores_loaded)
Output:
Names: ['Alice', 'Bob', 'Charlie']
Scores: {'Alice': 90, 'Bob': 85, 'Charlie': 92}
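When the number of objects in the file is not known in advance, one common approach is to keep calling pickle.load() until it raises EOFError. The sketch below assumes that approach; the helper name `load_all` and the temporary file path are hypothetical.

```python
import os
import pickle
import tempfile

def load_all(path):
    """Yield every object stored in a pickle file, in the order written."""
    with open(path, "rb") as file:
        while True:
            try:
                yield pickle.load(file)
            except EOFError:
                # pickle.load raises EOFError once the stream is exhausted.
                break

# Write three objects to a temporary file, then read them all back.
path = os.path.join(tempfile.mkdtemp(), "multiple.pkl")
with open(path, "wb") as file:
    for obj in (["Alice", "Bob"], {"Alice": 90}, 42):
        pickle.dump(obj, file)

objects = list(load_all(path))
print(objects)  # [['Alice', 'Bob'], {'Alice': 90}, 42]
```

An alternative design is to pickle a single list or tuple that contains all the objects, which avoids the loop entirely.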
4. Pickling Custom Objects
Pickling can be used to save and load objects of user-defined classes.
Example: Saving and Loading a Class Object
import pickle
# Define a class
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __repr__(self):
        return f"Person(name={self.name}, age={self.age})"
# Create an object
person1 = Person("Alice", 25)
# Save object
with open("person.pkl", "wb") as file:
    pickle.dump(person1, file)
# Load object
with open("person.pkl", "rb") as file:
    loaded_person = pickle.load(file)
print("Loaded Person:", loaded_person)
Output:
Loaded Person: Person(name=Alice, age=25)
5. Using pickle.dumps() and pickle.loads()
Instead of saving to a file, we can convert objects to bytes in memory.
Example: Converting Objects to Bytes
import pickle
# Sample dictionary
data = {"fruit": "apple", "color": "red"}
# Serialize to bytes
byte_data = pickle.dumps(data)
print("Serialized Data:", byte_data)
# Deserialize from bytes
restored_data = pickle.loads(byte_data)
print("Restored Data:", restored_data)
Key Methods:
- pickle.dumps(obj) → Converts object to a byte stream.
- pickle.loads(byte_data) → Converts byte stream back to object.
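Because the round trip through bytes rebuilds the object from scratch, the result is an independent copy. The following sketch illustrates this with a made-up dictionary named `original`.

```python
import pickle

# Hypothetical sample value with a nested list.
original = {"fruit": "apple", "sizes": [1, 2, 3]}

# Round-trip through bytes without touching the file system.
byte_data = pickle.dumps(original)
clone = pickle.loads(byte_data)

# Mutating the clone's nested list leaves the original untouched.
clone["sizes"].append(4)

print(type(byte_data))     # <class 'bytes'>
print(original["sizes"])   # [1, 2, 3]
print(clone["sizes"])      # [1, 2, 3, 4]
```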
6. Handling Pickle Errors
Handle exceptions when working with pickle files.
import pickle
try:
    with open("nonexistent.pkl", "rb") as file:
        data = pickle.load(file)
except FileNotFoundError:
    print("Error: File not found!")
except pickle.UnpicklingError:
    print("Error: Corrupt or invalid pickle file!")
Common Errors:
- FileNotFoundError → File does not exist.
- pickle.UnpicklingError → Corrupt or non-pickle file.
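An empty or truncated file raises EOFError rather than pickle.UnpicklingError, so it can be worth catching that as well. The sketch below wraps the pattern in a helper; the name `load_or_default` and the file names are hypothetical, and returning a default value is only one possible way to handle the failure.

```python
import os
import pickle
import tempfile

def load_or_default(path, default=None):
    """Return the unpickled object at path, or default if loading fails."""
    try:
        with open(path, "rb") as file:
            return pickle.load(file)
    except FileNotFoundError:
        return default
    except (pickle.UnpicklingError, EOFError):
        # EOFError covers empty or truncated files.
        return default

tmp_dir = tempfile.mkdtemp()

# Case 1: the file does not exist.
missing = os.path.join(tmp_dir, "missing.pkl")

# Case 2: the file exists but is empty.
empty = os.path.join(tmp_dir, "empty.pkl")
open(empty, "wb").close()

# Case 3: the file contains bytes that are not a pickle stream.
garbage = os.path.join(tmp_dir, "garbage.pkl")
with open(garbage, "wb") as file:
    file.write(b"\xff\xfe not a pickle")

results = [load_or_default(p, default={}) for p in (missing, empty, garbage)]
print(results)  # [{}, {}, {}]
```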
7. Pickling with Compression (gzip or bz2)
To save space, we can compress pickle files.
Example: Compressing Pickle Data with gzip
import pickle
import gzip
# Data to pickle
data = {"framework": "Django", "language": "Python"}
# Save with compression
with gzip.open("compressed.pkl.gz", "wb") as file:
    pickle.dump(data, file)
# Load compressed pickle
with gzip.open("compressed.pkl.gz", "rb") as file:
    decompressed_data = pickle.load(file)
print("Decompressed Data:", decompressed_data)
Use gzip.open() to compress pickle files.
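The bz2 module mentioned in the heading works the same way, since bz2.open() also returns a binary file object. This is a minimal sketch; the temporary path is chosen only for illustration.

```python
import bz2
import os
import pickle
import tempfile

data = {"framework": "Django", "language": "Python"}
path = os.path.join(tempfile.mkdtemp(), "compressed.pkl.bz2")

# Save with bz2 compression
with bz2.open(path, "wb") as file:
    pickle.dump(data, file)

# Load the compressed pickle
with bz2.open(path, "rb") as file:
    restored = pickle.load(file)

print(restored == data)  # True
```

Which compressor gives the better size or speed depends on the data, so it is worth measuring both on a representative sample.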
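8. Choosing a Pickle Protocol with pickle.DEFAULT_PROTOCOL
Python provides several pickle protocols. pickle.DEFAULT_PROTOCOL is the version the running interpreter uses when none is specified, and pickle.HIGHEST_PROTOCOL is the newest version it supports. Newer protocols are generally more efficient, but the choice of protocol does not make it safe to unpickle untrusted data.
import pickle
# Sample data
data = {"status": "active", "users": 1000}
# Pass the protocol explicitly (here, the interpreter's default)
with open("secure.pkl", "wb") as file:
    pickle.dump(data, file, protocol=pickle.DEFAULT_PROTOCOL)
Use protocol=pickle.DEFAULT_PROTOCOL to make the choice explicit. A file written with a newer protocol cannot be read by a Python version that predates that protocol.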
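To see which protocol versions the running interpreter offers, the sketch below prints pickle.DEFAULT_PROTOCOL and pickle.HIGHEST_PROTOCOL, then round-trips a sample value under every available protocol and records the payload size. The sample value `data` is illustrative, and the printed numbers vary by Python version.

```python
import pickle

# Hypothetical sample value, large enough for sizes to differ by protocol.
data = {"status": "active", "users": list(range(1000))}

print("Default protocol:", pickle.DEFAULT_PROTOCOL)
print("Highest protocol:", pickle.HIGHEST_PROTOCOL)

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol=protocol)
    # Every protocol must restore an equal object.
    if pickle.loads(payload) != data:
        raise ValueError(f"round trip failed for protocol {protocol}")
    sizes[protocol] = len(payload)

# Maps protocol number to payload size in bytes.
print(sizes)
```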
9. Avoiding Security Risks
Unpickling can execute arbitrary code, which makes pickle unsafe for data from untrusted sources.
Unsafe Example:
import pickle
# Malicious payload
malicious_code = b"cos\nsystem\n(S'echo Hacked!'\ntR."
# Executing untrusted pickle data
pickle.loads(malicious_code) # This will execute a system command!
Prevent Risks:
- Never unpickle untrusted data.
- Use alternative formats like JSON for text-based data.
- Consider restricted unpickling by subclassing pickle.Unpickler and overriding find_class() so that only an allowlist of classes can be loaded.
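A restricted unpickler can be sketched as follows, loosely following the pattern in the Python documentation for restricting globals. The allowlist `SAFE_BUILTINS` and the names `RestrictedUnpickler` and `restricted_loads` are assumptions made for this example. This reduces the attack surface but is not a guarantee of safety, so avoiding untrusted pickles remains the primary defense.

```python
import builtins
import io
import pickle

# Assumption for this sketch: only these builtins may be reconstructed.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global object.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Refuse everything else, including os.system.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but only allowlisted globals can be loaded."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers and allowlisted builtins load normally.
print(restricted_loads(pickle.dumps([1, 2, range(3)])))  # [1, 2, range(0, 3)]

# The malicious payload is rejected before any command runs.
malicious = b"cos\nsystem\n(S'echo Hacked!'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError as exc:
    blocked = True
    print("Blocked:", exc)
```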
10. Summary Table
Operation | Method | Example |
---|---|---|
Save to File | pickle.dump(obj, file) | pickle.dump(data, file) |
Load from File | pickle.load(file) | data = pickle.load(file) |
Convert to Bytes | pickle.dumps(obj) | byte_data = pickle.dumps(data) |
Load from Bytes | pickle.loads(bytes) | data = pickle.loads(byte_data) |
Save Multiple Objects | pickle.dump() multiple times | pickle.dump(obj1, file); pickle.dump(obj2, file) |
Compress Pickle File | gzip.open("file.pkl.gz", "wb") | pickle.dump(data, file) |
Set Protocol Explicitly | protocol=pickle.DEFAULT_PROTOCOL | pickle.dump(data, file, protocol=pickle.DEFAULT_PROTOCOL) |