In Python, data classes provide a simple way to define classes that primarily store data, reducing boilerplate code for common tasks like initializing objects, comparing instances, and printing readable representations.
Python’s built-in dataclasses module (introduced in Python 3.7) offers the @dataclass decorator, which automatically generates methods like:
- __init__() → Auto-generates an initializer
- __repr__() → Creates a readable string representation
- __eq__() → Enables equality comparison by field values
- __hash__() → (Optional) Makes instances hashable for use in sets and as dictionary keys
Why Use Data Classes?
- Less boilerplate → No need to manually define __init__(), __repr__(), etc.
- Readability → Code is cleaner and self-explanatory.
- Automatic comparison → Built-in __eq__() for instance comparisons.
- Immutable options → Can create frozen (read-only) objects.
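To make the boilerplate claim concrete, here is a minimal sketch that places a hand-written class next to its data class counterpart. The names ManualPoint and AutoPoint are hypothetical and chosen only for this illustration; the hand-written methods approximate what the decorator generates rather than reproduce it exactly.

```python
from dataclasses import dataclass


class ManualPoint:
    """Hand-written class (hypothetical example) with the usual boilerplate."""

    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        # Only compare against instances of exactly the same class.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class AutoPoint:
    """Same data, with the three methods generated by the decorator."""

    x: int
    y: int


manual_repr = repr(ManualPoint(1, 2))
auto_repr = repr(AutoPoint(1, 2))
print(manual_repr)  # ManualPoint(x=1, y=2)
print(auto_repr)    # AutoPoint(x=1, y=2)
print(ManualPoint(1, 2) == ManualPoint(1, 2))  # True
print(AutoPoint(1, 2) == AutoPoint(1, 2))      # True
```

Both classes behave the same from the caller's side; the data class simply needs far fewer lines.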
1. Creating a Simple Data Class
To define a data class, apply the @dataclass decorator to a class whose fields are declared as type-annotated class attributes.
Example: Basic Data Class
from dataclasses import dataclass
@dataclass
class Person:
    name: str
    age: int
p1 = Person("Alice", 30)
p2 = Person("Bob", 25)
print(p1)        # Output: Person(name='Alice', age=30)
print(p1 == p2)  # Output: False (compares field values automatically)
What happens here?
- @dataclass automatically generates __init__(), __repr__(), and __eq__().
- p1 == p2 works without manually defining __eq__().
2. Adding Default Values
You can give a field a default value by assigning to it with = in its declaration. Fields with defaults must come after fields without them.
Example: Default Values
from dataclasses import dataclass
@dataclass
class Employee:
    name: str
    salary: float = 50000  # Default value
e1 = Employee("John")
e2 = Employee("Doe", 70000)
print(e1)  # Output: Employee(name='John', salary=50000)
print(e2)  # Output: Employee(name='Doe', salary=70000)
Why use this?
- Allows optional attributes without explicitly passing values.
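One detail worth seeing in code: a field without a default may not follow a field that has one, because the generated __init__() would otherwise have a required parameter after an optional one. The sketch below assumes a hypothetical extra field, department, and a deliberately broken class, BadEmployee, to show both the keyword-argument usage and the error.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    name: str
    salary: float = 50000.0
    department: str = "Engineering"  # hypothetical extra field for illustration


# Keyword arguments let callers override any subset of the defaults.
e = Employee("John", department="Sales")
print(e)  # Employee(name='John', salary=50000.0, department='Sales')

# A field without a default may not follow one that has a default;
# the decorator raises TypeError while the class is being created.
ordering_error = ""
try:
    @dataclass
    class BadEmployee:
        salary: float = 50000.0
        name: str
except TypeError as exc:
    ordering_error = str(exc)
    print(ordering_error)
```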
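3. Using field() for Computed Default Values
For defaults that must be created fresh for each instance (e.g., a new list or dict, or the current time), use field(default_factory=...). The factory is a zero-argument callable that runs once per instance, at construction time.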
Example: Using default_factory
from dataclasses import dataclass, field
from datetime import datetime
@dataclass
class Task:
    title: str
    created_at: datetime = field(default_factory=datetime.now)  # Called once per instance
t1 = Task("Buy groceries")
t2 = Task("Read a book")
print(t1)  # Output (e.g.): Task(title='Buy groceries', created_at=datetime.datetime(2025, 3, 10, 12, 34, 56, 789012))
print(t2)  # Has its own timestamp, taken when t2 was created
Why use default_factory?
- Prevents shared-state bugs with mutable defaults ([], {}): each instance gets its own object. A data class rejects a plain list, dict, or set default with a ValueError.
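The shared-state point can be sketched with a list field. TodoList and Broken are hypothetical classes used only for this illustration; the first shows that default_factory=list gives every instance its own list, and the second shows the error raised for a literal [] default.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TodoList:
    owner: str
    items: List[str] = field(default_factory=list)  # new list per instance


a = TodoList("Alice")
b = TodoList("Bob")
a.items.append("Buy groceries")

print(a.items)             # ['Buy groceries']
print(b.items)             # []
print(a.items is b.items)  # False

# A literal mutable default is rejected when the class is created.
mutable_default_error = ""
try:
    @dataclass
    class Broken:
        items: list = []
except ValueError as exc:
    mutable_default_error = str(exc)
    print(mutable_default_error)
```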
4. Making Data Classes Immutable (frozen=True)
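Setting frozen=True makes instances read-only: assigning to or deleting a field raises dataclasses.FrozenInstanceError. The protection is shallow, so a mutable object stored in a field (such as a list) can still be changed in place.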
Example: Immutable Data Class
from dataclasses import dataclass
@dataclass(frozen=True)
class Point:
    x: int
    y: int
p = Point(5, 10)
print(p.x)  # Output: 5
p.x = 20    # Raises dataclasses.FrozenInstanceError: cannot assign to field 'x'
Why use frozen=True?
- Prevents accidental modifications.
- Makes instances hashable (with the default eq=True), so they can serve as dictionary keys or set members.
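As a rough sketch of how frozen instances are typically used, the example below catches the assignment error, builds a modified copy with dataclasses.replace(), and uses an instance as a dictionary key. The labels dictionary is a hypothetical stand-in for any lookup table.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int
    y: int


p = Point(5, 10)

frozen_error = ""
try:
    p.x = 20
except FrozenInstanceError as exc:
    frozen_error = str(exc)
    print(frozen_error)

# replace() builds a modified copy instead of mutating in place.
moved = replace(p, x=20)
print(moved)  # Point(x=20, y=10)
print(p)      # Point(x=5, y=10)

# Frozen instances are hashable, so an equal point finds the same entry.
labels = {Point(5, 10): "start"}
print(labels[p])  # start
```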
5. Controlling __repr__(), __eq__(), and __hash__()
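By default, @dataclass generates __repr__() and __eq__(); pass repr=False or eq=False to skip them. Whether __hash__() is generated depends on the eq and frozen settings, unless you force it with unsafe_hash=True.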
Example: Customizing @dataclass Behavior
from dataclasses import dataclass
@dataclass(repr=False, eq=False)
class Car:
    brand: str
    model: str
c1 = Car("Toyota", "Camry")
c2 = Car("Toyota", "Camry")
print(c1)        # Output: <__main__.Car object at 0x...> (No auto `__repr__()`)
print(c1 == c2)  # Output: False (No `__eq__()` generated, so == falls back to identity)
Why use this?
- Helps when customizing class behavior or avoiding unintended comparisons.
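The same control is available per field through field(repr=..., compare=...), which is often enough without disabling a whole method. The Account class below is a hypothetical example: it hides one field from the printed representation and excludes another from equality.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    username: str
    password: str = field(repr=False)                   # hidden from __repr__()
    login_count: int = field(default=0, compare=False)  # ignored by __eq__()


a1 = Account("alice", "s3cret", login_count=3)
a2 = Account("alice", "s3cret", login_count=7)

print(a1)        # Account(username='alice', login_count=3)
print(a1 == a2)  # True
```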
6. Sorting Data Classes with order=True
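Setting order=True generates the <, <=, >, and >= operators, which compare instances as tuples of their fields in declaration order. This is what makes sorted() and list.sort() work on them.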
Example: Sorting Objects
from dataclasses import dataclass
@dataclass(order=True)
class Student:
    grade: int
    name: str  # Fields are compared in declaration order: grade first, then name
s1 = Student(90, "Alice")
s2 = Student(85, "Bob")
print(s1 > s2)  # Output: True (90 > 85)
Why use order=True?
- Enables sorting of objects without manually defining comparison methods.
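A short sketch of sorting in practice, using a small hypothetical roster. Note how the tie between the two students with grade 90 is broken by the second field, name.

```python
from dataclasses import dataclass


@dataclass(order=True)
class Student:
    grade: int
    name: str


students = [Student(90, "Alice"), Student(85, "Bob"), Student(90, "Aaron")]

# Instances compare like the tuples (grade, name).
ranked = sorted(students)
names = [s.name for s in ranked]
print(names)  # ['Bob', 'Aaron', 'Alice']

top = max(students)
print(top)  # Student(grade=90, name='Alice')
```

If the sort key should differ from the field order, passing key= to sorted() is usually simpler than rearranging the fields.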
7. Converting Data Classes to Dictionaries (asdict())
The dataclasses.asdict() function converts a data class instance into a dictionary, recursing into nested data classes, lists, tuples, and dicts.
Example: Converting to Dictionary
from dataclasses import asdict, dataclass
@dataclass
class Product:
    name: str
    price: float
p = Product("Laptop", 999.99)
print(asdict(p))  # Output: {'name': 'Laptop', 'price': 999.99}
Why use this?
- Useful for JSON serialization or working with APIs.
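A minimal sketch of the JSON use case, assuming a hypothetical Order class that holds a list of Product instances. asdict() converts the nested objects too, and the round trip back from JSON is done by hand because the dataclasses module has no built-in loader.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Product:
    name: str
    price: float


@dataclass
class Order:
    order_id: int
    items: List[Product] = field(default_factory=list)


order = Order(1, [Product("Laptop", 999.99), Product("Mouse", 19.5)])

as_dict = asdict(order)  # nested Product instances become dicts as well
print(as_dict)
# {'order_id': 1, 'items': [{'name': 'Laptop', 'price': 999.99}, {'name': 'Mouse', 'price': 19.5}]}

payload = json.dumps(as_dict)

# Rebuild the objects manually from the decoded JSON.
data = json.loads(payload)
restored = Order(data["order_id"], [Product(**item) for item in data["items"]])
print(restored == order)  # True
```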
8. Inheriting from Data Classes
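Data classes support inheritance. The generated __init__() of a subclass takes the base class's fields first, followed by the subclass's own fields.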
Example: Inheriting from Another Data Class
from dataclasses import dataclass
@dataclass
class Animal:
    species: str
@dataclass
class Dog(Animal):
    breed: str
    age: int
d = Dog("Mammal", "Labrador", 5)
print(d)  # Output: Dog(species='Mammal', breed='Labrador', age=5)
Why use this?
- Allows hierarchical organization of related data.
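Field order matters once defaults enter an inheritance chain. In the sketch below, the legs field and its default are hypothetical additions: because the base class ends with a defaulted field, every field the subclass adds needs a default too. dataclasses.fields() shows the combined order.

```python
from dataclasses import dataclass, fields


@dataclass
class Animal:
    species: str
    legs: int = 4  # hypothetical field with a default


@dataclass
class Dog(Animal):
    # Needs a default because the inherited field `legs` already has one.
    breed: str = "Unknown"


field_names = [f.name for f in fields(Dog)]
print(field_names)  # ['species', 'legs', 'breed']

d = Dog("Canis familiaris", breed="Labrador")
print(d)                      # Dog(species='Canis familiaris', legs=4, breed='Labrador')
print(isinstance(d, Animal))  # True
```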
9. Data Classes vs Regular Classes
| Feature | Regular Class | Data Class |
|---|---|---|
| Boilerplate Code | More (manual __init__, __repr__, etc.) | Less (auto-generated) |
| Equality Comparison (==) | Based on object identity | Based on field values |
| Hashability (hash()) | Identity-based by default; value-based hashing requires a manual __hash__ | Value-based if frozen=True; unhashable with the default settings (eq=True, frozen=False) |
| Sorting | Requires defining __lt__, __gt__, etc. | Auto-generated (order=True) |
| Conversion to Dict | Manual implementation needed | asdict() available |
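The hashability row is the easiest one to get wrong, so here is a small sketch of the three cases. PlainPoint, MutablePoint, and FrozenPoint are hypothetical names used only for this comparison.

```python
from dataclasses import dataclass


class PlainPoint:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y


@dataclass
class MutablePoint:
    x: int
    y: int


@dataclass(frozen=True)
class FrozenPoint:
    x: int
    y: int


# Regular class: == and hash() are both based on identity.
print(PlainPoint(1, 2) == PlainPoint(1, 2))  # False

# Default data class: value equality, but __hash__ is set to None.
print(MutablePoint(1, 2) == MutablePoint(1, 2))  # True
print(MutablePoint.__hash__ is None)             # True
try:
    hash(MutablePoint(1, 2))
    mutable_is_hashable = True
except TypeError:
    mutable_is_hashable = False
print(mutable_is_hashable)  # False

# Frozen data class: value equality and a matching value-based hash.
unique_points = {FrozenPoint(1, 2), FrozenPoint(1, 2)}
print(len(unique_points))  # 1
```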
10. When to Use Data Classes?
Use dataclasses when:
- You need lightweight classes for storing data.
- You want automatic comparison and printing (__eq__(), __repr__()).
- You need immutability (frozen=True).
- You work with JSON, APIs, or databases (use asdict()).
Avoid dataclasses if:
- You need custom behavior beyond just storing data.
- You require complex inheritance and method overrides.
- You must support Python versions older than 3.7 (use namedtuple instead).
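For the last case, a rough sketch of the namedtuple alternative, reusing the Person example from earlier. It offers a readable representation and value equality, with two differences worth knowing: instances are always immutable, and they still compare equal to plain tuples.

```python
from collections import namedtuple

Person = namedtuple("Person", ["name", "age"])

p = Person("Alice", 30)
print(p)                         # Person(name='Alice', age=30)
print(p == Person("Alice", 30))  # True
print(p == ("Alice", 30))        # True: a namedtuple is still a tuple

# Fields cannot be reassigned; _replace() returns a modified copy.
older = p._replace(age=31)
print(older)  # Person(name='Alice', age=31)
```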
