Garbage collection (GC) in Python automatically frees up memory when objects are no longer needed. However, in some cases, memory is not released as expected. This can cause high memory usage, performance issues, or memory leaks in applications.
1. Why Is Garbage Collection Not Freeing Memory?
Python’s garbage collector works automatically, but certain factors prevent memory from being freed, such as:
- Reference Cycles → Objects referencing each other cannot be freed by reference counting alone.
- Global or Long-Lived Objects → Variables stored in global scope or cache remain in memory.
- Circular References in Classes → Objects in self-referencing classes create cycles.
- Memory Fragmentation → Allocated memory is not efficiently reused.
- External Libraries Holding Memory → Some C extensions (NumPy, Pandas, TensorFlow) allocate memory outside Python's heap and may not return it to the OS immediately.
- Objects Not Out of Scope → If objects remain referenced, Python does not delete them.
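Before forcing collections, it usually pays to find out which allocations are still alive. The standard-library `tracemalloc` module shows where surviving memory was allocated; a minimal sketch (the `leaky` list is just a stand-in for objects your program keeps referencing):

```python
import tracemalloc

tracemalloc.start()

leaky = [list(range(1000)) for _ in range(1000)]  # stand-in for objects that stay referenced

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:5]:
    print(stat)  # file, line, and size of the largest surviving allocations
```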
2. How to Force Garbage Collection?
Python provides the `gc` module to trigger garbage collection manually.
```python
import gc

gc.collect()  # Forces a collection of all generations
```
However, forcing GC is not always effective if references to objects still exist.
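The return value of `gc.collect()` and the module's debug flags are useful for checking whether a forced collection actually found anything. A minimal sketch using the standard `gc` API:

```python
import gc

unreachable = gc.collect()      # returns the number of unreachable objects found
print(f"collector found {unreachable} unreachable objects")

gc.set_debug(gc.DEBUG_SAVEALL)  # keep collected objects in gc.garbage for inspection
gc.collect()
print(gc.garbage[:5])           # peek at what the collector had to clean up
gc.set_debug(0)                 # turn debug flags off again
gc.garbage.clear()              # drop the saved objects so they can finally be freed
```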
3. Common Causes & Solutions
Cause 1: Reference Cycles (Circular References)
If two objects reference each other, Python's reference counting alone can never free them. Only the cyclic garbage collector can reclaim such a cycle, and until it runs the memory stays allocated.
Example (Circular Reference Keeping Objects Alive)

```python
import gc

class A:
    def __init__(self):
        self.ref = None

a1 = A()
a2 = A()
a1.ref = a2  # a1 -> a2
a2.ref = a1  # a2 -> a1: circular reference

del a1, a2    # deleting the names does not free the objects; they still reference each other
gc.collect()  # the cyclic collector reclaims the cycle; reference counting alone never would
```
Solution: Use `weakref` to Break Circular References
```python
import gc
import weakref

class A:
    def __init__(self):
        self.ref = None

a1 = A()
a2 = A()
a1.ref = weakref.ref(a2)  # weak reference: does not keep a2 alive
a2.ref = weakref.ref(a1)

del a1, a2    # no strong cycle exists, so reference counting frees both objects immediately
gc.collect()
```
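Note that a `weakref.ref` has to be called to get the original object back, and the call returns `None` once the target has been collected. A small sketch:

```python
import weakref

class A:
    pass

a = A()
r = weakref.ref(a)

print(r())  # <__main__.A object at 0x...> while 'a' is still alive
del a       # last strong reference is gone; the object is freed
print(r())  # None: the weak reference no longer resolves
```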
Cause 2: Objects Still in Scope
Even after calling `gc.collect()`, objects won't be freed if they are still referenced somewhere in the program.
Example (Object Still Referenced)

```python
import gc

class Data:
    pass

obj = Data()
another_ref = obj  # a second name keeps the object alive

del obj            # only removes the name 'obj'
gc.collect()       # memory is not freed: 'another_ref' still points to the object
```
Solution: Remove All References
```python
another_ref = None  # drop the last reference; reference counting frees the object immediately
gc.collect()
```
Alternatively, use `del`:
```python
del another_ref
gc.collect()
```
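If you are not sure whether something still points at an object, `sys.getrefcount` gives a quick (if rough) signal. Keep in mind it always reports one extra reference for the temporary argument passed to the call itself:

```python
import sys

class Data:
    pass

obj = Data()
print(sys.getrefcount(obj))  # 2: the name 'obj' plus the temporary argument reference

alias = obj
print(sys.getrefcount(obj))  # 3: 'obj', 'alias', plus the argument reference
```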
Cause 3: Global Variables Preventing GC
A module-level (global) variable keeps its object alive for the entire life of the program unless the binding is explicitly removed. Inside a function, `del data` only removes the global binding if you first declare `global data`.
Example (Global Variable Persists Until Deleted)

```python
import gc

data = [1, 2, 3]  # global variable: stays alive as long as the module holds the binding

def clear_data():
    global data
    del data      # removes the global binding; the list can now be reclaimed
    gc.collect()

clear_data()
print(data)       # NameError: name 'data' is not defined
```
Solution: Use `globals()` to Remove the Binding Explicitly
```python
del globals()['data']
gc.collect()
```
This removes the name from the module's namespace; once no other references remain, the object itself is reclaimed.
Cause 4: Objects Stored in Caches or Modules
If an object is stored in a cache, a list, or a module-level attribute, it stays referenced and will not be freed.
Example (Object Stored in a List)

```python
import gc

cache = []
obj = [1, 2, 3]
cache.append(obj)  # 'cache' now holds a reference to the object

del obj
gc.collect()       # memory is NOT freed: 'cache' still references the object
```
Solution: Clear the Cache

```python
cache.clear()  # removes the remaining reference
gc.collect()   # now the memory can be freed
```
If you are using an LRU cache (`functools.lru_cache`):
```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_function(x):
    return x * 2

expensive_function(10)
expensive_function.cache_clear()  # clears stored results so cached arguments and values can be freed
```
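To check how much the cache is actually holding before you clear it, `cache_info()` reports hits, misses, and the current size:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_function(x):
    return x * 2

expensive_function(10)
print(expensive_function.cache_info())  # CacheInfo(hits=0, misses=1, maxsize=None, currsize=1)

expensive_function.cache_clear()
print(expensive_function.cache_info())  # CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
```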
Cause 5: Memory Fragmentation
Memory fragmentation happens when many small objects are scattered across the heap, so even after they are freed the allocator cannot return whole pages to the operating system.
Solution: Use `malloc_trim()` (Linux Only)
```python
import ctypes

ctypes.CDLL("libc.so.6").malloc_trim(0)  # ask glibc to return free heap pages to the OS
```
On Windows, restarting the program is the best option.
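Because `malloc_trim` is specific to glibc, it is worth guarding the call so the same code still runs elsewhere. A minimal sketch (the helper name `trim_heap` is just illustrative), assuming a glibc-based Linux system:

```python
import ctypes
import platform

def trim_heap():
    """Ask glibc to return free heap pages to the OS; do nothing on other platforms."""
    if platform.system() != "Linux":
        return
    try:
        libc = ctypes.CDLL("libc.so.6")
        libc.malloc_trim(0)
    except (OSError, AttributeError):
        pass  # no glibc (e.g. musl-based distros), so malloc_trim is unavailable

trim_heap()
```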
Cause 6: Large Objects in NumPy, Pandas, or TensorFlow
Some libraries allocate memory in native (C/C++) code outside Python's garbage collector, so `gc.collect()` alone may not bring the process's memory usage back down.
Example (NumPy Memory Not Returned to the OS)

```python
import gc
import numpy as np

arr = np.ones((10000, 10000))  # ~800 MB NumPy array
del arr                        # NumPy frees the buffer, but the C allocator may keep the pages
gc.collect()                   # resident memory reported by the OS can stay high
```
Solution: Use `del` + `gc.collect()` + `empty_cache()` (for GPU libraries)
```python
import gc
import torch

tensor = torch.ones((10000, 10000), device="cuda")  # large tensor in GPU memory
del tensor
gc.collect()
torch.cuda.empty_cache()  # releases cached GPU memory held by PyTorch's allocator
```
For Pandas:

```python
import gc
import pandas as pd

df = pd.DataFrame({'A': range(1000000)})
del df
gc.collect()  # helps in some cases, but the allocator may still hold the freed pages
```
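To verify whether memory is actually returned to the operating system, measure the process's resident set size before and after cleanup. A minimal sketch, assuming the third-party `psutil` package is installed:

```python
import gc

import numpy as np
import psutil

def rss_mb():
    """Resident set size of the current process in megabytes."""
    return psutil.Process().memory_info().rss / 1e6

print(f"baseline: {rss_mb():.0f} MB")

arr = np.ones((10000, 10000))
print(f"allocated: {rss_mb():.0f} MB")

del arr
gc.collect()
print(f"after cleanup: {rss_mb():.0f} MB")  # may stay high if the allocator keeps the pages
```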
4. Summary of Fixes
| Issue | Fix |
|---|---|
| Reference cycles | Use `weakref` or rely on `gc.collect()` |
| Objects still in scope | Remove all references (`del obj`, `obj = None`) |
| Global variables | Use `del globals()['var_name']` |
| Stored in cache/module | Clear the cache (`cache.clear()`, `cache_clear()` for `lru_cache`) |
| Memory fragmentation | Use `malloc_trim(0)` (Linux/glibc only) |
| External libraries (NumPy, TensorFlow, Pandas, PyTorch) | Use `del obj`, `gc.collect()`, and `empty_cache()` for GPU memory |