Garbage collection not freeing memory


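Garbage collection (GC) in Python automatically frees memory when objects are no longer needed: reference counting releases most objects as soon as their last reference disappears, and a cyclic collector handles groups of objects that reference each other. However, in some cases, memory is not released as expected. This can cause high memory usage, performance issues, or memory leaks in applications.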


1. Why Is Garbage Collection Not Freeing Memory?

Python's garbage collector works automatically, but certain factors can prevent or delay memory from being freed, such as:

  • Reference Cycles → Objects that reference each other are not freed by reference counting; they stay in memory until the cyclic collector runs.
  • Global or Long-Lived Objects → Variables stored in global scope or cache remain in memory.
  • Circular References in Classes → Instances that store references to each other (or to themselves) form cycles.
  • Memory Fragmentation → Freed memory is scattered across partly used blocks, so it cannot be returned to the operating system.
  • External Libraries Holding Memory → Some C extensions (NumPy, Pandas, TensorFlow) manage their own buffers or caches and may not release memory immediately.
  • Objects Not Out of Scope → If objects remain referenced, Python does not delete them.

2. How to Force Garbage Collection?

Python provides the gc module to manually trigger garbage collection.

import gc

gc.collect() # Forces garbage collection

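However, gc.collect() only reclaims objects that are unreachable. Anything that is still referenced somewhere in the program stays in memory, no matter how often the collector runs.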
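As a minimal sketch of that limitation, the snippet below watches an object through a weak reference. The class name Node and the variable names are made up for illustration, and the behavior described assumes CPython.

```python
import gc
import weakref

class Node:
    """Hypothetical placeholder class used only for this demonstration."""

node = Node()
probe = weakref.ref(node)   # a weak reference does not keep `node` alive

gc.collect()                # `node` is still referenced, so it survives
alive_while_referenced = probe() is not None

del node                    # drop the last strong reference
alive_after_del = probe() is not None

print(alive_while_referenced, alive_after_del)
```

The object disappears only after its last strong reference is removed; the explicit gc.collect() call made no difference while the name node still existed.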


3. Common Causes & Solutions

Cause 1: Reference Cycles (Circular References)

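If two objects reference each other, their reference counts never drop to zero, so reference counting alone cannot free them. The cyclic garbage collector does reclaim such cycles, but only when it next runs, so memory can stay allocated longer than expected. Cycles become a real leak if the collector has been disabled with gc.disable() or if something outside the cycle still references one of the objects.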

Example (Circular Reference Delaying Cleanup)

import gc

class A:
    def __init__(self):
        self.ref = None

a1 = A()
a2 = A()
a1.ref = a2  # a1 refers to a2
a2.ref = a1  # a2 refers back to a1: circular reference

del a1, a2    # Reference counts stay above zero, so nothing is freed yet
gc.collect()  # The cyclic collector finds the unreachable pair and frees it
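To check what actually happens to such a pair, one option is to watch one of the objects through a weak reference. This is only a sketch, assuming CPython; the class name Linked and the variable names are invented for the example, and automatic collection is switched off temporarily so that it cannot interfere.

```python
import gc
import weakref

class Linked:
    def __init__(self):
        self.ref = None

gc.disable()                 # keep automatic collections out of the demo
x, y = Linked(), Linked()
x.ref, y.ref = y, x          # build the cycle
probe = weakref.ref(x)       # lets us observe x without keeping it alive

del x, y
alive_before = probe() is not None   # the cycle keeps both objects alive

found = gc.collect()         # number of unreachable objects found
alive_after = probe() is not None    # the collector has reclaimed the pair
gc.enable()
```

The exact value of found varies between Python versions, because instance dictionaries may or may not be counted separately.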

Solution: Use weakref to Break Circular References

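import weakref

class A:
    def __init__(self):
        self.ref = None

a1 = A()
a2 = A()
a1.ref = weakref.ref(a2)  # Weak reference: does not increase a2's reference count
a2.ref = weakref.ref(a1)  # Call a2.ref() to get a1 back, or None once a1 is gone

del a1, a2  # No cycle of strong references, so both objects are freed immediately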
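A common variant keeps the forward link strong and makes only the back-reference weak, which suits owner/owned relationships. The sketch below assumes such a relationship; Parent, Child, and add are hypothetical names chosen for illustration.

```python
import weakref

class Parent:
    def __init__(self):
        self.children = []

    def add(self, child):
        child.parent = weakref.ref(self)   # back-reference is weak
        self.children.append(child)        # forward reference is strong

class Child:
    def __init__(self):
        self.parent = None

parent = Parent()
child = Child()
parent.add(child)

owner = child.parent()          # call the weak reference to dereference it
same_object = owner is parent
del owner                       # drop the temporary strong reference

del parent                      # no strong cycle, so it is freed right away
parent_gone = child.parent() is None
```

Code that dereferences a weak reference should always handle the None case, since the target may already have been freed.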

Cause 2: Objects Still in Scope

Even after calling gc.collect(), objects won’t be freed if they still have references in the program.

Example (Object Stays in Scope)

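import gc

class Data:
    pass

obj = Data()
alias = obj   # A second reference to the same object
del obj       # Removes only the name obj
gc.collect()  # The object is not freed: alias still refers to it

Solution: Remove All References

alias = None  # Removes the last reference, so the object is freed immediately
gc.collect()  # Only needed if the object was part of a reference cycle

Alternatively, use:

del alias
gc.collect()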
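When it is unclear what still holds an object, gc.get_referrers() lists the containers that refer to it directly. The following is a small sketch; registry is a hypothetical dictionary standing in for whatever structure is keeping the object alive.

```python
import gc

class Data:
    pass

obj = Data()
registry = {"latest": obj}     # hypothetical container that also holds the object

# get_referrers returns every GC-tracked container that refers to obj directly
holders = [r for r in gc.get_referrers(obj) if r is registry]
registry_holds_obj = len(holders) == 1

registry.clear()               # drop the extra reference
still_held = any(r is registry for r in gc.get_referrers(obj))
```

The result also includes the namespace that defines obj itself, so in practice the list needs filtering for the unexpected holders.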

Cause 3: Global Variables Preventing GC

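Global variables live as long as their module, so whatever they reference stays in memory until the name is deleted or rebound. A function can only delete a global if it declares the name with the global statement.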

Example (Global Variables Persist)

import gc

data = [1, 2, 3]  # Global variable

def clear_data():
    global data  # Without this declaration, del data raises UnboundLocalError
    del data     # Removes the global name
    gc.collect()

clear_data()
print(data)  # NameError: name 'data' is not defined

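Solution: Delete the Name and Every Other Reference

del globals()['data']  # Same effect as "global data" followed by "del data"
gc.collect()

This removes the name from the module namespace. The object itself is freed only when no other reference to it remains, for example in another variable, a container, or a closure.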
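The difference between removing a name and freeing the object can be seen with a weak reference. This is a sketch under the assumption that a second reference exists somewhere; Payload and backup are hypothetical names. At module level, the plain del statement used here does the same thing as del globals()['data'].

```python
import weakref

class Payload:
    """Hypothetical stand-in for a large global object."""

data = Payload()
probe = weakref.ref(data)
backup = data                      # a second reference, e.g. held elsewhere

del data                           # removes only the name `data`
alive_with_backup = probe() is not None

del backup                         # the last strong reference is gone
alive_after_all = probe() is not None
```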


Cause 4: Objects Stored in Caches or Modules

If an object is stored in a cache, list, or a module, it may not be freed.

Example (Object Stored in List)

import gc

cache = []
obj = [1, 2, 3]
cache.append(obj) # Object is still referenced

del obj
gc.collect() # Memory is NOT freed

Solution: Clear the Cache

cache.clear()
gc.collect() # Now memory is freed

If using an LRU cache (functools.lru_cache):

from functools import lru_cache

@lru_cache(maxsize=None)  # maxsize=None means the cache can grow without limit
def expensive_function(x):
    return x * 2

expensive_function(10)
expensive_function.cache_clear() # Clears stored function results
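Instead of clearing an unbounded cache by hand, the cache can be given a size limit so that old entries are evicted automatically. A minimal sketch, with square as a hypothetical function and maxsize=2 chosen only to keep the demonstration short:

```python
from functools import lru_cache

@lru_cache(maxsize=2)            # keep at most two results
def square(x):
    return x * x

for n in (1, 2, 3, 1):           # the third call evicts the entry for 1
    square(n)

info = square.cache_info()       # named tuple: hits, misses, maxsize, currsize

square.cache_clear()             # still available when a full reset is wanted
size_after_clear = square.cache_info().currsize
```

cache_info() is a cheap way to confirm that a cache is bounded and how many entries it currently holds.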

Cause 5: Memory Fragmentation

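Memory fragmentation happens when live objects are scattered across many memory blocks. A block can only be returned to the operating system once everything in it has been freed, so a few long-lived small objects can keep large regions allocated even after most of their neighbors are gone.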

Solution: Use malloc_trim() (Linux with glibc Only)

import ctypes
ctypes.CDLL("libc.so.6").malloc_trim(0)  # Asks glibc to return free heap memory to the operating system

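malloc_trim() is specific to glibc and is not available on Windows. There, restarting the program, or running the memory-heavy work in a separate process that exits when it finishes, is the most reliable way to return memory to the operating system.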
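Because the call only exists in glibc, code that may run on other platforms can guard it. The helper below is a hedged sketch rather than a standard recipe: trim_heap is a hypothetical name, and the function simply reports False wherever malloc_trim cannot be called.

```python
import ctypes
import ctypes.util
import sys

def trim_heap():
    """Ask glibc to release free heap memory; return True if any was released.

    Hypothetical helper: returns False where malloc_trim is unavailable.
    """
    if not sys.platform.startswith("linux"):
        return False
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    try:
        libc = ctypes.CDLL(libc_name)
        # malloc_trim returns 1 if memory was released, 0 otherwise
        return bool(libc.malloc_trim(0))
    except (OSError, AttributeError):
        return False   # e.g. a C library that does not provide malloc_trim

released = trim_heap()
```

A False result is not an error; it may simply mean there was nothing to give back.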


Cause 6: Large Objects in NumPy, Pandas, or TensorFlow

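Some libraries keep their data in buffers allocated outside Python's object heap. A buffer is released when the Python object that owns it is deallocated, so gc.collect() only helps if that object was stuck in a reference cycle. It does nothing while a view or another reference still points at the data, or while the library keeps freed memory in its own cache.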

Example (NumPy Not Freeing Memory)

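import gc
import numpy as np

arr = np.ones((10000, 10000))  # Large NumPy array (about 800 MB)
view = arr[:10]                # A slice is a view that shares arr's buffer
del arr
gc.collect()                   # Memory is not freed until view is deleted as well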

Solution: Use del + gc.collect() + empty_cache() (For GPU libraries)

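import gc
import torch

tensor = torch.ones((10000, 10000), device="cuda")  # Requires a CUDA-capable GPU
del tensor
gc.collect()
torch.cuda.empty_cache()  # Releases unused cached GPU memory held by PyTorch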

For Pandas:

import gc
import pandas as pd

df = pd.DataFrame({'A': range(1000000)})
del df
gc.collect()  # Helps when a reference cycle was keeping the data alive

4. Summary of Fixes

| Issue | Fix |
|---|---|
| Reference Cycles | Use weakref, or call gc.collect() |
| Objects Still in Scope | Remove all references (del obj, obj = None) |
| Global Variables | Delete the name (del globals()['var_name']) and any other references |
| Stored in Cache/Module | Clear the cache (cache.clear(), func.cache_clear()) |
| Memory Fragmentation | Use malloc_trim(0) (Linux with glibc only) |
| External Libraries (NumPy, TensorFlow, Pandas, PyTorch) | Use del obj and gc.collect(); for PyTorch GPU memory, also torch.cuda.empty_cache() |
