The error MemoryError: unable to allocate large NumPy array
occurs when Python tries to create a NumPy array that is too large for the available system memory.
1. Why Does This Happen?
- RAM Limitations: If the requested array exceeds the available RAM, Python raises MemoryError (it can be caught like any other exception, as shown in the sketch after this list).
- Inefficient Data Types: Using high-memory data types like float64 instead of float32 increases memory usage.
- No Virtual Memory Available: If swapping (virtual memory) is disabled, large arrays may fail to allocate.
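As a minimal sketch of the first point: depending on the operating system and its overcommit settings, an oversized request may fail immediately with MemoryError, which you can catch around the allocation. The shape below is just an example that exceeds most machines' RAM.
import numpy as np

try:
    arr = np.zeros((100000, 100000), dtype=np.float64)  # may raise MemoryError if the OS refuses the ~74.5 GB request
except MemoryError as exc:
    print(f"Allocation failed: {exc}")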
2. Common Causes and Solutions
Cause 1: Requesting an Extremely Large Array
If the requested array size is too large, it may exceed available RAM.
Example (Allocating an Excessive Array)
import numpy as np
arr = np.zeros((100000, 100000), dtype=np.float64) # Requires ~74.5 GB RAM!
Solution: Use a Smaller Array or Reduce Data Type Size
arr = np.zeros((10000, 10000), dtype=np.float32) # Uses ~381 MB RAM
Estimate Required Memory:
size_in_bytes = 100000 * 100000 * np.dtype(np.float64).itemsize # 8 bytes per float64
size_in_gb = size_in_bytes / (1024**3)
print(f"Required Memory: {size_in_gb:.2f} GB")
Cause 2: Using an Inefficient Data Type
Using float64 instead of float32 doubles memory usage.
Example (Using Unnecessary float64)
arr = np.ones((50000, 50000), dtype=np.float64) # ~18.6 GB
Solution: Use a Smaller Data Type
arr = np.ones((50000, 50000), dtype=np.float32) # ~9.3 GB
Alternative: Use Integer Types Instead of Floats
arr = np.ones((50000, 50000), dtype=np.int8) # Uses only ~2.3 GB
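To verify the savings directly, every NumPy array reports the size of its data buffer via .nbytes. A quick comparison with small arrays (the 1000x1000 shape is only for illustration):
import numpy as np

a64 = np.ones((1000, 1000), dtype=np.float64)
a32 = a64.astype(np.float32)              # downcasting halves the footprint
a8 = np.ones((1000, 1000), dtype=np.int8)
print(a64.nbytes, a32.nbytes, a8.nbytes)  # 8000000 4000000 1000000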
Cause 3: Not Using Memory-Mapped Files
NumPy supports memory mapping, which allows working with large arrays without loading them fully into RAM.
Solution: Use memmap to Work with Large Arrays
arr = np.memmap('large_array.dat', dtype=np.float32, mode='w+', shape=(100000, 100000))  # backed by a ~37 GB file on disk
for i in range(0, 100000, 1000):            # fill in row chunks so only one chunk sits in RAM at a time
    arr[i:i + 1000] = np.random.rand(1000, 100000)
arr.flush()                                 # make sure pending writes reach the file
- This stores the data on disk instead of RAM, so the full array never has to fit in memory; a read-back sketch follows below.
- Typical use cases: large datasets in machine learning, AI, and data science.
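As a follow-up sketch, assuming the large_array.dat file written above exists, the same array can be re-opened read-only and processed in slices, so only the rows actually accessed are paged into memory:
import numpy as np

arr = np.memmap('large_array.dat', dtype=np.float32, mode='r', shape=(100000, 100000))
row_means = arr[:1000].mean(axis=1)  # reads only the first 1000 rows from disk
print(row_means.shape)               # (1000,)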
Cause 4: Lack of Virtual Memory (Swap Space)
When physical RAM is full, systems use virtual memory (swap). If disabled, NumPy may fail to allocate large arrays.
Solution: Enable Swap Space (Linux/WSL; macOS manages swap automatically)
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
- This adds 8 GB of swap space, which can prevent MemoryError at the cost of much slower, disk-backed access.
Windows: Increase Pagefile Size
- Go to System Properties → Advanced → Performance
- Click Settings → Advanced → Virtual Memory
- Increase pagefile size.
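To confirm how much swap (or pagefile) space the system actually has after these changes, one cross-platform option is the third-party psutil package; psutil is an assumption here, not something the steps above install:
import psutil  # third-party: pip install psutil

swap = psutil.swap_memory()
print(f"Swap total: {swap.total / 1024**3:.1f} GB, used: {swap.used / 1024**3:.1f} GB")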
Cause 5: Processing the Entire Dataset at Once
Processing a large dataset all at once can exhaust memory.
Solution: Process Data in Smaller Chunks
Instead of:
data = np.loadtxt('huge_file.csv', delimiter=',')
Use:
import pandas as pd
chunks = pd.read_csv('huge_file.csv', chunksize=10000)
for chunk in chunks:
    process(chunk)  # process() is a placeholder for your own per-chunk logic
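For instance, a running aggregate can be computed chunk by chunk without ever holding the whole file in memory. The column name 'value' below is hypothetical; substitute a numeric column from your own CSV:
import pandas as pd

total = 0.0
rows = 0
for chunk in pd.read_csv('huge_file.csv', chunksize=10000):
    total += chunk['value'].sum()  # 'value' is a hypothetical column name
    rows += len(chunk)
print(f"Mean of 'value': {total / rows:.4f}")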
3. Summary of Fixes
| Issue | Fix |
|---|---|
| Allocating an extremely large array | Reduce the size or the data type precision |
| Using float64 unnecessarily | Use float32 or smaller integer types |
| RAM limits | Use memory-mapped arrays (memmap) |
| No swap space | Enable virtual memory (swap/pagefile) |
| Processing large data all at once | Use chunk-based processing |