Introduction to NumPy

NumPy (Numerical Python) is one of the most widely used libraries in the Python ecosystem for scientific computing and data analysis. It provides large, multi-dimensional arrays and matrices, along with a broad collection of mathematical functions that operate on them. Many other data science libraries, such as Pandas, Matplotlib, and Scikit-learn, build on NumPy.

In this guide, we will cover the basics of NumPy, how to install it, its core functionalities, and common use cases.


1. Installing NumPy

Before you can use NumPy, it has to be installed. If you haven’t installed it yet, you can do so using pip:

pip install numpy

Once installed, you can import NumPy in your Python script as follows:

import numpy as np

By convention, the alias np is commonly used for NumPy.


2. NumPy Arrays (ndarray)

The core data structure in NumPy is the ndarray (n-dimensional array). It is a grid of values (of the same type), indexed by a tuple of non-negative integers. Arrays are the fundamental building block for data manipulation in NumPy.
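
The two properties named above, a single element type and indexing by a tuple, can be checked directly. The following is a minimal sketch; the variable names mixed and grid are illustrative only.

```python
import numpy as np

# Mixing integers and a float: NumPy stores every element with one common type.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)      # float64

# An element of a 2D array is addressed by a tuple with one index per dimension.
grid = np.array([[1, 2, 3], [4, 5, 6]])
print(grid[(1, 2)])     # 6, the same element as grid[1, 2]
```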

2.1. Creating NumPy Arrays

NumPy arrays can be created from Python lists, tuples, or using NumPy’s built-in functions.

From a Python List

import numpy as np

# Create a 1D array
arr = np.array([1, 2, 3, 4, 5])
print(arr)

From a Python Tuple

arr = np.array((1, 2, 3, 4, 5))
print(arr)

Using NumPy Functions

NumPy provides various functions to create arrays of different shapes, such as:

  • np.zeros(): Create an array of zeros
  • np.ones(): Create an array of ones
  • np.arange(): Create an array of evenly spaced values from a start, stop, and step (the stop value is excluded)
  • np.linspace(): Create an array with a given number of evenly spaced values over a specified interval

# Create an array of zeros
zeros_array = np.zeros((3, 3))
print(zeros_array)

# Create an array of ones
ones_array = np.ones((2, 4))
print(ones_array)

# Create an array with a specified range
range_array = np.arange(0, 10, 2)
print(range_array)

# Create an array with evenly spaced values
linspace_array = np.linspace(0, 1, 5)
print(linspace_array)
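
One detail that is easy to miss is how the two range functions treat the end value. The sketch below, with illustrative variable names, compares them and also shows that np.zeros() returns floating-point values unless a dtype is passed.

```python
import numpy as np

half_open = np.arange(0, 10, 2)   # step of 2; the stop value 10 is excluded
closed = np.linspace(0, 1, 5)     # 5 values; the stop value 1 is included by default

print(half_open.tolist())         # [0, 2, 4, 6, 8]
print(closed.tolist())            # [0.0, 0.25, 0.5, 0.75, 1.0]

# zeros() and ones() default to floating point; pass dtype to change that.
float_zeros = np.zeros((2, 2))
int_zeros = np.zeros((2, 2), dtype=int)
print(float_zeros.dtype)          # float64
```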

3. Array Properties

NumPy arrays have various attributes that provide information about the array, such as its shape, size, and data type.

  • ndarray.shape: Returns the shape of the array (a tuple indicating the size of each dimension).
  • ndarray.size: Returns the total number of elements in the array.
  • ndarray.ndim: Returns the number of dimensions (axes) of the array.
  • ndarray.dtype: Returns the data type of the elements in the array.

arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Shape:", arr.shape) # (2, 3)
print("Size:", arr.size) # 6
print("Dimensions:", arr.ndim) # 2
print("Data type:", arr.dtype) # int64 on most platforms (the default integer type is platform-dependent)
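
These attributes are related to one another, and the data type can be chosen explicitly instead of being left to the platform default. A small sketch follows, assuming a 32-bit integer type is wanted purely for illustration.

```python
import numpy as np

# Request a specific element type instead of the platform default.
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
print(arr.dtype)                              # int32

# size is the product of the entries in shape; ndim is the length of shape.
print(arr.size == arr.shape[0] * arr.shape[1])  # True
print(arr.ndim == len(arr.shape))               # True

# astype() returns a converted copy and leaves the original unchanged.
as_float = arr.astype(np.float64)
print(as_float.dtype)                         # float64
```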

4. Array Indexing and Slicing

Just like Python lists, NumPy arrays can be indexed and sliced, but with additional functionality for multi-dimensional arrays.

4.1. Indexing

For 1D arrays:

arr = np.array([1, 2, 3, 4, 5])
print(arr[0]) # Output: 1
print(arr[-1]) # Output: 5

For 2D arrays (matrices):

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[0, 1]) # Output: 2 (First row, second column)
print(arr[1, -1]) # Output: 6 (Second row, last column)
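
A few further indexing behaviours are worth seeing in one place: a single index on a 2D array selects a whole row, and an index outside the array raises an error. This is a minimal sketch; the names first_row and raised are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# A single index on a 2D array selects an entire row.
first_row = arr[0]
print(first_row)                 # [1 2 3]

# arr[0][1] and arr[0, 1] address the same element.
print(arr[0][1] == arr[0, 1])    # True

# An index past the end of an axis raises IndexError.
raised = False
try:
    arr[2, 0]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)
```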

4.2. Slicing

You can slice arrays to extract subsets of data.

For 1D arrays:

arr = np.array([1, 2, 3, 4, 5])
print(arr[1:4]) # Output: [2 3 4]

For 2D arrays:

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr[1:, 1:]) # Output: [[5 6] [8 9]]
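
Unlike slicing a Python list, basic slicing of a NumPy array returns a view onto the same data rather than a copy. The sketch below illustrates the consequence, and how copy() avoids it; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = arr[1:, 1:]       # a view: shares memory with arr
view[0, 0] = 50          # writes through to arr[1, 1]
print(arr[1, 1])         # 50

independent = arr[1:, 1:].copy()   # an explicit copy owns its own data
independent[0, 0] = -1
print(arr[1, 1])         # still 50
```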

5. Array Operations

NumPy provides a vast set of mathematical functions for performing operations on arrays.

5.1. Element-wise Operations

You can perform element-wise operations like addition, subtraction, multiplication, division, etc., directly on arrays.

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Addition
print(arr1 + arr2) # Output: [5 7 9]

# Subtraction
print(arr1 - arr2) # Output: [-3 -3 -3]

# Multiplication
print(arr1 * arr2) # Output: [ 4 10 18]

# Division
print(arr1 / arr2) # Output: [0.25 0.4  0.5 ]
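
It may help to contrast this with plain Python lists, where + concatenates instead of adding. The sketch below also shows two more element-wise operators; results are converted with tolist() only to keep the printed form simple.

```python
import numpy as np

# With lists, + concatenates; with arrays, + adds element by element.
list_result = [1, 2, 3] + [4, 5, 6]
array_result = np.array([1, 2, 3]) + np.array([4, 5, 6])
print(list_result)             # [1, 2, 3, 4, 5, 6]
print(array_result.tolist())   # [5, 7, 9]

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

floor_div = arr2 // arr1       # floor division keeps an integer type
squares = arr1 ** 2            # exponentiation, element by element
true_div = arr1 / arr2         # / produces floating-point results

print(floor_div.tolist())      # [4, 2, 2]
print(squares.tolist())        # [1, 4, 9]
print(true_div.dtype)          # float64
```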

5.2. Mathematical Functions

NumPy provides several mathematical functions like np.sqrt(), np.sin(), np.cos(), np.log(), etc.

arr = np.array([1, 4, 9, 16])
print(np.sqrt(arr)) # Output: [1. 2. 3. 4.]

arr2 = np.array([0, np.pi / 2, np.pi])
print(np.sin(arr2)) # Approximately [0. 1. 0.]; sin(pi) evaluates to about 1.22e-16, not exactly 0
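
Because these functions work in floating point, results such as sin(π) come out as a tiny non-zero number rather than exactly zero. One common way to handle this, sketched here, is to compare with a tolerance using np.allclose() instead of ==.

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])
sines = np.sin(angles)

# Exact comparison fails for sin(pi) because of rounding error.
exactly_zero = bool(sines[2] == 0.0)
print(exactly_zero)                         # False

# Comparison with a tolerance treats the tiny error as zero.
close_enough = bool(np.allclose(sines, [0.0, 1.0, 0.0]))
print(close_enough)                         # True
```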

5.3. Aggregate Functions

NumPy offers functions for statistical and aggregate operations, such as np.sum(), np.mean(), np.min(), and np.max().

arr = np.array([1, 2, 3, 4, 5])

print(np.sum(arr)) # Output: 15
print(np.mean(arr)) # Output: 3.0
print(np.min(arr)) # Output: 1
print(np.max(arr)) # Output: 5
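
For multi-dimensional arrays, the same functions accept an axis argument that selects the dimension to aggregate over. A brief sketch with an illustrative 2x3 array:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

total = np.sum(matrix)                # no axis: aggregate over every element
column_sums = np.sum(matrix, axis=0)  # collapse the rows: one value per column
row_sums = np.sum(matrix, axis=1)     # collapse the columns: one value per row
row_means = np.mean(matrix, axis=1)

print(total)                  # 21
print(column_sums.tolist())   # [5, 7, 9]
print(row_sums.tolist())      # [6, 15]
print(row_means.tolist())     # [2.0, 5.0]
```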

6. Broadcasting

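Broadcasting is a powerful feature in NumPy that allows arithmetic operations on arrays of different shapes. Comparing shapes from the last dimension backwards, two dimensions are compatible when they are equal or when one of them is 1; NumPy then treats the smaller array as if it were repeated along the size-1 (or missing) dimensions, without copying its data. A scalar is the simplest case: it is applied to every element.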

Example:

arr = np.array([[1, 2, 3], [4, 5, 6]])
scalar = 10

# Broadcasting scalar to each element of the array
result = arr + scalar
print(result)

Output:

[[11 12 13]
 [14 15 16]]
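
Broadcasting also applies between arrays, not only between an array and a scalar. The following sketch, with illustrative names row and col, shows a 1D row and a column vector combined with a 2x3 array, plus a pair of shapes that cannot be broadcast.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])             # shape (3,): added to every row
col = np.array([[100], [200]])           # shape (2, 1): added to every column

plus_row = arr + row
plus_col = arr + col
print(plus_row.tolist())   # [[11, 22, 33], [14, 25, 36]]
print(plus_col.tolist())   # [[101, 102, 103], [204, 205, 206]]

# Shapes (2, 3) and (2,) do not line up from the last dimension: 3 vs 2.
incompatible = False
try:
    arr + np.array([1, 2])
except ValueError as exc:
    incompatible = True
    print("ValueError:", exc)
```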

7. Reshaping Arrays

You can change the dimensions of a NumPy array without modifying its data by using the reshape() method. The new shape must contain the same total number of elements as the original.

arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)

Output:

[[1 2 3]
 [4 5 6]]
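
Two related behaviours are sketched below: passing -1 lets NumPy infer one dimension from the element count, and a shape with a different number of elements is rejected. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(1, 7)            # six elements: 1 through 6

# -1 asks NumPy to infer that dimension from the element count.
inferred = arr.reshape(3, -1)
print(inferred.shape)            # (3, 2)

# A shape needing 8 elements cannot be built from 6.
rejected = False
try:
    arr.reshape(4, 2)
except ValueError as exc:
    rejected = True
    print("ValueError:", exc)

# reshape() returns a new array object; the original keeps its shape.
print(arr.shape)                 # (6,)
```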

8. Random Number Generation

NumPy provides the np.random module, which allows you to generate random numbers and draw samples from a variety of probability distributions.

# Generate random numbers from a uniform distribution over [0, 1)
random_arr = np.random.rand(3, 3)
print(random_arr)

# Generate random integers from 1 to 9 (the upper bound 10 is excluded)
random_int_arr = np.random.randint(1, 10, size=(2, 2))
print(random_int_arr)
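
For reproducible results, a seeded generator can be used. The sketch below swaps in np.random.default_rng(), the Generator interface available in NumPy 1.17 and later, as an alternative to the functions shown above; the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)          # arbitrary seed, chosen for illustration

uniform = rng.random((3, 3))                  # floats in [0, 1)
integers = rng.integers(1, 10, size=(2, 2))   # integers from 1 to 9; 10 is excluded

# A second generator with the same seed produces the same sequence.
repeat = np.random.default_rng(seed=42).random((3, 3))
same = bool(np.array_equal(uniform, repeat))
print(same)                                   # True
```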
