Introduction
NumPy, short for Numerical Python, is a library that provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. Created by Travis Oliphant in 2005, NumPy has become the backbone of many scientific computing libraries in Python.
Getting Started with NumPy
Installation
NumPy is not part of the Python standard library, so you need to install it first. You can install it using pip.
pip install numpy
Once installed, you can import NumPy in your Python script.
import numpy as np
Basic Operations
NumPy’s primary object is the homogeneous multidimensional array, called ndarray: a grid of elements that all share the same data type. Let’s start by creating an array.
import numpy as np
# Creating a 1D array
arr = np.array([1, 2, 3, 4, 5])
# Creating a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
print(arr_2d)
NumPy Basics
Array Creation
NumPy provides several methods to create arrays.
- Zeros and Ones: Creates arrays filled with zeros or ones.
zeros_array = np.zeros((3, 3))
ones_array = np.ones((2, 4))
- Empty Array: Creates an array without initializing its entries, so its contents are arbitrary until you assign them.
empty_array = np.empty((2, 3))
- Range and Linspace: Creates arrays with a sequence of numbers.
range_array = np.arange(0, 10, 2)
linspace_array = np.linspace(0, 1, 5)
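The two sequence functions treat their end point differently, which is easy to miss. The short sketch below prints both arrays from the example above to make the difference visible.

```python
import numpy as np

# arange takes (start, stop, step); the stop value is excluded
range_array = np.arange(0, 10, 2)
print(range_array)      # [0 2 4 6 8]

# linspace takes (start, stop, num); the stop value is included by default
linspace_array = np.linspace(0, 1, 5)
print(linspace_array)   # [0.   0.25 0.5  0.75 1.  ]
```

As a rule of thumb, arange fits when you know the step size and linspace fits when you know how many points you want.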
Array Attributes
NumPy arrays have attributes that provide information about the array.
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.shape) # Shape of the array
print(arr.size) # Total number of elements
print(arr.ndim) # Number of dimensions
print(arr.dtype) # Data type of elements
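To show what these attributes return, here is a small sketch with the expected values as comments. The explicit dtype=np.float32 is an assumption added for illustration, so that the element size is the same on every platform; itemsize and nbytes are two further attributes not covered above.

```python
import numpy as np

# dtype is set explicitly here so the byte counts below do not depend on the platform
arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)

print(arr.shape)     # (2, 3): 2 rows, 3 columns
print(arr.size)      # 6
print(arr.ndim)      # 2
print(arr.dtype)     # float32
print(arr.itemsize)  # 4: bytes per element
print(arr.nbytes)    # 24: size * itemsize
```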
Array Manipulation
Reshaping and Flattening
You can reshape an array without changing its data, as long as the new shape holds the same number of elements.
arr = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_arr = arr.reshape((3, 2))
# Flattening the array
flattened_arr = arr.flatten()
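One practical difference between the two calls is worth a quick check: reshape returns a view of the same data when it can, while flatten always returns a copy. The sketch below demonstrates this by modifying the original array afterwards.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_arr = arr.reshape((3, 2))   # a view here: shares memory with arr
flattened_arr = arr.flatten()        # always a copy

arr[0, 0] = 99
print(reshaped_arr[0, 0])   # 99: the view sees the change
print(flattened_arr[0])     # 1: the copy does not

print(np.shares_memory(arr, reshaped_arr))    # True
print(np.shares_memory(arr, flattened_arr))   # False
```

If you need an independent reshaped array, calling .copy() on the result is one way to get it.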
Stacking and Splitting
You can stack arrays vertically or horizontally and split them.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
# Vertical stack
vstacked = np.vstack((arr1, arr2))
# Horizontal stack
hstacked = np.hstack((arr1, arr2))
# Splitting the horizontally stacked array into 2 parts
split_arr = np.array_split(hstacked, 2)
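The resulting shapes are the part that usually needs checking. The sketch below repeats the stacking calls and then splits into four parts to show that array_split accepts a split that does not divide evenly; the choice of four parts is only for illustration.

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

vstacked = np.vstack((arr1, arr2))   # rows stacked: shape (2, 3)
hstacked = np.hstack((arr1, arr2))   # joined end to end: shape (6,)
print(vstacked.shape, hstacked.shape)

# 6 elements into 4 parts: array_split allows uneven parts,
# whereas np.split would raise a ValueError here
parts = np.array_split(hstacked, 4)
print([p.tolist() for p in parts])   # [[1, 2], [3, 4], [5], [6]]
```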
Mathematical Operations
NumPy supports element-wise operations, matrix operations, and more.
Element-wise Operations
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
sum_arr = arr1 + arr2
product_arr = arr1 * arr2
Matrix Operations
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
dot_product = np.dot(matrix1, matrix2)
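A common source of confusion is that * on two 2D arrays is still element-wise, not a matrix product. The sketch below contrasts the two and uses the @ operator, which for 2D arrays gives the same result as np.dot.

```python
import numpy as np

matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

# Matrix product: rows of matrix1 times columns of matrix2
matmul_result = matrix1 @ matrix2
print(matmul_result)
# [[19 22]
#  [43 50]]

# Element-wise product: each entry multiplied by its counterpart
elementwise_result = matrix1 * matrix2
print(elementwise_result)
# [[ 5 12]
#  [21 32]]
```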
Broadcasting
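Broadcasting lets you combine arrays of different but compatible shapes: NumPy virtually stretches the smaller operand to match the larger one without copying its data. The simplest case is an array and a scalar: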
arr = np.array([[1, 2, 3], [4, 5, 6]])
scalar = 2
result = arr * scalar # Multiplies each element by 2
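Broadcasting also works between two arrays. NumPy compares shapes from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The sketch below shows a row, a column, and an incompatible case; the names row, column, and bad are hypothetical and used only for this illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

row = np.array([10, 20, 30])             # shape (3,): added to every row
print(arr + row)
# [[11 22 33]
#  [14 25 36]]

column = np.array([[100], [200]])        # shape (2, 1): added to every column
print(arr + column)
# [[101 102 103]
#  [204 205 206]]

bad = np.array([1, 2])                   # shape (2,): last dimensions 3 and 2 do not match
try:
    arr + bad
except ValueError as err:
    print("broadcast failed:", err)
```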
Advanced NumPy
Universal Functions (ufuncs)
Universal functions (ufuncs) operate element-wise on arrays, offering fast vectorized operations.
arr = np.array([1, 2, 3])
sqrt_arr = np.sqrt(arr) # Square root
exp_arr = np.exp(arr) # Exponential
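Beyond single-argument functions like sqrt and exp, ufuncs can take two arrays, and binary ufuncs carry methods such as reduce and accumulate. The sketch below shows a few of these features; the array other and the preallocated buffer are assumptions added for the example.

```python
import numpy as np

arr = np.array([1, 2, 3])
other = np.array([3, 1, 2])

# Binary ufuncs take two arrays and work element by element
larger = np.maximum(arr, other)
print(larger)    # [3 2 3]

# Binary ufuncs also expose methods such as reduce and accumulate
total = np.add.reduce(arr)          # same value as arr.sum()
running = np.add.accumulate(arr)    # same values as arr.cumsum()
print(total)     # 6
print(running)   # [1 3 6]

# The out argument writes into an existing array instead of allocating a new one
buffer = np.empty(3, dtype=np.float64)
np.sqrt(arr, out=buffer)
print(buffer)
```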
Linear Algebra
NumPy provides a submodule for linear algebra, numpy.linalg.
from numpy import linalg
matrix = np.array([[1, 2], [3, 4]])
# Determinant
det = linalg.det(matrix)
# Eigenvalues and eigenvectors
eigenvalues, eigenvectors = linalg.eig(matrix)
# Inverse
inverse_matrix = linalg.inv(matrix)
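Because these routines work in floating point, results are best checked with a tolerance rather than exact equality. The sketch below verifies the results from the example above and adds linalg.solve for a linear system; the right-hand side b is a made-up value chosen so the solution is easy to read.

```python
import numpy as np
from numpy import linalg

matrix = np.array([[1, 2], [3, 4]])
inverse_matrix = linalg.inv(matrix)

# Floating-point round-off means comparisons should use a tolerance
print(np.isclose(linalg.det(matrix), -2.0))              # True
print(np.allclose(matrix @ inverse_matrix, np.eye(2)))   # True

# To solve matrix @ x = b, linalg.solve avoids forming the inverse explicitly
b = np.array([5, 11])
x = linalg.solve(matrix, b)
print(np.allclose(x, [1, 2]))                            # True

# Eigenvectors are the columns of the second return value,
# so each pair satisfies matrix @ v = eigenvalue * v
eigenvalues, eigenvectors = linalg.eig(matrix)
print(np.allclose(matrix @ eigenvectors, eigenvectors * eigenvalues))  # True
```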
Random Number Generation
NumPy’s random module offers tools for generating random numbers.
# Random floats in the half-open interval [0, 1)
random_array = np.random.random((2, 3))
# Random integers
randint_array = np.random.randint(1, 10, (2, 2))
# Random choice
choices = np.random.choice([10, 20, 30], size=5)
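The calls above draw from NumPy's global random state. For reproducible results, NumPy's documentation recommends the Generator interface created with np.random.default_rng for new code. The sketch below reproduces the same three kinds of draws with it; the seed value 42 is arbitrary.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

random_array = rng.random((2, 3))                  # floats in [0, 1)
randint_array = rng.integers(1, 10, size=(2, 2))   # integers from 1 to 9
choices = rng.choice([10, 20, 30], size=5)         # samples with replacement

# A second generator with the same seed repeats the first draw
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(random_array, rng_again.random((2, 3))))  # True
```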
Performance Optimization
Vectorization
Vectorization refers to replacing explicit loops with array expressions, leading to faster code execution.
arr = np.array([1, 2, 3, 4, 5])
# Loop-based operation
result = []
for i in arr:
    result.append(i**2)
# Vectorized operation
result = arr**2 # Much faster
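To see the difference on your own machine, you can time both versions with the standard timeit module. The sketch below is one way to do it; the function names squares_loop and squares_vectorized and the array size are hypothetical choices, and the timings will vary by hardware and NumPy version.

```python
import timeit

import numpy as np

arr = np.arange(10_000)

def squares_loop(values):
    """Square each element with an explicit Python loop."""
    out = []
    for v in values:
        out.append(v ** 2)
    return np.array(out)

def squares_vectorized(values):
    """Square all elements with a single array expression."""
    return values ** 2

# Both approaches must agree before the timings mean anything
assert np.array_equal(squares_loop(arr), squares_vectorized(arr))

loop_time = timeit.timeit(lambda: squares_loop(arr), number=20)
vec_time = timeit.timeit(lambda: squares_vectorized(arr), number=20)
print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```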
Memory Layout
Understanding memory layout helps optimize performance.
- Contiguous Arrays: Arrays stored in a single contiguous block of memory are generally faster to traverse, because sequential access makes good use of the CPU cache.
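- Strided Arrays: Every array has a strides attribute (arr.strides) that gives the number of bytes to step in each dimension to reach the next element. Views such as slices and transposes change the strides instead of copying the data.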
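Both points can be inspected directly on an array. The sketch below looks at the strides and contiguity flags of a small array and its transpose; the int64 dtype is set explicitly as an assumption so the byte counts are the same on every platform.

```python
import numpy as np

# 8-byte integers in a 2 x 3 row-major (C order) array
arr = np.arange(6, dtype=np.int64).reshape(2, 3)
print(arr.strides)                        # (24, 8): 24 bytes to the next row, 8 to the next column
print(arr.flags['C_CONTIGUOUS'])          # True

# Transposing swaps the strides and returns a view; no data is copied
transposed = arr.T
print(transposed.strides)                 # (8, 24)
print(transposed.flags['C_CONTIGUOUS'])   # False
print(np.shares_memory(arr, transposed))  # True

# ascontiguousarray copies the data into row-major order when needed
contiguous = np.ascontiguousarray(transposed)
print(contiguous.strides)                 # (16, 8)
```

Making a contiguous copy costs memory and time up front, so it tends to pay off only when the array is then traversed many times.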
Conclusion
NumPy is a versatile and powerful library that every data scientist, engineer, and researcher should master. From simple array creation to complex linear algebra operations, NumPy provides the tools needed to handle numerical data efficiently. As you explore more advanced topics, such as broadcasting and vectorization, you'll unlock the full potential of Python for scientific computing.