Python Libraries for Machine Learning: NumPy

Introduction

 
So far, you have configured an ML environment using Anaconda.
 
Python supports machine learning through a range of libraries. From this chapter onwards, we will explore them one by one.
 
We will start with NumPy, short for Numerical Python.

 
What is Python NumPy? 

 
Numeric, the ancestor of NumPy, was developed by Jim Hugunin. Another package, Numarray, was developed later with some additional functionality. In 2005, Travis Oliphant created the NumPy package by incorporating the features of Numarray into Numeric. NumPy is an open-source project with many contributors.
 
NumPy, or Numerical Python, is a Python library that provides the following:
  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • tools for integrating C/C++ and Fortran code
  • useful linear algebra, Fourier transform, and random number capabilities.
Besides its scientific uses, NumPy can also serve as an efficient multi-dimensional container of generic data, and arbitrary data types can be defined. The official website is www.numpy.org
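To make the term "broadcasting" from the list above more concrete, here is a minimal sketch. The array names and values are illustrative assumptions, not part of the original chapter. It shows how NumPy combines arrays of different shapes without an explicit loop.

```python
import numpy as np

# A 2x3 array and a 1D array of length 3 (illustrative values).
matrix = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])

# Broadcasting stretches `row` across both rows of `matrix`.
shifted = matrix + row
print(shifted)

# A scalar is broadcast to every element.
doubled = matrix * 2
print(doubled)
```

The shapes (2, 3) and (3,) are compatible because their trailing dimensions match, so no copy of `row` has to be written out by hand.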
  

Installing NumPy in Python

 
1. Ubuntu/Linux
    sudo apt update -y
    sudo apt upgrade -y
    sudo apt install python3-tk python3-pip -y
    pip3 install numpy
2. Anaconda
    conda install -c anaconda numpy
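After either installation route, a quick check along the following lines should confirm that NumPy can be imported. This is only a sketch; the variable name `check` is an assumption chosen for illustration.

```python
import numpy as np

# Print the installed version to confirm that the import works.
print(np.__version__)

# A tiny computation as a smoke test: 0 + 1 + 2.
check = np.arange(3).sum()
print(check)
```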

NumPy Array

 
A NumPy array (ndarray) is a powerful N-dimensional array; in the two-dimensional case its elements are arranged in rows and columns. We can initialize NumPy arrays from nested Python lists and access their elements by index.
 
A NumPy array is not the same as the standard Python library class array.array, which only handles one-dimensional arrays and offers less functionality.
  1. Single-Dimensional NumPy Array
      import numpy as np
      a = np.array([1,2,3])
      print(a)
    The above code will print [1 2 3]
     
  2. Multi-Dimensional Arrays
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a)
    The above code will print
      [[1 2 3]
       [4 5 6]]
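The section mentions accessing elements but does not show it. The following is a minimal sketch of index-based access on the same 2D array; the variable names are illustrative assumptions.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

first_row = a[0]          # the whole first row
element = a[1, 2]         # row index 1, column index 2
second_column = a[:, 1]   # every row, column index 1

print(first_row, element, second_column)
```

Indices start at 0, so `a[1, 2]` refers to the second row and third column.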

NumPy Array Attributes

  1. ndarray.ndim
     
    It returns the number of axes (dimensions) of the array.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.ndim)
    The output of the above code will be 2, since 'a' is a 2D array.
     
  2. ndarray.shape
     
    It returns a tuple giving the size of the array along each dimension. For a 2D array this is (n, m), where n is the number of rows and m is the number of columns.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.shape)
    The output of the above code will be (2, 3), i.e. 2 rows and 3 columns.
     
  3. ndarray.size
     
    It returns the total number of elements of the array.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.size)
    The output of the above code will be 6, i.e. 2 x 3.
     
  4. ndarray.dtype
     
    It returns an object describing the type of the elements in the array.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.dtype)
    The default integer type is platform-dependent. The output will be "int64" (64-bit integer) on most Linux and macOS systems, and "int32" (32-bit integer) on Windows with NumPy versions before 2.0.
     
    We can explicitly define the data type of a NumPy array:
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]], dtype = float)
      print(a.dtype)
    The above code will print "float64", i.e. 64-bit float.
     
  5. ndarray.itemsize
     
    It returns the size in bytes of each element of the array.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.itemsize)
    The output of the above code will be 8 for int64 elements (64/8), or 4 for int32 elements (32/8).
     
  6. ndarray.data
     
    It returns a Python buffer object (a memoryview) that points to the memory holding the actual elements of the array. It is rarely needed, because elements are normally accessed through indexing.
      import numpy as np
      a = np.array([[1,2,3],[4,5,6]])
      print(a.data)
    The above code will print a memoryview object, such as <memory at 0x...>, rather than the elements themselves.
     
  7. ndarray.sum()
     
    This method returns the sum of all the elements of the ndarray.
      import numpy as np
      a = np.random.random( (2,3) )
      print(a)
      print(a.sum())
    The matrix generated for me is [[0.46541517 0.66668157 0.36277909]
                                    [0.7115755  0.57306008 0.64267163]],
     
    so for me the above code returns 3.422183052180838. Because random numbers are used here, you will most likely get a different output.
     
  8. ndarray.min()
     
    This method returns the minimum element value of the ndarray.
      import numpy as np
      a = np.random.random( (2,3) )
      print(a)
      print(a.min())
    Taking this matrix as an example: [[0.46541517 0.66668157 0.36277909]
                                       [0.7115755  0.57306008 0.64267163]],
     
    a.min() returns 0.36277909. Because random numbers are used here, a fresh run will produce different values.
     
  9. ndarray.max()
     
    This method returns the maximum element value of the ndarray.
      import numpy as np
      a = np.random.random( (2,3) )
      print(a)
      print(a.max())
    Taking this matrix as an example: [[0.46541517 0.66668157 0.36277909]
                                       [0.7115755  0.57306008 0.64267163]],
     
    a.max() returns 0.7115755. Because random numbers are used here, a fresh run will produce different values.
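The attributes and methods listed above can be checked together in one short script. This is a sketch under a few assumptions that are not in the original text: the dtype is fixed to int32 so the sizes are the same on every platform, a seeded generator from `np.random.default_rng` replaces `np.random.random` so reruns give the same matrix, and the `axis` argument is shown as an extra.

```python
import numpy as np

# dtype is set explicitly so the results do not depend on the platform.
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

print(a.ndim, a.shape, a.size)   # 2 (2, 3) 6
print(a.dtype, a.itemsize)       # int32 4
print(a.nbytes)                  # 24, i.e. size * itemsize

# .data is a memoryview over the array's memory, not a list.
buffer = a.data
print(type(buffer))
elements = buffer.tolist()       # converts the buffer back to nested lists

# Seeded generator: the same random matrix on every run.
rng = np.random.default_rng(seed=0)
r = rng.random((2, 3))

total = r.sum()
smallest = r.min()
largest = r.max()

# With axis=0 the reduction runs down the rows, giving one value per column.
column_sums = r.sum(axis=0)
# With axis=1 the reduction runs across the columns, giving one value per row.
row_maxima = r.max(axis=1)
print(total, smallest, largest, column_sums, row_maxima)
```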
     

NumPy Functions 

 

1. type()

 
Syntax
 
type(object)
 
type() is a built-in Python function, not a NumPy function, that returns the type of the object passed to it. For a NumPy array, it returns numpy.ndarray.
    import numpy as np
    a = np.array([[1,2,3],[4,5,6]])
    print(type(a))
The above code will print <class 'numpy.ndarray'>
 

2. numpy.zeros() 

 
Syntax
 
numpy.zeros((rows,columns), dtype)
 
The above function creates a NumPy array of the given dimensions with every element set to zero. If no dtype is given, the default dtype, float64, is used.
    import numpy as np
    a = np.zeros((3,3))
    print(a)
The above code will result in a 3x3 NumPy array with each element being zero.
 

3. numpy.ones()

 
Syntax
 
numpy.ones((rows,columns), dtype)
  
The above function creates a NumPy array of the given dimensions with every element set to one. If no dtype is given, the default dtype, float64, is used.
    import numpy as np
    a = np.ones((3,3))
    print(a)
The above code will result in a 3x3 NumPy array with each element being one.
 

4. numpy.empty() 

 
Syntax
 
numpy.empty((rows,columns)) 
 
The above function creates an array without initializing its elements, so the initial content is arbitrary and depends on the state of the memory.
    import numpy as np
    a = np.empty((3,3))
    print(a)
The above code will result in a 3x3 NumPy array whose elements are arbitrary, uninitialized values.
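The three creation functions above can be compared side by side. The following sketch assumes a 2x3 shape and the fill value 7.5 purely for illustration. It also shows the usual reason to use empty(): allocating first and overwriting every element afterwards.

```python
import numpy as np

zeros = np.zeros((2, 3))            # float64 by default
ones = np.ones((2, 3), dtype=int)   # integer ones, dtype given explicitly
scratch = np.empty((2, 3))          # uninitialized: contents are arbitrary

print(zeros.dtype, ones.dtype, scratch.shape)

# Typical pattern: allocate with empty(), then overwrite every element.
scratch[:] = 7.5
print(scratch)
```

Reading from an empty() array before writing to it gives meaningless values, so it should only be used when every element will be assigned.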
 

5. numpy.arange() 

 
Syntax
 
numpy.arange(start, stop, step)
 
The above function creates a NumPy array of evenly spaced values that starts at start, increases by step, and stops before stop; the stop value itself is excluded.
    import numpy as np
    a = np.arange(5,25,4)
    print(a)
The output of the above code will be [ 5  9 13 17 21]
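A short sketch of how the stop value is treated by arange(); the step values below are illustrative assumptions.

```python
import numpy as np

a = np.arange(5, 25, 4)   # 5, 9, 13, 17, 21
b = np.arange(5, 25, 5)   # 5, 10, 15, 20: the stop value 25 is excluded
c = np.arange(4)          # a single argument is the stop: 0, 1, 2, 3

print(a, b, c)
```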
 

6. numpy.linspace() 

 
Syntax 
 
numpy.linspace(start, stop, num_of_elements)
 
The above function creates a NumPy array of num_of_elements evenly spaced values between start and stop, with both endpoints included by default. The default dtype of the result is float64.
    import numpy as np
    a = np.linspace(5,25,5)
    print(a)
The output of the above code will be [ 5. 10. 15. 20. 25.]
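The following sketch contrasts the default behaviour of linspace() with two optional parameters, `endpoint` and `retstep`, which are not covered in the text above and are shown here only as an illustration.

```python
import numpy as np

closed = np.linspace(5, 25, 5)                         # 5, 10, 15, 20, 25 as floats
half_open = np.linspace(5, 25, 5, endpoint=False)      # 5, 9, 13, 17, 21 as floats
points, spacing = np.linspace(0, 1, 5, retstep=True)   # also returns the step size

print(closed, half_open, spacing)
```

Unlike arange(), linspace() takes the number of elements rather than the step, which avoids surprises with floating-point step sizes.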
 

7. numpy.logspace() 

 
Syntax
 
numpy.logspace(start, stop, num_of_elements)
 
The above function creates a NumPy array of num_of_elements values spaced evenly on a logarithmic scale. The start and stop values are exponents: with the default base of 10, the first element is 10**start and the last is 10**stop. The default dtype of the result is float64.
    import numpy as np
    a = np.logspace(5,25,5)
    print(a)
The output of the above code will be [1.e+05 1.e+10 1.e+15 1.e+20 1.e+25]
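One way to see what logspace() computes is to build the same values by hand from linspace(). The sketch below does that, and also uses the optional `base` parameter, which is an addition for illustration.

```python
import numpy as np

decades = np.logspace(5, 25, 5)
# The same values built by hand: 10 raised to evenly spaced exponents.
by_hand = 10.0 ** np.linspace(5, 25, 5)

# With base=2 the exponents 0..3 give 1, 2, 4, 8.
powers_of_two = np.logspace(0, 3, 4, base=2)

print(decades, powers_of_two)
```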
 

8. numpy.sin() 

 
Syntax
 
numpy.sin(numpy.ndarray)
 
The above function returns the sine of every element of the given array, with the values interpreted as angles in radians.
    import numpy as np
    a = np.logspace(5,25,2)
    print(np.sin(a))
The output of the above code will be [ 0.0357488 -0.3052578]
 
Similarly, there are cos(), tan(), etc.
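The trigonometric functions are easier to check with familiar angles. The sketch below uses a few angles chosen for illustration and `np.deg2rad` to convert degrees to radians.

```python
import numpy as np

angles = np.array([0.0, np.pi / 6, np.pi / 2])   # 0, 30 and 90 degrees in radians
sines = np.sin(angles)      # approximately 0, 0.5, 1
cosines = np.cos(angles)    # approximately 1, 0.866, 0

# Angles given in degrees must be converted to radians first.
from_degrees = np.sin(np.deg2rad(np.array([30.0, 90.0])))

print(sines, cosines, from_degrees)
```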
 

9. numpy.reshape() 

 
Syntax
 
numpy.ndarray.reshape(dimensions)
 
The above method returns an array with the same elements arranged in a new shape. The number of arguments passed to reshape determines the number of dimensions of the result, and the product of the dimensions must equal the number of elements.
    import numpy as np
    a = np.arange(9).reshape(3,3)
    print(a)
The output of the above code will be a 2D array with 3x3 dimensions:
    [[0 1 2]
     [3 4 5]
     [6 7 8]]
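A few more reshape() calls, as a sketch. The 12-element array is an assumption chosen because it factors in several ways. Passing -1 for one dimension asks NumPy to infer it from the number of elements.

```python
import numpy as np

a = np.arange(12)

grid = a.reshape(3, 4)        # two arguments: a 2D array
cube = a.reshape(2, 3, 2)     # three arguments: a 3D array
inferred = a.reshape(-1, 6)   # -1 lets NumPy work out that dimension (here 2)
flat = grid.reshape(-1)       # back to one dimension

print(grid.shape, cube.shape, inferred.shape, flat.shape)
```

A shape whose product does not equal the number of elements, such as `a.reshape(5, 5)` here, raises a ValueError.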
 

10. numpy.random.random() 

 
Syntax 
 
numpy.random.random( (rows, column) )
 
The above function returns a NumPy ndarray of the given dimensions in which each element is a random float drawn uniformly from the interval [0.0, 1.0).
    import numpy as np
    a = np.random.random((2,2))
The above code will return a 2x2 ndarray.
 

11. numpy.exp() 
 

Syntax
 
numpy.exp(numpy.ndarray)
 
The above function returns an ndarray containing the exponential (e raised to the power x) of every element x.
    import numpy as np
    b = np.exp([10])
The above code returns an array containing the value 22026.4657948
 

12. numpy.sqrt()
 

Syntax
 
numpy.sqrt(numpy.ndarray)
 
The above function returns an ndarray containing the square root of every element.
    import numpy as np
    b = np.sqrt([16])
The above code returns an array containing the value 4.0
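Both exp() and sqrt() work element by element on arrays of any size. The inputs below are illustrative assumptions.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0])
exps = np.exp(x)                          # 1, e, e squared
roots = np.sqrt(np.array([16.0, 25.0]))   # 4, 5

# The two functions compose element-wise: sqrt(exp(x)) equals exp(x / 2).
combined = np.sqrt(np.exp(x))

print(exps, roots, combined)
```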
 

NumPy Basic Operations  

    a = np.array( [5, 10, 15, 20, 25] )
    b = np.array( [0, 1, 2, 3, 4] )
1. The code below returns the element-wise difference between the two arrays
    c = a - b
2. The code below returns an array containing the square of each element of b
    b**2
3. The code below evaluates the given expression for every element of a
    10 * np.sin(a)
4. The code below returns True at every element position that satisfies the given condition
    a < 15
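The four operations above can be run as one script. This sketch assumes two arrays of equal length (five elements each), since element-wise arithmetic needs compatible shapes. The last line, which uses the boolean result as a filter, is an addition for illustration.

```python
import numpy as np

a = np.array([5, 10, 15, 20, 25])
b = np.array([0, 1, 2, 3, 4])

difference = a - b        # element-wise subtraction
squares = b ** 2          # element-wise power
scaled = 10 * np.sin(a)   # expression applied to every element
mask = a < 15             # boolean array, True where the condition holds

# A boolean array can be used as an index to keep only the matching elements.
selected = a[mask]

print(difference, squares, scaled, mask, selected)
```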

NumPy Array Basic Operations

    a = np.array( [[1,1], [0,1]])
    b = np.array( [[2,0],[3,4]])
1. The code below returns the element-wise product of the two arrays
    a * b
2. The code below returns the matrix product of the two arrays
    a @ b
or
    a.dot(b)
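The difference between the two kinds of product is easiest to see in the results. A minimal sketch using the same two arrays:

```python
import numpy as np

a = np.array([[1, 1], [0, 1]])
b = np.array([[2, 0], [3, 4]])

elementwise = a * b     # multiplies matching positions: [[2, 0], [0, 4]]
matrix = a @ b          # rows of a times columns of b: [[5, 4], [3, 4]]
same_matrix = a.dot(b)  # equivalent to a @ b for 2D arrays

print(elementwise)
print(matrix)
```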
 

Conclusion 

 
In this chapter, we studied Python NumPy. In the next chapter, we will learn about Python Pandas.
 
Python Pandas is an excellent library used mainly for data manipulation and analysis.
Author
Rohit Gupta