Introduction to NumPy

What is NumPy?

NumPy, short for ‘Numerical Python’, is the core Python library for scientific computing. It provides a powerful n-dimensional array object, tools for integrating C, C++, and Fortran code, and routines for linear algebra, Fourier transforms, and random number generation. A NumPy array can also serve as an efficient multi-dimensional container for generic data. The sections below look at what exactly a NumPy array is.

Keypoints

  • NumPy stands for Numerical Python

  • Fundamental package for numerical computations in Python

  • A powerful N-dimensional array object

  • Sophisticated (broadcasting) functions

  • Tools for integrating C/C++ and Fortran code

  • Useful linear algebra, Fourier transform, and random number capabilities
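As a small illustration of the capabilities listed above, the sketch below touches broadcasting, linear algebra, and random numbers. The variable names and values are made up for this example; it is a quick tour under those assumptions, not a complete overview.

```python
import numpy as np

# Broadcasting: the 1D row is stretched across both rows of the 2D array
matrix = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
shifted = matrix + row
print(shifted)

# Linear algebra: solve the system  2x + y = 5,  x + 3y = 10
coeffs = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([5.0, 10.0])
solution = np.linalg.solve(coeffs, rhs)
print(solution)

# Random numbers: a seeded generator gives reproducible draws
rng = np.random.default_rng(seed=0)
samples = rng.random(3)  # three floats in [0, 1)
print(samples.shape)
```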

NumPy Array

A NumPy array is a powerful N-dimensional array object; a 2D array, for example, is laid out in rows and columns. NumPy arrays can be initialized from nested Python lists, and their elements can be accessed by index.
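Initializing an array from a nested list and reading its elements might look like the sketch below. The name `grid` and its values are placeholders chosen for illustration.

```python
import numpy as np

# Build a 2D array (2 rows, 3 columns) from a nested Python list
grid = np.array([[1, 2, 3], [4, 5, 6]])

first_row = grid[0]        # whole first row
element = grid[1, 2]       # row index 1, column index 2
last_column = grid[:, -1]  # every row, last column

print(first_row)
print(element)
print(last_column)
```

Indices start at 0, and a negative index counts from the end of the axis.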

N-dimensional Array

  • 1-Dimensional (1D) Array

  • 2-Dimensional (2D) Array

  • 3-Dimensional (3D) Array

  • In general, an N-dimensional array (ndarray)
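The number of dimensions of an array can be read from its `ndim` attribute. The three arrays below are illustrative examples, one per case in the list above.

```python
import numpy as np

one_d = np.array([1, 2, 3])
two_d = np.array([[1, 2, 3], [4, 5, 6]])
three_d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for arr in (one_d, two_d, three_d):
    # ndim is the number of axes; shape gives the length of each axis
    print(arr.ndim, arr.shape)
```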

Getting Started

Use the following import convention:

import numpy as np

Why Numpy?

  • Less Memory

  • Fast

  • Convenient

Calculation

  • A Python list does not support element-wise addition: the + operator concatenates two lists. NumPy arrays add element by element, which is one of their advantages.

# add 2 lists 
L1 = [1, 2, 3]
L2 = [4, 5, 6]
print(L1+L2)
[1, 2, 3, 4, 5, 6]
# element wise sum using numpy array 
import numpy as np 
A1 = np.array([1, 2, 3])
A2 = np.array([4, 5, 6])
print(A1+A2)
[5 7 9]

Less Memory

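import sys
import numpy as np

S = range(1000)
# rough estimate: size of one Python int object times the number of elements
print("Python List: ", sys.getsizeof(5)*len(S))

D = np.arange(1000)
# bytes used by the array data: number of elements times bytes per element
print("Numpy Array: ", D.size*D.itemsize)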
Python List:  28000
Numpy Array:  8000
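The estimate above multiplies the size of a single integer by the element count. A slightly more direct measurement, sketched below, adds up the list container and its integer objects and compares that with the array's `nbytes`. The names `py_list` and `np_array` are illustrative, and the exact numbers depend on the Python version and platform.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_array = np.arange(n)

# A list stores pointers; each int is a separate Python object on top of that
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# nbytes is the size of the array's data buffer (size * itemsize)
array_bytes = np_array.nbytes

print("Python list (container + int objects):", list_bytes)
print("NumPy array (data buffer only):", array_bytes)
```

This is still an approximation: CPython shares small integer objects between containers, and the array carries a small fixed header that `nbytes` does not include.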

Faster

import time
import numpy as np
 
SIZE = 1000000
 
L1 = range(SIZE)
L2 = range(SIZE)
A1 = np.arange(SIZE)
A2 = np.arange(SIZE)
 
start= time.time()
# element-wise sum of the two sequences in pure Python
result = [x + y for x, y in zip(L1, L2)]
# time in ms 
print((time.time()-start)*1000)
 
start = time.time()
result = A1+A2
# time in ms 
print((time.time()-start)*1000)
237.50925064086914
58.170318603515625
%timeit sum(range(1000))
16.8 µs ± 765 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.sum(np.arange(1000))
9.9 µs ± 2.88 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
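`%timeit` is only available in IPython and Jupyter. In a plain Python script, a comparable measurement can be sketched with the standard `timeit` module. The function names `add_lists` and `add_arrays` are hypothetical helpers for this example, and the timings vary from machine to machine.

```python
import timeit
import numpy as np

size = 100_000
list_a = list(range(size))
list_b = list(range(size))
arr_a = np.arange(size)
arr_b = np.arange(size)

def add_lists():
    # element-wise sum with a pure Python loop
    return [x + y for x, y in zip(list_a, list_b)]

def add_arrays():
    # element-wise sum in a single vectorized call
    return arr_a + arr_b

# Best of several repeats is less sensitive to background noise
list_time = min(timeit.repeat(add_lists, number=10, repeat=3))
array_time = min(timeit.repeat(add_arrays, number=10, repeat=3))

print(f"list:  {list_time * 1000:.2f} ms for 10 runs")
print(f"array: {array_time * 1000:.2f} ms for 10 runs")
```

Both functions compute the same element-wise sum, so the comparison measures the same work done two ways.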

Creating Arrays

  • Array: Ordered collection of elements of basic data types of given length.

  • Syntax

np.array(object)
# import numpy 
import numpy as np 
# Creating 1D array
A = np.array([1, 2, 3])
print(A)
[1 2 3]
# type 
print(type(A))
<class 'numpy.ndarray'>
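Besides `np.array`, NumPy offers helper functions that build common arrays directly. The sketch below shows a few of them; the shapes and ranges are arbitrary choices for illustration.

```python
import numpy as np

zeros = np.zeros((2, 3))       # 2x3 array filled with 0.0
ones = np.ones(4)              # 1D array of four 1.0 values
identity = np.eye(3)           # 3x3 identity matrix
steps = np.arange(0, 10, 2)    # start, stop (exclusive), step
points = np.linspace(0, 1, 5)  # 5 evenly spaced values, endpoints included

print(zeros.shape, ones.shape, identity.shape)
print(steps)
print(points)
```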

Array with Categorical Entities

  • A NumPy array holds elements of a single data type.

  • When the input mixes types (here numbers and a string), all elements are coerced to a common data type.

# create an array with categorical entities. 
X = np.array([12, 13, "n"])
print(X)
['12' '13' 'n']
# type 
print(type(X))
<class 'numpy.ndarray'>
# Creating 2D array
A2 = np.array([[3, 4, 5], [7, 8, 9]])
print(A2) 
[[3 4 5]
 [7 8 9]]
# Creating 3D array
A3 = np.array([[(1, 2, 3), (4, 5, 6)], [(7, 8, 9), (10, 11, 12)]])
print(A3) 
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
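The type coercion mentioned earlier in this section can be checked through the array's `dtype` rather than through `type()`, which always reports `numpy.ndarray`. The arrays below are illustrative.

```python
import numpy as np

mixed = np.array([12, 13, "n"])  # integers and a string
numeric = np.array([1, 2, 3.5])  # integers and a float

# Strings win over numbers, so every element becomes a Unicode string
print(mixed.dtype.kind)  # 'U' marks a Unicode string dtype

# Floats win over integers, so every element becomes a float
print(numeric.dtype)
```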

Inspecting array properties

Size

  • Returns number of elements in array

  • Syntax: array.size

A1 = np.array([1, 2, 3,4, 5])
# size 
A1.size
5

Shape

  • Returns the dimensions of the array as a tuple; for a 2D array this is (rows, columns)

  • Syntax: array.shape

A2 = np.array([[4, 5, 6], [7, 8, 9]])
# shape 
A2.shape 
(2, 3)
# number of rows
A2.shape[0]
2
# number of columns
A2.shape[1]
3
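Size and shape are related: the size is the product of the entries in the shape. A short sketch with made-up arrays:

```python
import numpy as np

table = np.array([[4, 5, 6], [7, 8, 9]])
rows, columns = table.shape

# size is the total element count, i.e. the product of the shape entries
print(table.size == rows * columns)

# The same relation holds for any number of dimensions
cube = np.zeros((2, 3, 4))
print(cube.shape, cube.size)
```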

Data Type

  • Returns type of elements in array

  • Syntax: array.dtype

A3 = np.linspace(0, 100, 6)
# dtypes 
A3.dtype
dtype('float64')
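The data type can also be requested when an array is created, by passing the `dtype` argument. The sketch below uses illustrative values; the default integer type depends on the platform.

```python
import numpy as np

default_ints = np.array([1, 2, 3])                 # dtype inferred from the values
as_floats = np.array([1, 2, 3], dtype=np.float64)  # dtype requested explicitly
small_ints = np.array([1, 2, 3], dtype=np.int8)    # one byte per element

print(default_ints.dtype, as_floats.dtype, small_ints.dtype)
print(small_ints.itemsize, as_floats.itemsize)     # bytes per element
```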

Type Conversion

  • Returns a copy of the array with its elements converted to the given dtype

  • Syntax: array.astype(dtype)

    • dtype - data type

A4 = np.ones((2,3))
# convert 
A4.astype(np.float16)
array([[1., 1., 1.],
       [1., 1., 1.]], dtype=float16)
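Two details of `astype` are easy to miss: it returns a new array instead of changing the original, and converting floats to integers drops the fractional part. A minimal sketch with made-up values:

```python
import numpy as np

prices = np.array([1.9, 2.5, -3.7])

# astype returns a new array; the original keeps its dtype
whole = prices.astype(np.int64)

print(prices.dtype, whole.dtype)
print(whole)  # fractional parts are dropped (truncation toward zero)
```

If rounding is wanted instead of truncation, the values can be passed through `np.round` before the conversion.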

Numpy array to Python List

  • Returns the array's elements as a (nested) Python list

  • Syntax: array.tolist()

A5 = np.linspace(0, 100, 20)
# array to list 
A5.tolist() 
[0.0,
 5.2631578947368425,
 10.526315789473685,
 15.789473684210527,
 21.05263157894737,
 26.315789473684212,
 31.578947368421055,
 36.8421052631579,
 42.10526315789474,
 47.36842105263158,
 52.631578947368425,
 57.89473684210527,
 63.15789473684211,
 68.42105263157896,
 73.6842105263158,
 78.94736842105263,
 84.21052631578948,
 89.47368421052633,
 94.73684210526316,
 100.0]
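For a multi-dimensional array, `tolist()` produces nested lists of built-in Python values, which differs from calling `list()` on the array. The example array below is illustrative.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])

nested = matrix.tolist()  # nested list of built-in Python ints
flat = list(matrix)       # list of 1D NumPy arrays, one per row

print(nested, type(nested[0][0]))
print(type(flat[0]))
```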

Get Help: View documentation

  • Prints the documentation of a function, class, or module

  • Syntax: np.info(np.function)

    • function - linspace, logspace, eye, ones, zeros etc.

np.info(np.linspace)
 linspace(*args, **kwargs)

Return evenly spaced numbers over a specified interval.

Returns `num` evenly spaced samples, calculated over the
interval [`start`, `stop`].

The endpoint of the interval can optionally be excluded.

.. versionchanged:: 1.16.0
    Non-scalar `start` and `stop` are now supported.

Parameters
----------
start : array_like
    The starting value of the sequence.
stop : array_like
    The end value of the sequence, unless `endpoint` is set to False.
    In that case, the sequence consists of all but the last of ``num + 1``
    evenly spaced samples, so that `stop` is excluded.  Note that the step
    size changes when `endpoint` is False.
num : int, optional
    Number of samples to generate. Default is 50. Must be non-negative.
endpoint : bool, optional
    If True, `stop` is the last sample. Otherwise, it is not included.
    Default is True.
retstep : bool, optional
    If True, return (`samples`, `step`), where `step` is the spacing
    between samples.
dtype : dtype, optional
    The type of the output array.  If `dtype` is not given, infer the data
    type from the other input arguments.

    .. versionadded:: 1.9.0

axis : int, optional
    The axis in the result to store the samples.  Relevant only if start
    or stop are array-like.  By default (0), the samples will be along a
    new axis inserted at the beginning. Use -1 to get an axis at the end.

    .. versionadded:: 1.16.0

Returns
-------
samples : ndarray
    There are `num` equally spaced samples in the closed interval
    ``[start, stop]`` or the half-open interval ``[start, stop)``
    (depending on whether `endpoint` is True or False).
step : float, optional
    Only returned if `retstep` is True

    Size of spacing between samples.


See Also
--------
arange : Similar to `linspace`, but uses a step size (instead of the
         number of samples).
geomspace : Similar to `linspace`, but with numbers spaced evenly on a log
            scale (a geometric progression).
logspace : Similar to `geomspace`, but with the end points specified as
           logarithms.

Examples
--------
>>> np.linspace(2.0, 3.0, num=5)
array([2.  , 2.25, 2.5 , 2.75, 3.  ])
>>> np.linspace(2.0, 3.0, num=5, endpoint=False)
array([2. ,  2.2,  2.4,  2.6,  2.8])
>>> np.linspace(2.0, 3.0, num=5, retstep=True)
(array([2.  ,  2.25,  2.5 ,  2.75,  3.  ]), 0.25)

Graphical illustration:

>>> import matplotlib.pyplot as plt
>>> N = 8
>>> y = np.zeros(N)
>>> x1 = np.linspace(0, 10, N, endpoint=True)
>>> x2 = np.linspace(0, 10, N, endpoint=False)
>>> plt.plot(x1, y, 'o')
[<matplotlib.lines.Line2D object at 0x...>]
>>> plt.plot(x2, y + 0.5, 'o')
[<matplotlib.lines.Line2D object at 0x...>]
>>> plt.ylim([-0.5, 1])
(-0.5, 1)
>>> plt.show()
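The docstring that `np.info` displays is also available on the function object itself, which is handy in a script. The sketch below reads its first line and then checks the `endpoint` and `retstep` behaviour described in the documentation above; the variable names are illustrative.

```python
import numpy as np

# The docstring is stored on the function object
doc = np.linspace.__doc__
summary = doc.strip().splitlines()[0]
print(summary)

# Checking the documented behaviour of retstep and endpoint
samples, step = np.linspace(2.0, 3.0, num=5, retstep=True)
open_ended = np.linspace(2.0, 3.0, num=5, endpoint=False)
print(samples, step)
print(open_ended)
```

Python's built-in `help(np.linspace)` shows the same documentation in an interactive pager.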

References

  • https://numpy.org/

  • https://www.edureka.co/blog/python-numpy-tutorial/

  • https://github.com/enthought/Numpy-Tutorial-SciPyConf-2019

  • Python Machine Learning Cookbook


This notebook was created by Jubayer Hossain | Copyright © 2020, Jubayer Hossain