# Introducing NumPy in Python for Data Science

### Introduction

NumPy is a Python library for numerical computing on large arrays and matrices. It provides fast, memory-efficient processing of n-dimensional arrays.

The elements of a NumPy array all share one data type and are stored in a single contiguous block of memory, which makes processing them fast and efficient. A Python list, by contrast, stores references to objects that can live anywhere in memory.
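To make the memory layout a little more concrete, here is a minimal sketch that inspects an array's buffer. The dtype is set to `np.int64` explicitly for this illustration, so the byte counts shown are an assumption of the example rather than something that holds for every array.

```python
import numpy as np

# A small array of 64-bit integers (dtype chosen explicitly for this sketch).
arr = np.array([1, 3, 5, 6, 8, 9], dtype=np.int64)

# Every element has the same type and therefore the same size in bytes.
print(arr.dtype)      # int64
print(arr.itemsize)   # 8

# The data buffer holds itemsize * size bytes in one contiguous block.
print(arr.nbytes)                  # 48
print(arr.flags["C_CONTIGUOUS"])   # True

# Moving to the next element means stepping exactly itemsize bytes.
print(arr.strides)    # (8,)
```

Because every element sits a fixed number of bytes after the previous one, NumPy can loop over the data in compiled code without following a reference for each element.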

### Installation of numpy

`$ pip install numpy`

### Creation of numpy array

First, we need to import the NumPy module:

`import numpy as np`

Here, the module is imported under the alias np, so we can write np instead of numpy.

### One dimensional array

Let’s create a NumPy array:

```python
arr = np.array([1,3,5,6,8,9])
print("Array Created: ", arr)
```

Output

`Array Created:  [1 3 5 6 8 9]`

Now, let’s check the type of the arr object we just created:

`print("Type of arr is: ", type(arr))`

Output

`Type of arr is:  <class 'numpy.ndarray'>`

We can also create a one-dimensional array using the arange() function:

```python
arr = np.arange(10)
print("Array created: ", arr)
```

Output

`Array created:  [0 1 2 3 4 5 6 7 8 9]`

Note: NumPy’s arange() function is similar to Python’s built-in range() function.
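The similarity to range() extends to the start, stop, and step arguments. The short sketch below compares the two; the variable names `evens` and `halves` are just illustrative.

```python
import numpy as np

# Like range(), arange() accepts start, stop (exclusive), and step.
evens = np.arange(2, 11, 2)
print(evens)                    # [ 2  4  6  8 10]
print(list(range(2, 11, 2)))    # [2, 4, 6, 8, 10]

# Unlike range(), arange() also accepts non-integer steps.
halves = np.arange(0, 2, 0.5)
print(halves)                   # [0.  0.5 1.  1.5]
```

The main difference is the return type: range() produces a lazy sequence of Python integers, while arange() returns an ndarray.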

Now, let’s create a NumPy array from an existing Python list:

```python
a = [1,2,3,4,5]
arr = np.array(a)
print("Array created from list: ", arr)
```

Output

`Array created from list:  [1 2 3 4 5]`

### Two dimensional array

Previously we saw how to create a one-dimensional array using NumPy. Now let’s see how to create a two-dimensional array:

```python
a = [1,2,3,4,5]
b = [6,7,8,9,10]
arr = np.array([a, b])
print(arr)
```

Output

```
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
```

This creates a two-dimensional array from two existing lists, where each list becomes one row.

```python
arr = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print(arr)
```

Output

```
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
```

This is another method, where we pass the nested lists directly to np.array() to create the two-dimensional array.

### The shape of an array

We can determine the shape of an array using its shape attribute:

```python
arr = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print("Shape of array is: ", arr.shape)
```

Output

`Shape of array is:  (2, 5)`

The shape attribute returns a tuple with the length of each dimension. For a two-dimensional array that is the number of rows and columns; in this case, there are 2 rows and 5 columns.
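As a small illustration, shape is read without parentheses because it is an attribute, and the value it returns is an ordinary tuple. The names `one_d` and `two_d` below are only for this sketch.

```python
import numpy as np

one_d = np.array([1, 2, 3, 4, 5])
two_d = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

# A one-dimensional array has a one-element shape tuple.
print(one_d.shape)   # (5,)
print(two_d.shape)   # (2, 5)

# The tuple can be unpacked like any other tuple.
rows, cols = two_d.shape
print(rows, cols)    # 2 5
```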

### Dimension of an array

We can find the number of dimensions of a NumPy array using its ndim attribute:

```python
arr = np.array([[1,2,3,4,5],[6,7,8,9,10]])
print("Dimension of array is: ", arr.ndim)
```

Output

`Dimension of array is:  2`

Let’s see another example:

```python
arr = np.array([1,2,3,4,5])
print("Dimension of array is: ", arr.ndim)
```

Output

`Dimension of array is:  1`

A three-dimensional array works the same way:

```python
arr = np.array([[[1,2,3,4,5], [1,2,3,4,5]], [[1,2,3,4,5], [1,2,3,4,5]]])
print(arr)
print("Dimension of array is: ", arr.ndim)
```

Output

```
[[[1 2 3 4 5]
  [1 2 3 4 5]]

 [[1 2 3 4 5]
  [1 2 3 4 5]]]
Dimension of array is:  3
```

### Size of an array

We can also check the size of a NumPy array using its size attribute:

```python
a = [1,3,5,6,8,9,6,8]
arr = np.array(a)
print("size of arr is: ", arr.size)
```

Output

`size of arr is:  8`

The size attribute returns the total number of elements in an array, across all dimensions:

```python
arr = np.array([[[1,2,3,4,5], [1,2,3,4,5]], [[1,2,3,4,5], [1,2,3,4,5]]])
print("size of array is: ", arr.size)
```

Output

`size of array is:  20`
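The value 20 follows from the shape of the array. The sketch below ties shape, ndim, and size together for the same three-dimensional array; the comparison with `len()` is added here only for contrast.

```python
import numpy as np

arr = np.array([[[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]],
                [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]])

print(arr.shape)            # (2, 2, 5)
print(arr.ndim)             # 3

# size equals the product of the dimension lengths: 2 * 2 * 5.
print(arr.size)             # 20
print(np.prod(arr.shape))   # 20

# len() only counts entries along the first dimension.
print(len(arr))             # 2
```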

### Accessing elements of an array

Array elements can be accessed by index, just like the elements of a list or tuple:

```python
a = [1,2,3,4,5]
arr = np.array(a)
print(arr[0])
print(arr[2])
print(arr[4])
```

Output

```
1
3
5
```

Elements can also be accessed using a loop:

```python
for ele in arr:
    print(ele)
```

Output

```
1
2
3
4
5
```

Let’s see an example with a 2-D array:

```python
arr = np.array([[1,2,3,4,5], [23, 45, 67, 98, 100]])
print(arr[0][1])
print(arr[0][0])
print(arr[1][3])
print(arr[1][0])
```

Output

```
2
1
98
23
```

Explanation

In the above example, we passed two lists inside an outer list to make a two-dimensional array with 2 rows and 5 columns. The first inner list is the row at index 0 and the second is the row at index 1.

To access a single element, we must therefore pass two indices: the first selects the row and the second selects the column. The first print statement prints the element in the first row and second column, which is 2. Likewise, the third print statement prints the element in the second row and fourth column, which is 98.
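NumPy also accepts both indices inside a single pair of brackets, separated by a comma. The sketch below shows that the two spellings select the same element.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5], [23, 45, 67, 98, 100]])

# Both forms select row 1, column 3.
print(arr[1][3])     # 98
print(arr[1, 3])     # 98

# A single index returns a whole row.
print(arr[1])        # [ 23  45  67  98 100]

# Negative indices count from the end of each axis.
print(arr[-1, -1])   # 100
```

The comma form is generally preferred with NumPy because it does the lookup in one step, whereas `arr[1][3]` first builds the intermediate row `arr[1]` and then indexes into it.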

### Slicing array

Syntax

`array_name[start : end : step]`

Let’s slice a one-dimensional array:

```python
a = [1,2,3,4,5]
arr = np.array(a)
print(arr[0:4])
```

Output

`[1 2 3 4]`

Array indices start at 0 and end at one less than the length of the array. In this example the slice starts at index 0 and stops at index 4. Because the end index (the upper bound) is exclusive, the slice retrieves the elements at indices 0 to 3.

`print(arr[-5:-1])`

Output

`[1 2 3 4]`

Negative indices work as well. Negative indexing starts at -1, which refers to the last element of the array, and ends at the negative length of the array, which refers to the first element. Here, the slice starts at index -5 (the first element) and stops before index -1 (the last element).
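The step part of the slicing syntax is easiest to understand with a few quick examples. This is a small sketch using the same five-element array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Omitted start and end default to the whole array; step 2 takes every second element.
print(arr[::2])     # [1 3 5]

# Start at index 1, stop before index 5, step by 2.
print(arr[1:5:2])   # [2 4]

# A negative step walks the array backwards.
print(arr[::-1])    # [5 4 3 2 1]
```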

Now let’s slice a two-dimensional array.

Syntax

`array_name[start_row : end_row : step_row, start_column : end_column : step_column]`

```python
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr)
```

Output

```
[[1 2 3]
 [4 5 6]
 [7 8 9]]
```

Now, let’s take the elements from the first two rows and the last two columns:

`print(arr[0:2, 1:])`

Output

```
[[2 3]
 [5 6]]
```

Let’s take the last element of the array:

`print(arr[2:, 2:])`

Output

`[[9]]`

Let’s take the last two elements from the second row:

`print(arr[1:2, 1:])`

Output

`[[5 6]]`
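Notice that the results above are still two-dimensional, even when they hold a single row or element, because every axis was selected with a slice. The following sketch contrasts slices with integer indices and also shows a step on both axes; the variable names are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A bare colon selects everything along that axis; an integer drops the axis.
middle_column = arr[:, 1]
print(middle_column)            # [2 5 8]
print(middle_column.shape)      # (3,)

# Using a slice instead of an integer keeps the result two-dimensional.
middle_column_2d = arr[:, 1:2]
print(middle_column_2d.shape)   # (3, 1)

# A step of 2 on both axes picks the four corner elements.
corners = arr[::2, ::2]
print(corners.tolist())         # [[1, 3], [7, 9]]
```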

### Reshaping an array

Using the reshape() method, we can create an array with a new shape from a previously defined array:

```python
arr1 = np.array([1,2,3,4,5,6])
arr2 = arr1.reshape(3,2)
print(arr2)
```

Output

```
[[1 2]
 [3 4]
 [5 6]]
```

Let’s also print the shapes before and after reshaping:

```python
arr1 = np.array([1,2,3,4,5,6])
print("shape of arr1: ", arr1.shape, "\n")
arr2 = arr1.reshape(3,2)
print(arr2, "\n")
print("shape of arr2: ", arr2.shape)
```

Output

```
shape of arr1:  (6,)

[[1 2]
 [3 4]
 [5 6]]

shape of arr2:  (3, 2)
```

Here arr2 holds the elements of arr1 in a new shape. The only requirement when reshaping is that the number of elements must be the same in both the new and the original array.

If the element counts do not match, reshape() raises an error:

```python
arr1 = np.array([1,2,3,4,5,6])
arr2 = arr1.reshape(2,2)
print(arr2)
```

Output

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-70-a75024147a88> in <module>
      1 arr1 = np.array([1,2,3,4,5,6])
----> 2 arr2 = arr1.reshape(2,2)
      3 print(arr2)

ValueError: cannot reshape array of size 6 into shape (2,2)
```

Here, we got a ValueError because we tried to reshape an array with 6 elements into a shape that holds only 4 elements.

```python
arr1 = np.array([[1,2,3,4,5,6], [7,8,9,10,11,12]])
arr2 = arr1.reshape(12,1)
print(arr2)
```

Output

```
[[ 1]
 [ 2]
 [ 3]
 [ 4]
 [ 5]
 [ 6]
 [ 7]
 [ 8]
 [ 9]
 [10]
 [11]
 [12]]
```

In both arrays, the number of elements is the same; only the shape has changed.

```python
arr2 = arr1.reshape(4, 3)
print(arr2)
```

Output

```
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
```
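Since the total number of elements is fixed, reshape() can work out one dimension for us if we pass -1 in its place. The sketch below uses a 12-element array like the one above; `arr3` and `flat` are names introduced only for this example.

```python
import numpy as np

arr1 = np.arange(1, 13)      # 12 elements: 1 .. 12

# -1 asks NumPy to infer that dimension: 12 / 3 = 4 rows.
arr2 = arr1.reshape(-1, 3)
print(arr2.shape)            # (4, 3)

# The same works for the column count: 12 / 2 = 6 columns.
arr3 = arr1.reshape(2, -1)
print(arr3.shape)            # (2, 6)

# A single -1 flattens the array back to one dimension.
flat = arr2.reshape(-1)
print(flat.shape)            # (12,)
```

Only one dimension can be given as -1, and the remaining dimensions must still divide the element count evenly, otherwise the same ValueError is raised.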

### Using append() function

Row-wise appending

Syntax

`np.append(previous_array, [array_to_be_added], axis=0)`

```python
# arr2 is the 4x3 array created in the reshape example above
a = np.array([20,21,22])
np.append(arr2,[a],axis=0)
```

Output

```
array([[ 1,  2,  3],
       [ 4,  5,  6],
       [ 7,  8,  9],
       [10, 11, 12],
       [20, 21, 22]])
```

Column-wise appending

Syntax

`np.append(previous_array, array_to_be_added, axis=1)`

```python
a = np.array([20,21,22,23])
b = a.reshape(4,-1)  # column of shape (4, 1), matching the 4 rows of arr2
np.append(arr2,b,axis=1)
```

Output

```
array([[ 1,  2,  3, 20],
       [ 4,  5,  6, 21],
       [ 7,  8,  9, 22],
       [10, 11, 12, 23]])
```
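One detail worth keeping in mind is that np.append() returns a new array and leaves its inputs untouched, so the result has to be assigned to a variable. The sketch below also shows what happens when the axis argument is left out; the names `row`, `result`, and `flat` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([7, 8, 9])

# The appended result is a new array; arr itself keeps its shape.
result = np.append(arr, [row], axis=0)
print(arr.shape)      # (2, 3)
print(result.shape)   # (3, 3)

# Without axis, both inputs are flattened before being joined.
flat = np.append(arr, row)
print(flat)           # [1 2 3 4 5 6 7 8 9]
```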

### Using insert() function

Row-wise inserting

Syntax

`np.insert(previous_array, inserting_index, array_to_be_inserted, axis=0)`

```python
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
inserting_arr = np.array([11,12,13])
print("Before insertion: ")
print(arr)
print("after insertion at index 2: ")
print(np.insert(arr, 2, inserting_arr, axis=0))
```

Output

```
Before insertion:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
after insertion at index 2:
[[ 1  2  3]
 [ 4  5  6]
 [11 12 13]
 [ 7  8  9]]
```

Column-wise inserting

Syntax

`np.insert(previous_array, inserting_index, array_to_be_inserted, axis=1)`

```python
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
inserting_arr = np.array([11,12,13])
print("Before insertion: ")
print(arr)
print("after insertion at index 1: ")
print(np.insert(arr, 1, inserting_arr, axis=1))
```

Output

```
Before insertion:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
after insertion at index 1:
[[ 1 11  2  3]
 [ 4 12  5  6]
 [ 7 13  8  9]]
```
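The same function works on one-dimensional arrays, where no axis is needed. This small sketch inserts a single value and then several values at once; `result` and `result2` are names made up for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])

# Insert the value 99 before index 2.
result = np.insert(arr, 2, 99)
print(result)    # [ 1  2 99  3  4]

# Like np.append(), np.insert() returns a new array; arr is unchanged.
print(arr)       # [1 2 3 4]

# Several values can be inserted at the same position.
result2 = np.insert(arr, 1, [10, 20])
print(result2)   # [ 1 10 20  2  3  4]
```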

### Matrix generation using numpy

We can generate matrices whose elements are all zeros or all ones using the zeros() and ones() functions:

```python
zero_matrix = np.zeros([3,3], dtype=int)
print(zero_matrix)
```

Output

```
[[0 0 0]
 [0 0 0]
 [0 0 0]]
```

Let’s take another example:

```python
ones_matrix = np.ones([4,5], dtype=float)
print(ones_matrix)
```

Output

```
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
```
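A couple of variations may be useful here. In the sketch below the shape is passed as a tuple, which works the same way as the list used above, and the dtype argument is left out to show the default.

```python
import numpy as np

# Without dtype, zeros() and ones() produce 64-bit floats.
zeros_default = np.zeros((2, 3))
print(zeros_default.dtype)   # float64

# dtype=int gives integer elements instead.
ones_int = np.ones((2, 3), dtype=int)
print(ones_int.sum())        # 6

# A single integer creates a one-dimensional array.
vec = np.zeros(4)
print(vec)                   # [0. 0. 0. 0.]
```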

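We can also generate matrices of random values. The np.random.rand() function fills an array of the given shape with samples drawn uniformly from the interval [0, 1).

Syntax

`np.random.rand(d0, d1, ..., dn)`

Each argument is the length of one dimension.

```python
arr = np.random.rand(3,5)
print(arr)
print(arr.shape)
```

Output (the values will differ on each run)

```
[[0.74206847 0.44733595 0.10237527 0.34372174 0.87838503]
 [0.48042584 0.46966427 0.318181   0.88341896 0.46838867]
 [0.02591508 0.58777176 0.07273747 0.80669176 0.69172011]]
(3, 5)
```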
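Because the values are random, running the code again gives different numbers. If repeatable results are needed, one option is to seed the generator first. The sketch below uses np.random.seed() with an arbitrary seed of 0, chosen only for illustration.

```python
import numpy as np

# Seeding makes the sequence of random numbers repeatable.
np.random.seed(0)
first = np.random.rand(3, 5)

# Re-seeding with the same value reproduces the same array.
np.random.seed(0)
second = np.random.rand(3, 5)

print(np.array_equal(first, second))               # True

# Every sample lies in the half-open interval [0, 1).
print(first.min() >= 0.0 and first.max() < 1.0)    # True
```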

### Element-wise operations on arrays

One-dimensional array

```python
arr1 = np.array([1,2,3,4])
arr2 = np.array([4,3,2,1])
print("Sum: ", arr1+arr2)
print("Difference: ", arr1-arr2)
print("Multiplication: ", arr1*arr2)
print("Division: ", arr1/arr2)
```

Output

```
Sum:  [5 5 5 5]
Difference:  [-3 -1  1  3]
Multiplication:  [4 6 6 4]
Division:  [0.25       0.66666667 1.5        4.        ]
```

Two-dimensional array

```python
arr1 = np.array([[1,2,3,4], [4,3,2,1]])
arr2 = np.array([[4,3,2,1], [4,3,2,1]])
print("Sum:\n", arr1+arr2)
print("Difference:\n", arr1-arr2)
print("Multiplication:\n", arr1*arr2)
print("Division:\n", arr1/arr2)
```

Output

```
Sum:
 [[5 5 5 5]
 [8 6 4 2]]
Difference:
 [[-3 -1  1  3]
 [ 0  0  0  0]]
Multiplication:
 [[ 4  6  6  4]
 [16  9  4  1]]
Division:
 [[0.25       0.66666667 1.5        4.        ]
 [1.         1.         1.         1.        ]]
```
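Element-wise operations also work between an array and a single number, in which case the number is applied to every element (NumPy calls this broadcasting). The sketch below adds a comparison with a plain Python list to show how the behaviour differs.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [4, 3, 2, 1]])

# The scalar is combined with every element of the array.
print((arr + 10).tolist())   # [[11, 12, 13, 14], [14, 13, 12, 11]]
print((arr * 2).tolist())    # [[2, 4, 6, 8], [8, 6, 4, 2]]
print((arr ** 2).tolist())   # [[1, 4, 9, 16], [16, 9, 4, 1]]

# For comparison, multiplying a Python list repeats it instead.
print([1, 2] * 2)            # [1, 2, 1, 2]
```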

### Conclusion

NumPy is a powerful library that provides fast computation on large arrays and matrices. Its elements are stored in contiguous memory locations, so processing an array is faster than processing the equivalent Python list.

NumPy is also used in image processing, since OpenCV represents images as NumPy arrays of pixel values. It can be combined with the Matplotlib library to plot bar charts, histograms, and more. This makes NumPy one of the most powerful and widely used libraries in Python.

Happy Learning 🙂

Reference

https://numpy.org/