## Introduction

Distance is a measure of the separation between two points in space. Machine learning models use different types of distances, such as Euclidean distance, Manhattan distance, Minkowski distance, and cosine distance. Algorithms like k-nearest neighbors and k-means clustering use the Euclidean, Manhattan, and Minkowski metrics for distance calculation, while cosine distance is widely used in recommendation algorithms.

In this blog, we will learn about some of the distance measures widely used in machine learning algorithms. We will look at the mathematical formulas associated with them and briefly discuss their use cases.

### Types of distance

**Euclidean distance (L2 Norms)**

Euclidean distance is calculated using the Pythagorean theorem. For a better understanding, let’s look at the figure below.

Let’s say that P and Q are two points of a dataset with coordinates (x1, y1) and (x2, y2) respectively, and let R = (x2, y1) be the point that completes the right triangle. The distance between P and R is **PR = x2-x1**, and the distance between R and Q is **QR = y2-y1**. To calculate the distance between P and Q, we use the Pythagorean theorem.

According to the Pythagorean theorem,

*PQ^{2} = PR^{2} + QR^{2}*

*PQ^{2} = (x2-x1)^{2} + (y2-y1)^{2}*

*PQ = [(x2-x1)^{2} + (y2-y1)^{2}]^{1/2}*

This equation gives the required distance in 2-D. For 3-D, add the z component so that the formula for distance becomes

*[(x2-x1)^{2} + (y2-y1)^{2} + (z2-z1)^{2}]^{1/2}*
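The formula above translates directly into code. Here is a minimal sketch of a (hypothetical) `euclidean_distance` helper that works for points of any dimension, not just 2-D or 3-D:

```python
import math

def euclidean_distance(p, q):
    """Euclidean (L2) distance: square root of the sum of squared
    coordinate differences between two equal-length points."""
    return math.sqrt(sum((qi - pi) ** 2 for pi, qi in zip(p, q)))

# 2-D example: legs of length 3 and 4 give a hypotenuse of 5
print(euclidean_distance((1, 2), (4, 6)))  # → 5.0
```

The same function handles a 3-D point simply by passing three coordinates, since the sum runs over every axis.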

**Manhattan distance(L1 Norms)**

Unlike Euclidean distance, Manhattan distance doesn’t use the Pythagorean theorem. Instead, the distance is calculated as the sum of the absolute lengths of the line’s projections onto the coordinate axes.

P and Q are the two points, and PQ is the straight line joining them. P′Q′ is the projection of the line PQ onto the x-axis and P″Q″ is the projection of PQ onto the y-axis. The sum of the absolute lengths of P′Q′ and P″Q″ gives the Manhattan distance.

*PQ = |x2-x1| + |y2-y1|*
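The absolute-value formula is equally simple to implement. A minimal sketch, again generalized to any number of dimensions (the function name is illustrative):

```python
def manhattan_distance(p, q):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(qi - pi) for pi, qi in zip(p, q))

# |4 - 1| + |6 - 2| = 3 + 4 = 7
print(manhattan_distance((1, 2), (4, 6)))  # → 7
```

Note that for the same pair of points, the Manhattan distance (7) is larger than the Euclidean distance (5), since it follows the grid rather than the straight line.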

**Minkowski distance**

Minkowski distance is the distance between any two points in a normed vector space. It is the generalization of Euclidean distance and Manhattan distance. For X = (x1, x2, x3, ..., x_{n}) and Y = (y1, y2, y3, ..., y_{n}) ∈ R^{n}, the Minkowski distance of order p is

*D(X, Y) = (Σ_{i=1}^{n} |x_{i} - y_{i}|^{p})^{1/p}*

For p = 1 this reduces to the Manhattan distance, and for p = 2 to the Euclidean distance.
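The generalization is easy to see in code: a single (illustrative) `minkowski_distance` function reproduces both earlier metrics by varying the order p.

```python
def minkowski_distance(x, y, p):
    """Minkowski distance of order p: the p-th root of the sum of
    absolute coordinate differences raised to the power p."""
    return sum(abs(xi - yi) ** p for xi, yi in zip(x, y)) ** (1 / p)

# p = 1 gives the Manhattan distance, p = 2 the Euclidean distance
print(minkowski_distance((1, 2), (4, 6), 1))  # → 7.0
print(minkowski_distance((1, 2), (4, 6), 2))  # → 5.0
```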

**Cosine similarity and Cosine distance**

Cosine similarity gives a measure of the closeness, or degree of similarity, between two or more points. Cosine similarity and cosine distance are widely used in recommendation systems, such as movie recommendation systems.

Theta (θ) is the angle between the vectors drawn from the origin to the two points P and Q. Cosine similarity measures the similarity between P and Q using the formula cosine-similarity = cosθ. Cosine distance is given by 1 - cosθ.

For example:

Let’s say the angle between points P and Q is θ = 90 degrees. The cosine similarity is cosθ, which gives the value of 0, showing that there is no similarity between these points. The cosine distance is 1 - cosθ, which gives the value of 1, indicating that the distance between these points is large.
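Rather than computing θ directly, cosine similarity is usually computed from the dot product and the vector lengths, since cosθ = (P · Q) / (|P| |Q|). A minimal sketch (function names are illustrative):

```python
import math

def cosine_similarity(p, q):
    """cos(theta) between two vectors: dot product divided by the
    product of their Euclidean norms."""
    dot = sum(pi * qi for pi, qi in zip(p, q))
    norm_p = math.sqrt(sum(pi ** 2 for pi in p))
    norm_q = math.sqrt(sum(qi ** 2 for qi in q))
    return dot / (norm_p * norm_q)

def cosine_distance(p, q):
    return 1 - cosine_similarity(p, q)

# Perpendicular vectors (theta = 90 degrees), as in the example above
print(cosine_similarity((1, 0), (0, 1)))  # → 0.0
print(cosine_distance((1, 0), (0, 1)))    # → 1.0
```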

**Jaccard distance**

Jaccard distance measures the dissimilarity between two sets. As with cosine similarity and cosine distance, a similarity between the two sets is computed first, and the distance is then derived from it. The formula for Jaccard distance is:

Jaccard distance = 1 – SIM(X, Y), where SIM(X, Y) is the Jaccard similarity of the two sets X and Y.

The Jaccard similarity is the size of the intersection of the two sets divided by the size of their union:

*SIM(X, Y) = |X ∩ Y| / |X ∪ Y|*

For example:

A = {1,2,3,4,5,6,7,8} and B = {1,3,5,10,15}

The Jaccard similarity is calculated as

*SIM(A, B) = |A ∩ B| / |A ∪ B| = |{1, 3, 5}| / |{1, 2, 3, 4, 5, 6, 7, 8, 10, 15}| = 3/10 = 0.3*

And hence the Jaccard distance is 1 – 0.3 = 0.7. This result shows that the two sets have 30% similarity.
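The worked example above maps directly onto Python’s built-in set operations. A minimal sketch (function names are illustrative):

```python
def jaccard_similarity(x, y):
    """Size of the intersection divided by the size of the union."""
    x, y = set(x), set(y)
    return len(x & y) / len(x | y)

def jaccard_distance(x, y):
    return 1 - jaccard_similarity(x, y)

A = {1, 2, 3, 4, 5, 6, 7, 8}
B = {1, 3, 5, 10, 15}

# Intersection {1, 3, 5} has 3 elements; union has 10 elements
print(jaccard_similarity(A, B))  # → 0.3
print(jaccard_distance(A, B))    # → 0.7
```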