Introduction
The median is the value that lies in the center of a dataset when its values are arranged in ascending order.
General steps to find the median by hand:
1. Count the number of elements (N) in your collection (list, tuple, or set) and sort them in ascending order.
2. Check whether N is odd or even.
3. If N is odd, the median is the middle value, which is at position (N + 1)/2.
4. If N is even, the median is the average of the two middle values:
   a. Find the value at position N/2.
   b. Find the value at position (N/2) + 1.
   c. Average the two values to get the median.
Positions here are counted from 1, not from 0 as in Python indexing.
For example:
1. list_1 = [2, 4, 5, 8, 14]
   Since N = 5, which is odd, the median is the middle value, 5.
2. list_2 = [9, 3, 1, 8, 3, 6]
   Here N = 6, which is even. Sort the list: [1, 3, 3, 6, 8, 9]
   Median = (3 + 6)/2 = 4.5
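The steps above can be sketched in Python almost word for word. This is a minimal illustration, not a definitive implementation: the function name median_by_position is made up for this example, and the empty-input check is an added assumption, since the steps do not say what to do when N = 0. The main point is the conversion from 1-based positions to Python's 0-based indices.

```python
def median_by_position(values):
    """Return the median by following the numbered steps literally."""
    ordered = sorted(values)          # step 1: sort ascending (accepts list, tuple, or set)
    n = len(ordered)
    if n == 0:
        # assumption: the steps above do not cover empty input, so reject it
        raise ValueError("median of an empty collection is undefined")
    if n % 2 == 1:
        position = (n + 1) // 2       # 1-based position of the middle value
        return ordered[position - 1]  # convert to a 0-based index
    lower = ordered[n // 2 - 1]       # value at 1-based position N/2
    upper = ordered[n // 2]           # value at 1-based position (N/2) + 1
    return (lower + upper) / 2


print(median_by_position([2, 4, 5, 8, 14]))     # 5
print(median_by_position([9, 3, 1, 8, 3, 6]))   # 4.5
```

The two calls reproduce the worked examples: 5 for list_1 and 4.5 for list_2.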
Ways to find the median in Python
Python offers several ways to find the median of a list. This post covers three of them:
A. Using a user-defined function
Instead of using a library, we can write our own function that calculates the median of a list by following the steps above.
def calculate_median(nums):
    """function to calculate the median of the list"""
    print(f"The median of the list {nums} is", end=" ")
    # calculate length of list
    N = len(nums)
    # sort into a new list in ascending order (the caller's list is left unchanged)
    ordered = sorted(nums)
    # floor division // rounds the result down to the nearest whole number
    mid_index = N // 2
    if N % 2 != 0:
        # odd length: the single middle value
        print(ordered[mid_index])
    else:
        # even length: average of the two middle values
        print((ordered[mid_index - 1] + ordered[mid_index]) / 2)

# list containing all unique items
calculate_median([7, 2, 1, 9, 4])
# list containing duplicate items
calculate_median([11, 7, 11, 18, 9, 7, 6, 23, 7])
# list containing floating point numbers
calculate_median([1.7, 3.4, 10.9, 4.2])
Output:
The median of the list [7, 2, 1, 9, 4] is 4
The median of the list [11, 7, 11, 18, 9, 7, 6, 23, 7] is 9
The median of the list [1.7, 3.4, 10.9, 4.2] is 3.8
B. Using Python's built-in statistics module
statistics is a module in Python's standard library that provides functions for calculating mathematical statistics of numeric data.
This module includes functions that can calculate the arithmetic mean, geometric mean, harmonic mean, median, mode, quantiles, and many others.
from statistics import median

def calculate_median(nums):
    """function to calculate the median of the list using Python statistics module"""
    print(f"The median of the list {nums} is", median(nums))

# list containing all unique items
calculate_median([33, 55, 77, 22, 11, 88, 44])
# list containing duplicate items
calculate_median([4, 12, 14, 17, 22, 23, 23, 24, 25, 29, 40, 67, 77, 82, 92])
# list containing floating point numbers
calculate_median([1.79, 1.61, 2.09, 1.84, 1.96, 2.11, 1.75])
Output:
The median of the list [33, 55, 77, 22, 11, 88, 44] is 44
The median of the list [4, 12, 14, 17, 22, 23, 23, 24, 25, 29, 40, 67, 77, 82, 92] is 24
The median of the list [1.79, 1.61, 2.09, 1.84, 1.96, 2.11, 1.75] is 1.84
Note: if you pass an empty list, statistics.median raises StatisticsError.
.....
.....
.....
# calling the function with an empty list
calculate_median([])
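If an empty list is a realistic input, one option is to catch the exception and fall back to a default. The sketch below assumes that returning a caller-supplied default is acceptable for your use case; safe_median and its default parameter are names invented for this example and are not part of the statistics module.

```python
from statistics import StatisticsError, median


def safe_median(nums, default=None):
    """Return the median of nums, or default when nums is empty."""
    try:
        return median(nums)
    except StatisticsError:
        # statistics.median raises StatisticsError when given no data
        return default


print(safe_median([33, 55, 77, 22, 11, 88, 44]))  # 44
print(safe_median([]))                            # None
```

Whether to swallow the error or let it propagate depends on the program. Letting it propagate is often the better choice when an empty list points to a bug upstream.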
C. Using the third-party library NumPy
NumPy is a third-party library for scientific computing in Python. It can be used to perform various mathematical operations on arrays.
We can install it with pip inside a virtual environment:
pip install numpy
import numpy as np

def calculate_median(nums):
    """function to calculate the median of the list using third-party library NumPy"""
    print(f"The median of the list {nums} is", np.median(nums))

# list containing all unique items
calculate_median([34, 344, 12, 77, 45, 90])
# list containing duplicate items
calculate_median([1001, 777, 1001, 189, 189, 777, 6098, 23, 777])
# list containing floating point numbers
calculate_median([1.47, 56.84, 134.9, 565.2])
# empty list
calculate_median([])
Output:
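The median of the list [34, 344, 12, 77, 45, 90] is 61.0
The median of the list [1001, 777, 1001, 189, 189, 777, 6098, 23, 777] is 777.0
The median of the list [1.47, 56.84, 134.9, 565.2] is 95.87
Note: unlike the statistics module, np.median does not raise an exception for the empty list in the last call. It returns nan and emits a RuntimeWarning.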
Conclusion
We have discussed three methods for calculating the median of a list. Which one fits best depends on the use case. For example, if you don't want to depend on Python's built-in module or a third-party library, you can write your own function and tweak it to produce the formatted output you want.
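As one illustration of such a tweak, the sketch below returns the median instead of printing it and then formats it to a fixed number of decimal places. The names median_value and format_median, the digits parameter, and the "Median:" label are all choices made for this example only.

```python
def median_value(nums):
    """Return the median of nums without modifying the input."""
    ordered = sorted(nums)
    n = len(ordered)
    if n == 0:
        # assumption for this sketch: treat empty input as an error
        raise ValueError("median of an empty list is undefined")
    mid = n // 2
    if n % 2 != 0:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def format_median(nums, digits=2):
    """Return the median as text, rounded to the given number of decimals."""
    return f"Median: {median_value(nums):.{digits}f}"


print(format_median([7, 2, 1, 9, 4]))                   # Median: 4.00
print(format_median([1.7, 3.4, 10.9, 4.2], digits=1))   # Median: 3.8
```

Separating the calculation from the formatting keeps the median function reusable, because callers that need the number itself can call median_value directly.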
References:
Statistics Module
NumPy