Introduction
Hyperparameters are the settings that control the learning process of a machine learning algorithm. Unlike the model's internal parameters, which are learned from the data, they are set before training starts. Hyperparameter tuning is the process of selecting the set of hyperparameter values that lets the algorithm learn the patterns in the data efficiently and produce a well-performing model.
Why do we need to perform hyperparameter tuning?
A machine learning algorithm may need different constraints or weights to identify the patterns in different datasets. The default parameters are not suitable for every kind of data, so a model trained with them may not perform as well as it could.
Selecting good parameter values for an algorithm is essential because they determine how the algorithm learns and how well it performs. With hyperparameter tuning, we can choose values that let the model make good predictions and perform well enough to solve the problem.
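To make the distinction concrete, here is a minimal sketch of the difference between a hyperparameter (chosen before training) and the weights a model learns. It uses synthetic data from make_classification as a stand-in, and the names X_demo, y_demo and clf_demo are only illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption; this is not the heart disease dataset)
X_demo, y_demo = make_classification(n_samples=200, n_features=5, random_state=0)

# C is a hyperparameter: we pick its value before training starts
clf_demo = LogisticRegression(C=0.5, max_iter=1000)
hyperparams = clf_demo.get_params()  # all settings chosen before fitting

clf_demo.fit(X_demo, y_demo)

print(hyperparams["C"])      # the value we chose
print(clf_demo.coef_.shape)  # the learned weights: one per feature
```

Tuning changes values such as C; the weights in coef_ are then learned again for every value that is tried.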
Things one should know about hyperparameter tuning
- It is computationally expensive
- It often yields only a small improvement in model performance
- It is time-consuming
Hyperparameter tuning using GridSearchCV
This method trains the model on every possible combination of the given parameter values and computes the corresponding cross-validated accuracy for each. Because it tries every combination, it is time-consuming and computationally very expensive.
For demonstration, I’ll use Jupyter Notebook and the heart disease prediction dataset from Kaggle.
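To get a feel for the cost, the number of combinations can be counted before running anything. The sketch below uses scikit-learn's ParameterGrid on the same grid as the GridSearchCV example in this article; n_folds = 5 assumes the default cv setting of GridSearchCV.

```python
from sklearn.model_selection import ParameterGrid

# Same grid as in the GridSearchCV example in this article
parameters = {
    'penalty': ['l1', 'l2', 'elasticnet', 'none'],
    'C': [0.8, 0.9, 1.0, 1.2, 1.4],
    'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
}

n_combinations = len(ParameterGrid(parameters))  # 4 * 5 * 5
n_folds = 5  # assumption: GridSearchCV's default cv
n_fits = n_combinations * n_folds

print(n_combinations, n_fits)  # 100 500
```

Every extra value added to one of the lists multiplies the total, which is why the cost of a grid search grows so quickly.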
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the data: every column except the last is a feature, the last is the target
df = pd.read_csv('heart.csv')
X = df.iloc[:, :-1]
y = df.iloc[:, -1]

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2)

# Train a baseline model with the default hyperparameters
LR = LogisticRegression()
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)
Output
0.819672131147541
Logistic Regression gives an accuracy of about 82% without hyperparameter tuning. Let’s take a look at how the accuracy of the model changes after hyperparameter tuning.
Implementation of GridSearchCV
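from sklearn.model_selection import GridSearchCV

# Candidate values for each hyperparameter.
# Note: not every penalty/solver pair is supported (for example, 'newton-cg'
# does not support 'l1'); by default GridSearchCV warns about the failed fits
# and scores them as NaN. Newer scikit-learn releases expect None instead of
# the string 'none'.
parameters = {
    'penalty' : ['l1', 'l2', 'elasticnet', 'none'],
    'C' : [0.8, 0.9, 1.0, 1.2, 1.4],
    'solver': ['newton-cg','lbfgs', 'liblinear','sag', 'saga']
}

LR = LogisticRegression()
clf = GridSearchCV(LR, parameters)
# The search is fit on the full dataset here; fitting on X_train, y_train
# instead keeps the test rows unseen during tuning.
clf.fit(X, y)
clf.best_params_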
Output
{'C': 0.9, 'penalty': 'l2', 'solver': 'newton-cg'}
These are the parameters best suited to this model among those searched. Now let’s implement the model using these parameters and see how much the accuracy improves.
LR = LogisticRegression(C = 0.9, penalty = 'l2', solver = 'newton-cg')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)
Output
0.8360655737704918
After hyperparameter tuning, the accuracy of the model has increased from about 82% to about 84%.
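As a variation on the workflow above, here is a hedged sketch that runs the search on the training split only and pairs each penalty with solvers that support it, by passing GridSearchCV a list of dictionaries. It uses synthetic data as a stand-in for heart.csv; the names X_demo, X_tr, X_te, param_grid and search, and the choice of max_iter=5000, are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption; this is not the heart disease dataset)
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# A list of dicts: each dict is its own sub-grid, so only valid
# penalty/solver pairs are ever tried
param_grid = [
    {"penalty": ["l2"], "solver": ["lbfgs", "newton-cg", "liblinear"],
     "C": [0.8, 1.0, 1.2]},
    {"penalty": ["l1"], "solver": ["liblinear", "saga"],
     "C": [0.8, 1.0, 1.2]},
]

search = GridSearchCV(LogisticRegression(max_iter=5000), param_grid, cv=5)
search.fit(X_tr, y_tr)  # the test rows stay unseen during tuning

print(search.best_params_)       # best combination found in the grid
print(search.best_score_)        # mean cross-validated accuracy of that combination
print(search.score(X_te, y_te))  # accuracy of the refit best model on the held-out rows
```

Because the search refits the best model on the whole training split by default, search can be used directly for prediction without building a new LogisticRegression by hand.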
Hyperparameter tuning using RandomizedSearchCV
RandomizedSearchCV trains the model on only a randomly selected subset of the parameter combinations and checks the accuracy of each. This is less time-consuming than GridSearchCV.
One of the main disadvantages of this method is that the parameters it returns may not be the best ones, because it evaluates only some of the possible combinations.
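The sketch below shows one way the random sampling can be set up. It is not the configuration used in this article: it assumes synthetic stand-in data, samples C from a log-uniform distribution (scipy.stats.loguniform) rather than a fixed list, and fixes random_state so that the same candidates are drawn on every run.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data (an assumption; this is not the heart disease dataset)
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=0)

# A list is sampled uniformly; a distribution is sampled via its rvs() method
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "solver": ["lbfgs", "liblinear", "newton-cg"],
}

search = RandomizedSearchCV(
    LogisticRegression(max_iter=5000),
    param_distributions,
    n_iter=10,       # number of candidates to draw and evaluate
    cv=5,
    random_state=0,  # makes the draw repeatable
)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(len(search.cv_results_["params"]))  # how many candidates were evaluated
```

Only the n_iter sampled candidates are evaluated, so a good combination that is never drawn cannot be returned.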
Note: We will use the same model and same data for hyperparameter tuning with RandomizedSearchCV.
Implementation of RandomizedSearchCV
from sklearn.model_selection import RandomizedSearchCV

clf = RandomizedSearchCV(LR, parameters, n_iter= 6)
clf.fit(X, y)
clf.best_params_
Output
{'solver': 'newton-cg', 'penalty': 'l2', 'C': 1.2}
n_iter sets the number of parameter combinations that are sampled and evaluated, i.e., whose accuracy is checked. Let’s use these parameters and check the accuracy of the model.
LR = LogisticRegression(C = 1.2, penalty = 'l2', solver = 'newton-cg')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)
Output
0.8360655737704918
So, the accuracy of the model has improved from about 82% to about 84%, which is quite good. Now let’s lower n_iter to 2 and run the search again.
from sklearn.model_selection import RandomizedSearchCV

clf = RandomizedSearchCV(LR, parameters, n_iter= 2)
clf.fit(X, y)
clf.best_params_
Output
{'solver': 'saga', 'penalty': 'l1', 'C': 1.4}
Using only 2 iterations, we got these parameters as the best parameters. Let’s use these parameters to evaluate the model.
LR = LogisticRegression(C = 1.4, penalty = 'l1', solver = 'saga')
model = LR.fit(X_train, y_train)
pred = model.predict(X_test)
accuracy_score(y_test, pred)
Output
0.6721311475409836
Using the parameters obtained above has degraded the accuracy of the model to about 67%. So, RandomizedSearchCV might sometimes not end up with the best parameters for the model. n_iter must be chosen wisely: the more combinations are sampled, the better the chance of finding good parameters.
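To see why so few iterations are risky, the sketch below uses scikit-learn's ParameterSampler to list which combinations would be drawn with n_iter=2 under a few different seeds. The seeds 0, 1 and 2 are arbitrary choices for illustration, and the grid is the one used in this article.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

# Same grid as in the examples in this article
parameters = {
    'penalty': ['l1', 'l2', 'elasticnet', 'none'],
    'C': [0.8, 0.9, 1.0, 1.2, 1.4],
    'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'],
}

total = len(ParameterGrid(parameters))  # size of the full grid

draws = {}
for seed in (0, 1, 2):  # arbitrary seeds, chosen only for illustration
    # Draw 2 candidates at random from the grid
    draws[seed] = list(ParameterSampler(parameters, n_iter=2, random_state=seed))
    print(seed, draws[seed])

print(f"each run tries 2 of {total} combinations")
```

With only 2 draws out of 100, the result depends almost entirely on which combinations happen to be picked, and some of the picks may be penalty and solver pairs that the model cannot fit at all.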
Conclusion
Hyperparameter tuning is an important step in machine learning. It is used to select the best hyperparameters for a machine learning algorithm so that the algorithm can learn the patterns in the data and perform well on the problem.
GridSearchCV evaluates every possible combination of the given parameter values and selects the best one, while RandomizedSearchCV evaluates a randomly selected subset of the combinations and selects the best among those. GridSearchCV is more computationally expensive than RandomizedSearchCV, but it always finds the best combination within the grid it is given, whereas RandomizedSearchCV may miss it.
Happy Learning 🙂