Model Evaluation in Machine Learning

One of the most important tasks for a data scientist is to measure and optimize the predictive accuracy of the machine learning models they have built. There are various approaches to doing this, but they can be grouped into three key steps.

Sebastian Raschka, author of the bestselling book “Python Machine Learning” and a Ph.D. candidate at Michigan State University developing new computational methods in computational biology, has published an excellent article describing these steps.

In a nutshell, he breaks down the evaluation process into three main steps:

Data generalisation – ensure that the training data and the test data both have good ‘variance’ and a fair proportion of each class. Techniques and concepts that help here include:
- Stratification
- Cross-validation – k-fold or bootstrap
- Hold-out method – training data set, hold-out data set, test data set
- Bias-variance trade-off
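
To make stratification and k-fold cross-validation concrete, here is a minimal sketch in plain Python. It is not taken from Sebastian's article; the function name `stratified_k_fold` and the toy labels are assumptions made purely for illustration. It deals each class's samples round-robin into k folds, so every test fold keeps roughly the same class proportions as the full data set.

```python
import random
from collections import Counter, defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds keep class proportions."""
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into the folds
    # so that every fold receives a fair share of every class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy, made-up labels: 80% class "a" and 20% class "b".
labels = ["a"] * 40 + ["b"] * 10
splits = list(stratified_k_fold(labels, k=5))
for train_idx, test_idx in splits:
    print(Counter(labels[i] for i in test_idx))
```

With these toy labels, every test fold contains 8 samples of class "a" and 2 of class "b", mirroring the 80/20 split of the whole data set. A plain hold-out split is the special case of using just one of these folds as the test set.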
Algorithm selection – picking the algorithm that is best suited to the use case at hand
Model selection
1. Hyperparameter tuning – using cross-validation techniques
2. ‘Model parameters’ belong to models and are learned from the data, while ‘hyperparameters’ (also called tuning parameters) belong to algorithms and are set before training; e.g., the depth of the trees in a Random Forest
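
The difference between model parameters and hyperparameters can be sketched with a small example. This one is an illustration only, not code from the tutorial: it swaps the Random Forest for a one-dimensional ridge regression because that fits in a few lines of plain Python. The slope `w` is the model parameter learned from the data, the penalty `lam` is the hyperparameter chosen by k-fold cross-validation, and the synthetic data, the candidate values and all function names are assumptions.

```python
import random


def fit_ridge_slope(xs, ys, lam):
    """Learn the model parameter (slope w) for y ~ w * x with an L2 penalty lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mean_squared_error(xs, ys, w):
    """Average squared prediction error of the line y = w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_val_mse(xs, ys, lam, k=5):
    """Average held-out MSE over k folds for one hyperparameter value."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th sample, starting at `fold`, forms the held-out fold.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        w = fit_ridge_slope(train_x, train_y, lam)
        fold_errors.append(mean_squared_error(test_x, test_y, w))
    return sum(fold_errors) / k


# Synthetic data (an assumption for this sketch): true slope 2.0 plus noise.
rng = random.Random(42)
xs = [rng.uniform(-1, 1) for _ in range(30)]
ys = [2.0 * x + rng.gauss(0, 0.3) for x in xs]

# Hyperparameter tuning: score each candidate by cross-validation, keep the best.
candidate_lams = [0.0, 0.1, 1.0, 10.0]
cv_scores = {lam: cross_val_mse(xs, ys, lam) for lam in candidate_lams}
best_lam = min(cv_scores, key=cv_scores.get)

# Refit on all the data with the chosen hyperparameter to get the final model.
final_w = fit_ridge_slope(xs, ys, best_lam)
print(best_lam, round(final_w, 3))
```

The same pattern applies to tree depth in a Random Forest: the candidate depths play the role of `candidate_lams`, and the fitted trees play the role of `w`.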

Sebastian has put together a detailed three-part tutorial in which he goes into the details of each of these steps.

These are great reads for anyone who is having a tough time picking the right model for their ML project or measuring its efficiency and accuracy.

Title Image courtesy: biguru.wordpress.com

Published by Hari