Class notes

Model validation

Uploaded on: 29-01-2024

Written in: 2023/2024

This document covers model validation.

MACHINE LEARNING LECTURE NOTES

UNIT – IV

Model Validation in Classification: Cross Validation - Holdout Method, K-Fold, Stratified K-Fold,
Leave-One-Out Cross Validation. Bias-Variance Tradeoff, Regularization, Overfitting, Underfitting.
Ensemble Methods: Boosting, Bagging, Random Forest.

Cross-Validation in Machine Learning

Cross-validation is a technique for assessing how well a model performs by training it on one subset of
the input data and testing it on a previously unseen subset. Equivalently, it is a technique for checking
how well a statistical model generalizes to an independent dataset.

In machine learning, there is always a need to test the stability of the model. We cannot judge a model
based only on how well it fits the training dataset. For this purpose, we reserve a particular sample of
the dataset that is not part of the training dataset. We then test our model on that sample before
deployment; this complete process comes under cross-validation. It differs from a single train-test
split in that most cross-validation methods repeat the split several times and average the results.

Hence the basic steps of cross-validation are:

o Reserve a subset of the dataset as a validation set.
o Train the model using the training dataset.
o Evaluate model performance using the validation set. If the model performs well on the
validation set, proceed to the next step; otherwise, check for issues.
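The three steps above can be sketched in plain Python. This is only an illustration: the data set, the
helper names (holdout_split, fit_threshold, accuracy) and the toy threshold "model" are assumptions made
for the example, not part of the original notes.

```python
import random

def holdout_split(rows, validation_fraction=0.25, seed=0):
    """Step 1: shuffle the rows and reserve a fraction as the validation set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]  # (training set, validation set)

def fit_threshold(train):
    """Step 2: toy 'model' (hypothetical): midpoint between the two class means."""
    class0 = [x for x, y in train if y == 0]
    class1 = [x for x, y in train if y == 1]
    return (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2

def accuracy(threshold, rows):
    """Step 3: fraction of rows whose predicted class matches the label."""
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

# Made-up 1-D data: (feature, label)
data = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 0), (3.5, 0),
        (6.0, 1), (6.5, 1), (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1)]

train_set, validation_set = holdout_split(data)
threshold = fit_threshold(train_set)
validation_accuracy = accuracy(threshold, validation_set)
print(f"validation accuracy: {validation_accuracy:.2f}")
```

The validation rows are never seen by fit_threshold, so the reported accuracy is an estimate of
performance on unseen data rather than a measure of fit to the training data.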

The data needs to be split into:
o Training data: used for model development
o Validation data: used for validating the performance of the same model

BY
B SARITHA

Extended version of Cross validation

There is always a need to validate the stability of your machine learning model. You can't just fit the
model to your training data and hope it will work accurately on real data it has never seen
before. You need some assurance that your model has captured most of the patterns in the data
correctly and is not picking up too much of the noise; in other words, that it is low on both bias and variance.

Validation

The process of deciding whether the numerical results quantifying hypothesized relationships between
variables are acceptable as descriptions of the data is known as validation. Generally, an error estimate
for the model is made after training, better known as evaluation of residuals. In this process, a numerical
estimate of the difference between the predicted and original responses is computed, also called the
training error. However, this only tells us how well the model does on the data used to train it. The model
may still be underfitting or overfitting the data. So the problem with this evaluation technique is that
it gives no indication of how well the learner will generalize to an independent, unseen dataset.
Getting this idea about our model is known as cross-validation.
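The gap between training error and error on unseen data can be illustrated with a small sketch. A
1-nearest-neighbour classifier is used here only because it memorizes its training data; the data
points, the deliberately noisy label and the helper names are assumptions made for the example.

```python
def predict_1nn(train, x):
    """Return the label of the training point whose feature is closest to x."""
    _, nearest_label = min(train, key=lambda row: abs(row[0] - x))
    return nearest_label

def error_rate(train, rows):
    """Fraction of rows that the 1-NN model fitted on `train` misclassifies."""
    wrong = sum(predict_1nn(train, x) != y for x, y in rows)
    return wrong / len(rows)

# Made-up 1-D data: (feature, label). The point (3.0, 1) is a deliberately noisy label.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
held_out = [(2.9, 0), (3.2, 0), (5.5, 1), (7.5, 1)]

training_error = error_rate(train, train)     # evaluated on the data used for fitting
held_out_error = error_rate(train, held_out)  # evaluated on data the model never saw
print(training_error, held_out_error)
```

Because the model has memorized the noisy point, the training error looks perfect while the held-out
error does not, which is the situation the paragraph above warns about.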

Methods used for Cross-Validation
There are some common methods that are used for cross-validation. These methods are given below:

o Leave-P-out cross-validation
o Leave-one-out cross-validation
o K-fold cross-validation
o Stratified k-fold cross-validation
o Holdout method

Leave-P-out cross-validation

This approach leaves p data points out of the training data: if there are n data points in the original
sample, then n-p samples are used to train the model and p points are used as the validation set. This is
repeated for all combinations in which the original sample can be split this way, and the error is then
averaged over all trials to give the overall effectiveness.

This method is exhaustive in the sense that it needs to train and validate the model for all possible
combinations, and for moderately large p, it can become computationally infeasible.
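As a rough sketch of why the cost grows so quickly, the splits can be enumerated with the standard
library. The function name leave_p_out_splits and the example values of n and p are assumptions chosen
for illustration.

```python
from itertools import combinations
from math import comb

def leave_p_out_splits(n, p):
    """Yield (train_indices, validation_indices) for every way of holding out p of n points."""
    all_indices = range(n)
    for held_out in combinations(all_indices, p):
        held = set(held_out)
        train = [i for i in all_indices if i not in held]
        yield train, list(held_out)

splits = list(leave_p_out_splits(n=5, p=2))
n_splits = len(splits)  # one train/validate round per combination, i.e. comb(5, 2)

# Number of train/validate rounds needed for a few (n, p) pairs.
rounds_needed = {(n, p): comb(n, p) for n, p in [(5, 2), (100, 2), (100, 5)]}
for (n, p), rounds in rounds_needed.items():
    print(f"n={n}, p={p}: {rounds} rounds")
```

The number of rounds is the binomial coefficient C(n, p), so with n = 100 and p = 5 the model would
already have to be trained more than 75 million times.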

A particular case of this method is p = 1, known as leave-one-out cross-validation. This method is
generally preferred over the previous one because it avoids the intensive computation: the number of
possible combinations equals the number of data points in the original sample, n.
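A minimal leave-one-out sketch follows. The "model" is simply the mean of the training values, and
both that choice and the four data values are assumptions made to keep the example short.

```python
def leave_one_out_mse(values):
    """Average squared error of a mean predictor under leave-one-out cross-validation."""
    errors = []
    for i, held_out in enumerate(values):
        train = values[:i] + values[i + 1:]   # all points except the i-th
        prediction = sum(train) / len(train)  # "model" = mean of the training values
        errors.append((held_out - prediction) ** 2)
    return sum(errors) / len(errors), len(errors)

loo_mse, n_rounds = leave_one_out_mse([2.0, 4.0, 6.0, 8.0])
print(f"{n_rounds} rounds, leave-one-out MSE = {loo_mse:.3f}")
```

Each data point is held out exactly once, so the number of rounds equals the number of data points.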

Cross-validation is a very useful technique for assessing the effectiveness of your model, particularly in
cases where you need to mitigate overfitting. It is also useful for determining the hyperparameters of
your model, in the sense of finding which parameter values result in the lowest test error. These are the
basics you need to get started with cross-validation. You can try all kinds of validation techniques using
Scikit-Learn, which gets you up and running with just a few lines of Python code.
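A short Scikit-Learn sketch of both uses is given below: scoring one model with cross-validation, and
choosing a hyperparameter by cross-validation. The Iris data set, the k-nearest-neighbours classifier
and the candidate values for n_neighbors are assumptions picked for the example, not choices made in
the notes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Stratified 5-fold cross-validation of a single model.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=cv)
mean_accuracy = scores.mean()

# Use the same folds to choose a hyperparameter (the number of neighbours).
param_grid = {"n_neighbors": [1, 3, 5, 7, 9]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=cv)
search.fit(X, y)
best_k = search.best_params_["n_neighbors"]

print(f"mean CV accuracy: {mean_accuracy:.3f}, best n_neighbors: {best_k}")
```

The score found during the search is itself tuned to these folds, so it is usually worth keeping a
separate test set for the final estimate.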


Written for

Institution: Jntuh/hyderabad
Course: CSE

Document information

Uploaded on: January 29, 2024
Number of pages: 24
Written in: 2023/2024
Type: Class notes
Professor(s): B saritha
Contains: All classes
