Class notes

SUPERVISED LEARNING

Pages
33
Uploaded on
26-04-2025
Written in
2024/2025

These notes focus exclusively on Supervised Learning techniques in Machine Learning. Topics include Bayesian Linear Regression, Gradient Descent optimization, Linear Classification Models like Perceptron Algorithm, Support Vector Machine (SVM), Decision Trees, Random Forests, and Instance-Based Learning with K-Nearest Neighbors (KNN). Additionally, the notes explain Probabilistic Discriminative Models such as Logistic Regression, Probabilistic Generative Models like Naive Bayes, and Maximum Margin Classifiers. Each topic is explained clearly with formulas, theory, and examples, making it ideal for students preparing for exams, interviews, and project work.

AM3403 Machine Learning: Concepts and Application




LECTURE NOTES


UNIT-2

SUPERVISED LEARNING

Syllabus
SUPERVISED LEARNING

Bayesian linear regression, gradient descent, Linear Classification Models: Discriminant
function, Perceptron algorithm, Support vector machine, Decision Tree, Random Forests,
Instance Based Learning: KNN. Probabilistic discriminative model: Logistic regression,
Probabilistic generative model: Naive Bayes, Maximum margin classifier

2.1 What is Supervised Machine Learning?

Supervised machine learning learns patterns and relationships between input and output data. It
is defined by its use of labeled data: a labeled dataset contains many examples, each pairing
input features with a known target. Specifically, a supervised learning algorithm takes a known
set of input data and known responses to that data (the output) and trains a model to generate
reasonable predictions for the response to new data. This process is referred to as training or fitting.
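
The fit-then-predict pattern described above can be sketched in a few lines of Python. This is only an illustration: the toy data (hours studied versus pass or fail) and the function names `fit_threshold` and `predict` are assumptions made for this example and are not taken from the notes.

```python
# Labeled data: each example pairs a feature value with a known target.
# Hypothetical toy data: hours studied -> passed (1) or failed (0).
features = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
targets = [0, 0, 0, 1, 1, 1]


def fit_threshold(xs, ys):
    """Training: pick the threshold that misclassifies the fewest examples."""
    xs_sorted = sorted(xs)
    # Candidate thresholds are midpoints between consecutive feature values.
    candidates = [(a + b) / 2 for a, b in zip(xs_sorted, xs_sorted[1:])]

    def errors(t):
        # Count examples where the rule "x > t means class 1" is wrong.
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    return min(candidates, key=errors)


def predict(threshold, x):
    """Prediction: label a new, unseen input with the learned rule."""
    return 1 if x > threshold else 0


threshold = fit_threshold(features, targets)
print(threshold)                # 4.5
print(predict(threshold, 5.5))  # 1
print(predict(threshold, 2.5))  # 0
```

The learned threshold is the "model" here: it is estimated only from the labeled examples and then applied to inputs that were not in the training set.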




There are two types of supervised learning algorithms:

• Classification
• Regression
Classification Algorithms
Classification algorithms are used for predicting discrete outcomes. If the outcome can take only two
possible values, such as True or False, Default or No Default, or Yes or No, the task is known as Binary
Classification. When the outcome can take more than two possible values, it is known as Multiclass
Classification. Many machine learning algorithms can be used for classification tasks, including:

• Logistic regression
• Support vector machines (SVM)
• Neural networks
• Naïve Bayes classifier
• Decision trees
• Discriminant analysis
• Nearest neighbors (kNN)
• Ensemble Classification
• Generalized Additive Model (GAM)
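
To make the difference between binary and multiclass classification concrete, here is a minimal sketch that uses one of the listed algorithms, nearest neighbors. The data points, the labels, and the helper name `knn_predict` are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Binary classification: two possible labels.
X_binary = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y_binary = ["No", "No", "No", "Yes", "Yes", "Yes"]
print(knn_predict(X_binary, y_binary, (2, 2)))  # No
print(knn_predict(X_binary, y_binary, (7, 9)))  # Yes

# Multiclass classification: same algorithm, more than two labels.
X_multi = X_binary + [(1, 8), (2, 9), (1, 9)]
y_multi = ["low", "low", "low", "high", "high", "high", "mid", "mid", "mid"]
print(knn_predict(X_multi, y_multi, (2, 8)))    # mid
```

The same function handles both cases because the vote is over whatever labels appear in the training data; only the set of possible outcomes changes.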

Regression Algorithms
Regression is a type of supervised machine learning where algorithms learn from the data to
predict continuous values such as sales, salary, weight, or temperature. For example, given a
dataset containing features of houses (lot size, number of bedrooms, number of baths,
neighborhood, etc.) together with the price of each house, a regression algorithm can be trained
to learn the relationship between the features and the price.

Common regression algorithms include:

• Linear regression
• Nonlinear regression
• Generalized linear models
• Decision trees
• Neural networks
• Gaussian Process Regression
• Support Vector Machine Regression
• Ensemble Regression
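
As a rough illustration of the house-price example, the sketch below fits the simplest of the listed algorithms, linear regression with a single feature, using the closed-form least-squares solution. The lot sizes and prices are made-up numbers, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: lot size (square feet) -> sale price.
lot_sizes = [1000, 1500, 2000, 2500]
prices = [200_000, 275_000, 350_000, 425_000]

slope, intercept = fit_line(lot_sizes, prices)
predicted = slope * 1800 + intercept  # price estimate for an unseen house
print(slope, intercept, predicted)    # 150.0 50000.0 320000.0
```

Unlike a classifier, the output is a continuous number rather than a label. Real housing data would have several features and would not lie exactly on a line.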

Steps in Supervised Learning

Supervised learning involves training a model to learn patterns from labeled data and making
predictions on new inputs. While different algorithms have unique implementations, the overall
process follows a structured workflow:

1. Data Preparation

The first step in supervised learning is organizing the input data:

• The dataset consists of an input feature matrix X (where each row represents an observation and
each column represents a feature) and an output response vector Y.
• Missing values in X or Y should be appropriately handled, either by ignoring incomplete rows or
imputing missing data.
• The response variable Y varies based on the task:
   o Regression: Y is a numeric vector.
   o Classification: Y can be categorical, binary, or multi-class labels.
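
The two ways of handling missing values mentioned above can be sketched as follows. This is a simplified illustration: `None` stands in for a missing entry, the numbers are invented, and `drop_incomplete` and `impute_column_means` are hypothetical helper names. Mean imputation is only one of several possible imputation strategies.

```python
# Hypothetical feature matrix X (rows = observations, columns = features)
# and response vector Y; None marks a missing value.
X = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [None, 40.0]]
Y = [5.0, 6.0, None, 8.0]


def drop_incomplete(X, Y):
    """Option 1: ignore every row that has a missing feature or response."""
    kept = [(row, y) for row, y in zip(X, Y)
            if y is not None and None not in row]
    return [row for row, _ in kept], [y for _, y in kept]


def impute_column_means(X):
    """Option 2: replace each missing feature with its column mean."""
    means = []
    for j in range(len(X[0])):
        observed = [row[j] for row in X if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if value is None else value
             for j, value in enumerate(row)] for row in X]


X_clean, Y_clean = drop_incomplete(X, Y)
print(X_clean, Y_clean)  # [[1.0, 10.0]] [5.0]

X_imputed = impute_column_means(X)
print(X_imputed[3][0])   # 2.0, the mean of 1.0, 2.0 and 3.0
```

Dropping rows is simple but discards three of the four observations here, which shows why imputation is often preferred when data is scarce.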

2. Choosing an Algorithm

The selection of a suitable learning algorithm depends on multiple factors, including:

• Training speed: Some models train faster than others, depending on complexity and dataset size.
• Memory usage: Resource-efficient algorithms are preferable for large datasets.
• Predictive accuracy: The model should generalize well to unseen data.
• Interpretability: Some models (e.g., decision trees) provide clear insights, while others (e.g.,
deep learning) act as black boxes.

3. Model Training (Fitting)

The training process involves applying the chosen algorithm to fit the model using the given
dataset. Common types of models include:

• Decision Trees
• Linear and Logistic Regression
• Support Vector Machines (SVM)
• Neural Networks
• k-Nearest Neighbors (k-NN)
• Naïve Bayes Classifier
• Ensemble Methods (e.g., Random Forest, Boosting)

Each algorithm has its own method for fitting a model to the training data.
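
As one example of an algorithm-specific fitting method, the sketch below trains a perceptron (listed in the syllabus) by repeatedly correcting misclassified examples. It assumes labels encoded as +1 and -1 and linearly separable data; the data points and the function names are hypothetical.

```python
def fit_perceptron(X, y, epochs=20, lr=1.0):
    """Fit weights w and bias b with the perceptron update rule.

    Labels in y must be +1 or -1.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            activation = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * activation <= 0:  # misclassified or on the boundary
                # Move the decision boundary toward the correct side.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return w, b


def predict_perceptron(w, b, x):
    """Return +1 or -1 depending on which side of the boundary x lies."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Hypothetical linearly separable training data.
X_train = [(2, 1), (3, 2), (4, 1), (-2, -1), (-3, -2), (-1, -3)]
y_train = [1, 1, 1, -1, -1, -1]

w, b = fit_perceptron(X_train, y_train)
print([predict_perceptron(w, b, x) for x in X_train])
```

Other models fit differently: k-NN simply stores the training data, while linear regression can be solved in closed form. The common part is that fitting turns labeled data into something that can produce predictions.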

4. Model Validation

To assess model performance, different validation techniques can be used:

• Resubstitution Error: Evaluating the model on the same data used to train it. This
estimate tends to be optimistic.
• Cross-Validation: Repeatedly splitting the data into training and validation sets to
estimate performance on new data.
• Out-of-Bag Error: Specific to ensemble methods such as bagging; each model is evaluated
on the data points left out of the bootstrap sample it was trained on.
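
The difference between resubstitution error and cross-validation error can be seen with a small sketch. It uses a 1-nearest-neighbor classifier on invented, slightly noisy one-dimensional data; the function names are hypothetical, and the folds are formed by simple striding rather than random shuffling to keep the example deterministic.

```python
def nearest_neighbor_predict(train, x):
    """1-NN: return the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def error_rate(train, test):
    """Fraction of test points that the 1-NN model mislabels."""
    wrong = sum(nearest_neighbor_predict(train, x) != y for x, y in test)
    return wrong / len(test)


def cross_validation_error(data, k):
    """Average held-out error over k folds."""
    folds = [data[i::k] for i in range(k)]
    errors = []
    for i, held_out in enumerate(folds):
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        errors.append(error_rate(training, held_out))
    return sum(errors) / k


# Hypothetical labeled data as (feature, label) pairs, with some label noise.
data = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0),
        (6, 1), (7, 1), (8, 0), (9, 1), (10, 1)]

resubstitution = error_rate(data, data)
cross_validated = cross_validation_error(data, k=5)
print(resubstitution)   # 0.0, each point is its own nearest neighbor
print(cross_validated)  # greater than 0.0 on this data
```

A 1-NN model always scores perfectly on its own training data, so the resubstitution error says little about new data. The cross-validation estimate is higher because every prediction is made for a point the model did not see during fitting.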

5. Model Evaluation and Optimization

Once validated, the model can be fine-tuned for better accuracy, efficiency, or robustness. This
can involve:

• Adjusting hyperparameters (e.g., learning rate, tree depth).
• Pruning or regularizing the model to reduce complexity.
• Trying alternative algorithms for comparison.

For models that support it, compacting the model by removing stored training data or
unneeded parameters can reduce memory use and speed up prediction.
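
Hyperparameter adjustment can be sketched as a simple search: try several candidate values, score each on held-out validation data, and keep the best. The example below tunes k for a k-nearest-neighbors classifier. The data, the candidate values, and the function names are assumptions made for illustration.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points labeled correctly for a given k."""
    correct = sum(knn_predict(train, x, k) == y for x, y in validation)
    return correct / len(validation)


# Hypothetical training and validation sets as (feature, label) pairs.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0),
         (6, 1), (7, 1), (8, 0), (9, 1), (10, 1)]
validation = [(2.5, 0), (3.5, 0), (4.5, 0), (6.5, 1), (7.5, 1), (8.5, 1)]

# Score each candidate value of the hyperparameter k on the validation set.
candidates = (1, 3, 5)
scores = {k: accuracy(train, validation, k) for k in candidates}
best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

The validation set is kept separate from the training set so that the chosen k reflects performance on unseen data. In practice the search is often combined with cross-validation rather than a single split.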

6. Making Predictions

After training and validating the model, it is used to make predictions on new data:

• For classification tasks, the model assigns labels to new observations.

Professor(s)
Dr.gowri
gvarshini CHENNAI INSTITUTE OF TECHNOLOGY