MACHINE LEARNING LECTURE NOTES
UNIT-II
Supervised Learning – I (Regression/Classification)
Regression models: Simple Linear Regression, Multiple Linear Regression, Cost Function,
Gradient Descent. Performance Metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE),
R-Squared, Adjusted R-Squared.
Classification models: Decision Trees (ID3, CART), Naive Bayes, K-Nearest Neighbours (KNN),
Logistic Regression, Multinomial Logistic Regression, Support Vector Machines (SVM):
Nonlinearity and Kernel Methods.
What is Regression?
Regression is a supervised learning technique that models the relationship between a dependent
(target) variable and one or more independent (predictor) variables.
In regression, we fit a line or curve to the given data points. The fitted model can then be
used to predict the target value for new data.
Regression finds the line or curve on a target-predictor graph for which the vertical distance
between the data points and the regression line is minimum. The line generally does not pass
through every data point; it passes as close to the points as possible.
Types of Regression models
Linear Regression
Polynomial Regression
Logistic Regression
Linear Regression-
Linear regression is a supervised machine learning algorithm.
It tries to find the linear relationship that best describes the given data.
It models a linear relationship between the independent variable (X-axis) and the dependent
variable (Y-axis), hence the name linear regression.
The dependent variable of a linear regression model takes continuous values, i.e. real numbers.
If there is a single input variable (x), the model is called simple linear regression.
If there is more than one input variable, the model is called multiple linear regression.
Representing Linear Regression Model-
A linear regression model represents the linear relationship between a dependent variable and
one or more independent variables via a sloped straight line.
BY
B SARITHA 1
The sloped straight line that best fits the given data is called the regression line.
It is also called the best-fit line.
Types of Linear Regression-
Based on the number of independent variables, there are two types of linear regression-
1. Simple Linear Regression-
In simple linear regression, the dependent variable depends only on a single independent variable.
For simple linear regression, the form of the model is-
Y = β0 + β1X
Here,
Y is the dependent variable.
X is the independent variable.
β0 and β1 are the regression coefficients: β0 is the intercept (the value of Y when X = 0) and
β1 is the slope of the line.
The following three cases are possible-
Case-01: β1 < 0
It indicates that variable X has a negative impact on Y.
If X increases, Y will decrease, and vice versa.
Case-02: β1 = 0
It indicates that variable X has no impact on Y.
If X changes, there will be no change in Y.
Case-03: β1 > 0
It indicates that variable X has a positive impact on Y.
If X increases, Y will increase, and vice versa.
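The three cases can be checked numerically. The following is a minimal sketch, assuming the coefficients are estimated by ordinary least squares (the notes have not yet named a fitting method); the function names and the small data sets are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (beta0, beta1) that minimise the sum of squared vertical
    distances between the points and the line Y = beta0 + beta1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and Y divided by the variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


def describe_slope(beta1, tol=1e-9):
    """Map the sign of beta1 to the three cases described above."""
    if beta1 < -tol:
        return "negative impact"
    if beta1 > tol:
        return "positive impact"
    return "no impact"


# Hypothetical data sets, one per case.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
falling = [10.0, 8.0, 6.0, 4.0, 2.0]   # generated from Y = 12 - 2X
flat = [3.0, 3.0, 3.0, 3.0, 3.0]       # Y does not depend on X
rising = [3.0, 5.0, 7.0, 9.0, 11.0]    # generated from Y = 1 + 2X

for name, ys in [("falling", falling), ("flat", flat), ("rising", rising)]:
    b0, b1 = fit_simple_linear_regression(xs, ys)
    print(name, "beta0 =", b0, "beta1 =", b1, "->", describe_slope(b1))
```

The sign of the fitted β1 decides which case applies; its magnitude gives the change in Y for a one-unit change in X.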
2. Multiple Linear Regression-
In multiple linear regression, the dependent variable depends on more than one independent
variable.
For multiple linear regression, the form of the model is-
Y = β0 + β1X1 + β2X2 + β3X3 + … + βnXn
Here,
Y is the dependent variable.
X1, X2, …, Xn are the independent variables.
β0, β1, …, βn are the regression coefficients; β0 is the intercept.
βj (1 ≤ j ≤ n) is the slope or weight of Xj: the change in Y for a one-unit increase in Xj when
the other variables are held fixed.
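To make the formula concrete, here is a small sketch that evaluates Y = β0 + β1X1 + … + βnXn for one observation. The function name `predict_mlr` and the coefficient values are hypothetical, chosen only to illustrate the arithmetic.

```python
def predict_mlr(beta0, betas, x):
    """Evaluate Y = beta0 + beta1*x1 + ... + betan*xn for one observation."""
    if len(betas) != len(x):
        raise ValueError("need exactly one coefficient per independent variable")
    return beta0 + sum(b * xj for b, xj in zip(betas, x))


# Hypothetical coefficients for a model with three independent variables.
beta0 = 5.0
betas = [2.0, -1.0, 0.5]

y_a = predict_mlr(beta0, betas, [1.0, 3.0, 4.0])   # 5 + 2 - 3 + 2
y_b = predict_mlr(beta0, betas, [2.0, 3.0, 4.0])   # same point, X1 raised by one unit

print(y_a)         # 6.0
print(y_b - y_a)   # 2.0, which equals beta1
```

Raising X1 by one unit while X2 and X3 stay fixed changes Y by exactly β1, which is what the weight βj expresses.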
Example: we have a dataset of 50 start-up companies. The dataset contains five fields: R&D
Spend, Administration Spend, Marketing Spend, State, and Profit for a financial year. Our
goal is to build a model that can determine which company is likely to have the maximum profit,
and which factor affects the profit of a company the most.
Since Profit is the value we need to find, it is the dependent variable, and the other four
variables are independent variables.
Below are the main steps of building the MLR model:
1. Data pre-processing: importing libraries, importing the dataset, extracting the dependent and
independent variables, and encoding dummy variables for the categorical column (State).
2. Fitting the MLR model to the training set.
3. Predicting the result of the test set.
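The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the procedure used with the actual 50 start-up dataset: the rows below are synthetic, the profit values are generated from a made-up linear rule so that the fit can be checked, and the model is fitted with the normal equations instead of a library routine. All function and variable names are hypothetical.

```python
def one_hot_drop_first(values):
    """Encode a categorical column as dummy variables. The first category
    (in sorted order) is dropped so the dummies are not collinear with the
    intercept."""
    kept = sorted(set(values))[1:]
    return [[1.0 if v == c else 0.0 for c in kept] for v in values], kept


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(X, y):
    """Least-squares coefficients [beta0, beta1, ...] from (X^T X) b = X^T y."""
    rows = [[1.0] + r for r in X]          # leading 1.0 is the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, X):
    return [coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], r)) for r in X]


def made_up_profit(rd, admin, marketing, state):
    """Synthetic rule used only to generate example targets (an assumption)."""
    effect = {"California": 0.0, "Florida": 0.2, "New York": 0.1}
    return 0.5 + 0.8 * rd - 0.1 * admin + 0.05 * marketing + effect[state]


# Synthetic rows: (R&D Spend, Administration Spend, Marketing Spend, State).
data = [
    (1.6, 1.4, 4.7, "New York"), (1.5, 1.0, 4.1, "California"),
    (1.4, 1.5, 3.8, "Florida"), (1.2, 0.9, 3.7, "New York"),
    (0.9, 1.3, 2.5, "California"), (0.7, 1.1, 3.0, "Florida"),
    (1.3, 1.2, 2.0, "New York"), (0.5, 1.6, 1.0, "California"),
    (1.0, 0.8, 3.3, "Florida"), (0.3, 1.0, 0.5, "New York"),
    (1.1, 1.7, 2.2, "California"), (0.8, 1.2, 1.5, "Florida"),
]

# Step 1: extract the variables and encode the State column as dummies.
dummies, dummy_names = one_hot_drop_first([row[3] for row in data])
X = [list(row[:3]) + d for row, d in zip(data, dummies)]
y = [made_up_profit(*row) for row in data]
X_train, y_train = X[:10], y[:10]
X_test, y_test = X[10:], y[10:]

# Step 2: fit the model to the training set.
coeffs = fit(X_train, y_train)

# Step 3: predict the result of the test set.
predictions = predict(coeffs, X_test)
print("dummy columns:", dummy_names)
print("coefficients:", [round(c, 4) for c in coeffs])
print("predicted:", [round(p, 4) for p in predictions])
print("actual:   ", [round(t, 4) for t in y_test])
```

Because the synthetic targets contain no noise, the fitted coefficients recover the made-up rule and the test predictions match the actual values. With real data the predictions would differ from the actual profits, and that difference is what a cost function measures.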
Cost Function in Machine Learning
A machine learning model must predict accurately in order to perform well in real-world
applications.
But how do we measure this, i.e., how well or how poorly will our model perform in the real
world?
This is where the cost function comes in. It measures the error between the values predicted by
the model and the actual values, and so tells us how well the model fits the data.
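As an illustration, the sketch below uses Mean Squared Error (MSE), one of the metrics listed for this unit, as the cost function of a simple linear regression model. Choosing MSE here is an assumption made for the example; the function name and data are hypothetical.

```python
def mse_cost(beta0, beta1, xs, ys):
    """Mean squared error of the line Y = beta0 + beta1 * X on the data."""
    n = len(xs)
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / n


# Hypothetical data generated from Y = 1 + 2X.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

good = mse_cost(1.0, 2.0, xs, ys)   # the line that generated the data
poor = mse_cost(0.0, 1.0, xs, ys)   # a badly chosen line

print(good)   # 0.0
print(poor)   # 13.5
```

A lower cost means the line lies closer to the data points, so comparing costs tells us which set of coefficients fits better.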