GUIDE QUESTIONS WITH VERIFIED
SOLUTIONS NEW MODIFIED BEST QUALITY
EXAM 2026 LATEST
What are the two types of variables in regression?
Response (dependent) variable: the variable we are interested in understanding or
modelling, usually denoted Y. The response is a random variable: it varies with
changes in the predictor(s) and with other random changes.
Predicting (explanatory) variables: the set of variables that we think might be useful
in predicting or modelling the response variable (for example, the price of the product,
a competitor's price, etc.).
Predicting variables are treated as fixed: they do not change with the response, but
are set before the response is measured.
What are the three objectives in regression analysis?
1. Prediction of response variable
2. Modelling the relationship between the response variable and the explanatory
variables
3. Testing hypotheses of association relationships
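What are the 4 assumptions of linear regression?
o Linearity / mean zero assumption: the error terms have mean zero
o Constant variance assumption: the error terms all have the same variance
o Independence assumption: the error terms are independent random variables
o Normality assumption (added later, for inference): the error terms are normally distributed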
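The four assumptions above can be written compactly for the simple linear regression model. This is a sketch in standard notation; the symbols \varepsilon_i (error term) and \sigma^2 (error variance) are notation assumed here, not taken from the guide.

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n \\
\mathbb{E}[\varepsilon_i] &= 0 && \text{(linearity / mean zero)} \\
\operatorname{Var}(\varepsilon_i) &= \sigma^2 && \text{(constant variance)} \\
\varepsilon_1, \dots, \varepsilon_n &\ \text{are independent} && \text{(independence)} \\
\varepsilon_i &\sim N(0, \sigma^2) && \text{(normality, assumed later)}
\end{align*}
```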
What are the unknown parameters in linear regression?
· The unknown parameters are the intercept, the slope, and the variance of the error terms
o Unknown regardless of how much data is observed
o Estimated given the model assumptions
o Estimated based on data
What are the steps to evaluate prediction accuracy?
Step 1: Divide data into testing and training data.
Random subsampling: allocate a specific percentage of data randomly to each
(e.g., 75% training and 25% testing)
K-fold cross-validation: divide data into K folds of approximately equal size,
then allocate (K-1) folds for training and 1 fold for testing.
Random subsampling is computationally more expensive than K-fold CV but
less sensitive to the particular random assignment of folds. The larger K is, the
lower the bias of the estimated prediction error, but the higher its variance.
Step 2: Fit regression model to training data.
Step 3: Evaluate prediction accuracy based on testing data using an accuracy measure.
Step 4: Apply Steps 1-3 multiple times then average the prediction accuracy
measure over all repetitions.
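The four steps above can be sketched with K-fold cross-validation in plain NumPy. This is a minimal illustration, not a definitive implementation: the helper names `fit_slr` and `kfold_mspe` are hypothetical, the data are simulated, and mean squared prediction error is assumed as the accuracy measure, since the guide does not name one.

```python
import numpy as np

def fit_slr(x, y):
    """Least-squares (intercept, slope) for simple linear regression."""
    x_bar, y_bar = x.mean(), y.mean()
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    return intercept, slope

def kfold_mspe(x, y, k=5, seed=0):
    """Average mean squared prediction error over K folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))        # Step 1: shuffle, then split
    folds = np.array_split(indices, k)       # K folds of roughly equal size
    errors = []
    for i in range(k):
        test_idx = folds[i]                  # 1 fold for testing
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        b0, b1 = fit_slr(x[train_idx], y[train_idx])        # Step 2: fit
        pred = b0 + b1 * x[test_idx]
        errors.append(np.mean((y[test_idx] - pred) ** 2))   # Step 3: evaluate
    return float(np.mean(errors))            # Step 4: average over repetitions

# Simulated data (hypothetical): y = 2 + 3x + noise with variance 1
rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(0, 1, size=100)
print(kfold_mspe(x, y, k=5))
```

Because the simulated noise has variance 1, the averaged error should land near 1 if the model fits well; a much larger value would suggest poor predictive accuracy.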
What is regression?
A non-deterministic modeling technique used to analyze and estimate the
values of a response variable from other variables that it is correlated with.
With a regression model, we try to explain and predict the total variability
of y (response/dependent variable) using x (predictor/independent/explanatory
variable).
Linear models are simple to understand and tend to work well even if they don't
fully represent reality.
How do we define the "best fit" linear regression line?
The line that minimizes the sum of squared errors (SSE), i.e., the sum of squared
vertical distances between the observed responses and the line.
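As a small illustration of the least-squares criterion, the sketch below computes the closed-form slope and intercept for a toy data set and checks that nudging either estimate increases the sum of squared errors. The data values and the names `sse`, `b0_hat`, and `b1_hat` are made up for this example.

```python
import numpy as np

# Toy data (hypothetical values chosen for illustration)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

def sse(b0, b1):
    """Sum of squared errors for the line y = b0 + b1 * x."""
    return float(np.sum((y - (b0 + b1 * x)) ** 2))

# Closed-form least-squares estimates for simple linear regression
x_bar, y_bar = x.mean(), y.mean()
b1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
b0_hat = y_bar - b1_hat * x_bar

print(b0_hat, b1_hat, sse(b0_hat, b1_hat))

# Any other line has a larger SSE than the fitted one
print(sse(b0_hat + 0.1, b1_hat) > sse(b0_hat, b1_hat))
print(sse(b0_hat, b1_hat + 0.1) > sse(b0_hat, b1_hat))
```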
What is the sampling distribution of the variance estimator for SLR?
· The estimator of the variance of the error terms is σ̂² = SSE/(n-2)
· Under the normality assumption, (n-2)σ̂²/σ² = SSE/σ² follows a chi-squared
distribution with n-2 degrees of freedom
n-2: we lose two degrees of freedom because we are estimating two parameters
(β_0, β_1)
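The variance estimator SSE/(n-2) can be computed directly from the residuals of a fitted line. The sketch below is illustrative only: `estimate_error_variance` is a hypothetical helper, and the simulated data assume a true error variance of 4.

```python
import numpy as np

def estimate_error_variance(x, y):
    """Return SSE / (n - 2) for a simple linear regression of y on x."""
    slope, intercept = np.polyfit(x, y, 1)   # least-squares fit, degree 1
    residuals = y - (intercept + slope * x)
    sse = np.sum(residuals ** 2)
    return float(sse / (len(x) - 2))          # two parameters estimated

# Simulated data (hypothetical): true error standard deviation 2, variance 4
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x + rng.normal(0, 2.0, size=n)

sigma2_hat = estimate_error_variance(x, y)
print(sigma2_hat)
```

Dividing by n-2 rather than n makes the estimator unbiased for the error variance under the model assumptions.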
How do we interpret model parameters computed in linear regression?
β_1 > 0: direct relationship between x and y
β_1 < 0: inverse relationship between x and y
β_1 ≈ 0: no significant association between x and y
β̂_1 is the estimated expected change in the response variable associated with
a one-unit change in the predicting variable