Simple linear regression assumes that a single predictor drives the
change in an outcome. This is only a simplification of a real-life
setting, so its predictive power is limited (James et al., 2021, p.87).
The limitation can be mitigated by using several related predictors,
for example by adding polynomial terms or by fitting a multiple linear
regression with interaction terms (James et al., 2021, p.91).
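The effect of adding such terms can be illustrated with a minimal sketch. It assumes NumPy is available and uses synthetic, noise-free data; the helper name fit_least_squares, the variable names, and the coefficients are hypothetical and are not taken from James et al. (2021). The sketch compares the residual sum of squares (RSS) of a purely additive or straight-line model against a model that includes the interaction or squared term.

```python
import numpy as np


def fit_least_squares(design, y):
    """Return (coefficients, residual sum of squares) of a least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return coef, float(residuals @ residuals)


# Synthetic, noise-free data (an assumption made only for illustration).
rng = np.random.default_rng(0)
x1 = rng.uniform(0.0, 10.0, size=200)
x2 = rng.uniform(0.0, 10.0, size=200)
ones = np.ones_like(x1)

# Case 1: the true relationship contains an interaction between x1 and x2.
y_inter = 2.0 + 1.5 * x1 + 0.5 * x2 + 0.8 * x1 * x2
additive = np.column_stack([ones, x1, x2])
with_interaction = np.column_stack([ones, x1, x2, x1 * x2])
_, rss_additive = fit_least_squares(additive, y_inter)
coef_inter, rss_interaction = fit_least_squares(with_interaction, y_inter)

# Case 2: the true relationship is quadratic in x1.
y_quad = 1.0 + 2.0 * x1 + 0.5 * x1 ** 2
straight_line = np.column_stack([ones, x1])
quadratic = np.column_stack([ones, x1, x1 ** 2])
_, rss_line = fit_least_squares(straight_line, y_quad)
_, rss_quadratic = fit_least_squares(quadratic, y_quad)

print(f"interaction case: RSS additive={rss_additive:.3f}, "
      f"with interaction={rss_interaction:.3e}")
print(f"quadratic case:   RSS line={rss_line:.3f}, "
      f"with squared term={rss_quadratic:.3e}")
```

Both extended models are still linear in their coefficients, which is why ordinary least squares can fit them. On real data the improvement would be smaller than in this noise-free example, and extra terms should be justified rather than added by default.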
According to James et al. (2021), there are several potential problems
which need special attention when using linear regression for analysis
(pp.92-103):
Non-linearity of the response-predictor relationships
o As the name suggests, fitting a linear regression to a
relationship between the predictor and the outcome that is not
linear will create a flawed model. A residual plot can be used to
check whether the relationship is linear.
Correlation of error terms
o When the error terms are correlated, some calculations may be
inaccurate, such as standard errors (SE), confidence intervals,
and prediction intervals.
Non-constant variance of error terms
o Similar to the previous problem, the variance of the error terms
is assumed to be constant when computing SE, confidence
intervals, and hypothesis tests. When the variance is not
constant, those values will be flawed. A residual plot can help
detect this heteroscedasticity.
Outliers
o An outlier, an outcome value (y) that is very unusual given its
predictor value, distorts the residual standard error (RSE).
This in turn distorts the confidence intervals and p-values.
Studentized residuals can be used to detect outliers. However, an
apparent outlier may also indicate a missing predictor, so extra
care must be taken when analyzing this problem.
High-leverage points
o Unlike outliers, high-leverage points have an unusual predictor
(x) value. Because the unusual value lies in the predictor, it
tends to have a greater influence on the least squares line (the
fitted regression line) than an outlier does, which in turn can
distort the analysis. High-leverage points can be detected by
looking for values outside the usual range of the predictors and
by calculating the leverage statistic.
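The last two diagnostics in the list above can be sketched in a few lines for the single-predictor case. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes NumPy, uses made-up data with one planted outlier and one planted high-leverage point, and the function name simple_regression_diagnostics is hypothetical. The leverage is computed as h_i = 1/n + (x_i - mean(x))^2 / sum_j (x_j - mean(x))^2, and each residual is divided by RSE * sqrt(1 - h_i), which is the internally studentized form (an assumption about which variant is meant).

```python
import numpy as np


def simple_regression_diagnostics(x, y):
    """Fit y = b0 + b1 * x by least squares and return
    (residuals, leverage, studentized residuals)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    x_centered = x - x.mean()
    sxx = np.sum(x_centered ** 2)

    slope = np.sum(x_centered * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x.mean()
    residuals = y - (intercept + slope * x)

    # Residual standard error with n - 2 degrees of freedom.
    rse = np.sqrt(np.sum(residuals ** 2) / (n - 2))

    # Leverage statistic for simple linear regression.
    leverage = 1.0 / n + x_centered ** 2 / sxx

    # Internally studentized residuals (assumed variant).
    studentized = residuals / (rse * np.sqrt(1.0 - leverage))
    return residuals, leverage, studentized


# Made-up data: x = 1..20 plus one far-away predictor value (x = 40).
x = np.append(np.arange(1.0, 21.0), 40.0)
wiggle = 0.3 * (-1.0) ** np.arange(x.size)  # small deterministic "noise"
y = 3.0 + 2.0 * x + wiggle
y[9] += 8.0  # planted outlier: unusual y at an ordinary x

residuals, leverage, studentized = simple_regression_diagnostics(x, y)

outlier_index = int(np.argmax(np.abs(studentized)))
high_leverage_index = int(np.argmax(leverage))
print("largest |studentized residual| at index", outlier_index)
print("largest leverage at index", high_leverage_index)
```

Plotting the residuals against the fitted values would additionally help reveal non-linearity (a curved pattern) and non-constant variance (a funnel shape). In this sketch the planted outlier has an ordinary leverage, while the point at x = 40 has a small residual but a leverage far above the average value of 2/n.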