11/10/21, 1:40 PM
ISYE 6501 - Midterm 2
Terms in this set (160)
When might overfitting occur?
When the number of factors is close to or larger than the number of data points, which can cause the model to fit too closely to random effects.
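A minimal sketch of the effect described above, assuming NumPy is available. The data are pure noise (the response has no real relationship to any factor), and the helper name `train_r2` is hypothetical. With 20 data points, a model with 19 factors plus an intercept can fit the noise almost perfectly, while a model with 2 factors cannot.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_r2(n_points, n_factors):
    """Fit least squares to pure noise and return the training R^2."""
    X = rng.normal(size=(n_points, n_factors))
    y = rng.normal(size=n_points)                  # response is unrelated to X
    A = np.column_stack([np.ones(n_points), X])    # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

r2_few = train_r2(n_points=20, n_factors=2)
r2_many = train_r2(n_points=20, n_factors=19)
print(f"2 factors:  training R^2 = {r2_few:.3f}")
print(f"19 factors: training R^2 = {r2_many:.3f}")
```

The high training R^2 in the second case reflects fitting to random effects only, so it says nothing about how the model would perform on new data.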
Why are simple models better than complex ones?
Less data is required, there is less chance of including insignificant factors, and they are easier to interpret.
What is forward selection?
Start with no factors. At each step, select the best new factor; if it is good enough (by R^2, AIC, or p-value), add it to the model and refit with the current set of factors. At the end, remove any factors that do not meet a certain threshold.
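The greedy loop in forward selection could be sketched as follows, assuming NumPy. This version uses the gain in R^2 as the "good enough" test; the names `forward_selection`, `r_squared`, and the cutoff `min_gain=0.01` are hypothetical choices for illustration, and the final cleanup pass is left out for brevity.

```python
import numpy as np

def r_squared(X, y, cols):
    """Training R^2 of a least-squares fit on the given columns plus an intercept."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

def forward_selection(X, y, min_gain=0.01):
    """Greedily add the factor that raises R^2 most; stop when the gain is too small."""
    chosen, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = {c: r_squared(X, y, chosen + [c]) for c in remaining}
        c_best = max(scores, key=scores.get)
        if scores[c_best] - best < min_gain:   # best candidate is not good enough
            break
        chosen.append(c_best)
        remaining.remove(c_best)
        best = scores[c_best]
    return chosen

# Synthetic data: only factors 0 and 3 actually drive the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=200)
selected = forward_selection(X, y)
print(selected)
```

Swapping the R^2 gain for AIC or a p-value test changes only the stopping check inside the loop.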
What is backward elimination?
Start with all factors and find the worst one, judged against a supplied threshold (e.g. p = 0.15). If it is worse than the threshold, remove it and start the process over. Repeat until we have the number of factors that we want; then remove any factors that fail a second threshold (e.g. p = 0.05) and fit the model with the remaining set of factors.
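A rough sketch of backward elimination with the two thresholds mentioned above, assuming NumPy. The p-values here use a normal approximation to the usual t-test (an assumption made to avoid extra dependencies; it is reasonable only when there are many more data points than factors). The names `p_values`, `backward_elimination`, `p_remove`, and `p_final` are hypothetical.

```python
import math
import numpy as np

def p_values(X, y, cols):
    """Approximate two-sided p-value for each factor's coefficient."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    sigma2 = resid @ resid / (n - A.shape[1])          # residual variance
    std_err = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    t = coef / std_err
    # Index 0 is the intercept, so factor i sits at position i + 1.
    return {c: math.erfc(abs(t[i + 1]) / math.sqrt(2)) for i, c in enumerate(cols)}

def backward_elimination(X, y, p_remove=0.15, p_final=0.05):
    """Drop the worst factor while its p-value exceeds p_remove, then apply p_final."""
    cols = list(range(X.shape[1]))
    while cols:
        p = p_values(X, y, cols)
        worst = max(p, key=p.get)
        if p[worst] <= p_remove:       # nothing left that is bad enough to remove
            break
        cols.remove(worst)
    p = p_values(X, y, cols)           # final, stricter pass
    return [c for c in cols if p[c] <= p_final]

# Synthetic data: only factors 0 and 3 actually drive the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.5, size=200)
kept = backward_elimination(X, y)
print(kept)
```

A noise factor can still survive by chance, since a 0.05 threshold lets through roughly one irrelevant factor in twenty.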
What is stepwise regression?
It is a combination of forward selection and backward elimination. We can start with either all factors or no factors, and at each step we add or remove a factor. As we go through the procedure, after adding each new factor and again at the end, we eliminate right away any factors that no longer appear good enough.
What type of algorithms are stepwise selection methods?
Greedy algorithms: at each step they take the one thing that looks best.
What is LASSO?
A variable selection method in which the coefficients are determined by minimizing the squared error, subject to the sum of their absolute values not exceeding a certain threshold t.
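Written out, the LASSO definition above corresponds to the following constrained problem. The notation is an assumption for illustration: n data points, m factors, response y_i, factor values x_{ij}, intercept a_0, and coefficients a_j; only the threshold t comes from the card itself.

```latex
\min_{a_0, a_1, \dots, a_m} \; \sum_{i=1}^{n} \Bigl( y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij} \Bigr)^2
\quad \text{subject to} \quad \sum_{j=1}^{m} |a_j| \le t
```

Because the budget t is shared across all coefficients, spending it on one factor leaves less for the others, which pushes the coefficients of weak factors to zero.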
How do you choose t in LASSO?
Run the LASSO approach with different values of t and see which gives the best trade-off.
Why do we have to scale the data for LASSO?
If we don't, the units in which each factor is measured will artificially affect how big its coefficient needs to be.
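A small sketch of why units matter, assuming NumPy. The factor (a distance recorded in meters versus kilometers) and the helper names `slope` and `standardize` are hypothetical. The same factor gets a coefficient 1000 times larger when expressed in kilometers, so it would use up far more of the LASSO budget t; after scaling, the two versions agree.

```python
import numpy as np

rng = np.random.default_rng(2)

def slope(x, y):
    """Least-squares coefficient of x in a fit of y on x plus an intercept."""
    A = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1]

def standardize(x):
    """Rescale to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()

x_m = rng.normal(loc=50.0, scale=10.0, size=200)   # hypothetical factor in meters
y = 2.0 * x_m + rng.normal(size=200)
x_km = x_m / 1000.0                                # the same factor in kilometers

raw_m, raw_km = slope(x_m, y), slope(x_km, y)
std_m, std_km = slope(standardize(x_m), y), slope(standardize(x_km), y)
print(f"raw coefficients:    meters {raw_m:.3f}, kilometers {raw_km:.1f}")
print(f"scaled coefficients: meters {std_m:.3f}, kilometers {std_km:.3f}")
```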
What is elastic net?
A variable selection method that works by minimizing the squared error while constraining a combination of the absolute values of the coefficients and their squares.
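One common way to write the elastic net constraint is shown below. The notation is an assumption for illustration: n data points, m factors, intercept a_0, coefficients a_j, threshold t, and a mixing weight \lambda between 0 and 1 that balances the absolute-value term against the squared term.

```latex
\min_{a_0, a_1, \dots, a_m} \; \sum_{i=1}^{n} \Bigl( y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij} \Bigr)^2
\quad \text{subject to} \quad
\lambda \sum_{j=1}^{m} |a_j| + (1 - \lambda) \sum_{j=1}^{m} a_j^2 \le t
```

Setting \lambda = 1 leaves only the absolute-value term (LASSO), and \lambda = 0 leaves only the squared term (ridge regression).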
What is a key difference between stepwise regression and lasso regression?
Lasso regression needs the data to be scaled first. If the data is not scaled, the coefficients can have artificially different orders of magnitude, which means they'll have unbalanced effects on the lasso constraint.
Why doesn't Ridge Regression perform variable selection?
Its constraint is on the squared values of the coefficients, which shrinks (regularizes) them toward zero but does not set them exactly to zero, so no factors are removed.
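The contrast can be seen in a one-coefficient sketch. It assumes the penalized form of each problem (a penalty weight `lam` instead of a threshold t) and a single scaled factor, where both solutions have a simple closed form. The function names and the sample coefficients are hypothetical.

```python
def lasso_shrink(b_ols, lam):
    """Minimizer of 0.5*(b - b_ols)**2 + lam*abs(b): soft-thresholding."""
    if abs(b_ols) <= lam:
        return 0.0                     # small coefficients are set exactly to zero
    return b_ols - lam if b_ols > 0 else b_ols + lam

def ridge_shrink(b_ols, lam):
    """Minimizer of 0.5*(b - b_ols)**2 + 0.5*lam*b**2: proportional shrinkage."""
    return b_ols / (1.0 + lam)

ols = [3.0, -1.5, 0.4, -0.2]           # hypothetical unpenalized coefficients
lam = 0.5
lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]
print("lasso:", lasso)
print("ridge:", ridge)
```

The absolute-value penalty zeroes out the two small coefficients, while the squared penalty only scales every coefficient down by the same factor, so all four factors stay in the model.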
What are the pros and cons of Greedy Algorithms (forward selection, backward elimination, stepwise regression)?
They are good for initial analysis but often don't perform as well on other data, because they fit more to random effects than you'd like and so appear to have a better fit than they really do.
What are the pros and cons of LASSO and elastic net?
They are slower to run but help build models that make better predictions.