VERIFIED ANSWERS GUARANTEED PASS
Which two methods does elastic net look like it combines
and what are the downsides from it?
Ridge Regression and LASSO.
Advantages: variable selection from LASSO and
Predictive benefits of LASSO.
Disadvantages: Arbitrarily rules out some correlated
variables like LASSO (don't know which one that is left
out should be); Underestimates coefficients of very
predictive variables like Ridge Regresison
What are some downsides of surveys?
Even if you what appears to be a representative sample in
simple ways, maybe it isn't in more complex ways.
when might overfitting occur
when the # of factors is close to or larger than the # of
data points causing the model to potentially fit too closely
to random effects
Why are simple models better than complex ones
less data is required; less chance of insignificant factors
and easier to interpret
,what is forward selection
we select the best new factor and see if it's good enough
(R^2, AIC, or p-value) add it to our model and fit the
model with the current set of factors. Then at the end we
remove factors that are lower than a certain threshold
what is backward elimination
we start with all factors and find the worst on a supplied
threshold (p = 0.15). If it is worse we remove it and start
the process over. We do that until we have the number of
factors that we want and then we move the factors lower
than a second threshold (p = .05) and fit the model with all
set of factors
what is stepwise regression
it is a combination of forward selection and backward
elimination. We can either start with all factors or no
factors and at each step we remove or add a factor. As
we go through the procedure after adding each new
factor and at the end we eliminate right away factors that
no longer appear.
what type of algorithms are stepwise selection?
Greedy algorithms - at each step they take one thing that
looks best
what is LASSO
, a variable selection method where the coefficients are
determined by both minimizing the squared error and the
sum of their absolute value not being over a certain
threshold t
How do you choose t in LASSO
use the lasso approach with different values of t and see
which gives the best trade off
why do we have to scale the data for LASSO
if we don't the measure of the data will artificially affect
how big the coefficients need to be
What is elastic net?
A variable selection method that works by minimizing the
squared error and constraining the combination of
absolute values of coefficients and their squares
what is a key difference between stepwise regresson and
lasso regression
If the data is not scaled, the coefficients can have
artificially different orders of magnitude, which means
they'll have unbalanced effects on the lasso constraint.
Why doesn't Ridge Regression perform variable
selection?
The coefficients values are squared so they go closer to
zero or regularizes them