QUESTIONS WITH DETAILED SOLUTIONS |
100% VERIFIED ANSWERS AND EXPLANATIONS
| LATEST UPDATED VERSION FOR ACCURATE
PREPARATION
When might overfitting occur?
When the number of factors is close to or larger than the number of data points, which can cause
the model to fit too closely to random effects in the data
Why are simple models better than complex ones?
Less data is required, there is less chance of including insignificant factors, and the model is
easier to interpret
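What is forward selection?
We start with no factors. At each step we find the best new factor and, if it is good enough
(by R^2, AIC, or p-value), we add it to our model and refit the model with the current set of
factors. We repeat until no remaining factor is good enough or we have as many factors as we want.
Then at the end we remove any factors that do not meet a certain threshold and refit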
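The forward-selection loop described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes R^2 gain as the "good enough" test, the names `solve`, `r_squared`, `forward_selection`, and `min_gain` are hypothetical, the data is made up, and the final clean-up pass that removes weak factors is left out.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    XtX = [[sum(X[i][a] * X[i][c] for i in range(n)) for c in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(XtX, Xty)
    pred = [sum(bj * xj for bj, xj in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def forward_selection(factors, y, min_gain=0.01):
    """Greedily add the factor that raises R^2 most; stop when the gain is below min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = list(factors)
    while remaining:
        scores = {name: r_squared([factors[f] for f in chosen + [name]], y)
                  for name in remaining}
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break  # the best remaining factor is not good enough
        chosen.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return chosen, best_r2


# Hypothetical data: y depends on x1 and x2 (plus small noise); x3 is unrelated.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
x3 = [1, -1, 1, -1, -1, 1, -1, 1]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3]
y = [2 * a + 3 * b + e for a, b, e in zip(x1, x2, noise)]

chosen, r2 = forward_selection({"x1": x1, "x2": x2, "x3": x3}, y)
print(chosen, round(r2, 4))  # selects x2 first, then x1; x3 is never added
```

Swapping the R^2 gain for an AIC or p-value check only changes the stopping test inside the loop; the greedy structure stays the same.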
What is backward elimination?
We start with all factors and find the worst one. If it fails a supplied threshold (for example,
a p-value above 0.15), we remove it, refit, and start the process over. We do that until no factor
fails the threshold or we have the number of factors that we want. Then we remove any factors that
fail a second, stricter threshold (for example, a p-value above 0.05) and fit the model with the
remaining set of factors
What is stepwise regression?
It is a combination of forward selection and backward elimination. We can start with either all
factors or no factors, and at each step we add or remove a factor. After adding each new factor,
and again at the end, we immediately eliminate any factors that no longer appear good enough.
What type of algorithms are the stepwise selection methods?
Greedy algorithms: at each step they take the one thing that looks best at that moment
What is LASSO?
A variable selection method in which the coefficients are chosen to minimize the squared error,
subject to the sum of their absolute values not exceeding a threshold t
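Written out as an optimization problem, the description above takes roughly the following form. Only t comes from the text; the symbols n (data points), m (factors), a_0 (intercept), a_j (coefficients), x_{ij} and y_i are notation assumed here for illustration.

```latex
\min_{a_0, a_1, \dots, a_m} \; \sum_{i=1}^{n} \Bigl( y_i - \bigl( a_0 + \sum_{j=1}^{m} a_j x_{ij} \bigr) \Bigr)^2
\quad \text{subject to} \quad \sum_{j=1}^{m} \lvert a_j \rvert \le t
```

A smaller t leaves less "budget" to spend on coefficients, so more of them end up at exactly zero.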
How do you choose t in LASSO?
Fit the LASSO model with different values of t and see which gives the best trade-off between
model quality and the number of factors kept
Why do we have to scale the data for LASSO?
If we don't, the units in which each factor is measured will artificially affect how big its
coefficient needs to be
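A small sketch of what scaling does, assuming z-score standardization (subtract the mean, divide by the standard deviation) as the scaling method. The `standardize` helper and the height data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


# The same hypothetical measurements in two different units.
heights_m = [1.50, 1.60, 1.70, 1.80]
heights_cm = [150, 160, 170, 180]

z_m = standardize(heights_m)
z_cm = standardize(heights_cm)

# Unscaled, a coefficient on meters must be 100x the coefficient on
# centimeters to have the same effect, so it uses up 100x more of the
# LASSO budget t. After scaling, the two columns are the same.
print([round(v, 4) for v in z_m])
print([round(v, 4) for v in z_cm])
```

After standardizing, the choice of units no longer influences which coefficients the constraint squeezes hardest.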
What is elastic net?
A variable selection method that works by minimizing the squared error and constraining the
combination of absolute values of coefficients and their squares
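One common way to write that combined constraint is shown below. The mixing weight \lambda (between 0 and 1) and the symbols n, m, a_0, a_j, x_{ij}, y_i, and t are notation assumed here, not taken from the text above.

```latex
\min_{a_0, a_1, \dots, a_m} \; \sum_{i=1}^{n} \Bigl( y_i - \bigl( a_0 + \sum_{j=1}^{m} a_j x_{ij} \bigr) \Bigr)^2
\quad \text{subject to} \quad
\lambda \sum_{j=1}^{m} \lvert a_j \rvert + (1 - \lambda) \sum_{j=1}^{m} a_j^2 \le t
```

Setting \lambda = 1 leaves only the absolute-value term, and \lambda = 0 leaves only the squared term.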
What is a key difference between stepwise regression and LASSO regression?
LASSO requires the data to be scaled first. If the data is not scaled, the coefficients can have
artificially different orders of magnitude, which means they'll have unbalanced effects on the
LASSO constraint.
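Why doesn't Ridge Regression perform variable selection?
Its constraint is on the squares of the coefficients, so it shrinks (regularizes) them toward zero
but generally does not drive any of them exactly to zero, and so no factors are removed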
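The contrast with LASSO can be seen in a simplified special case. The sketch below assumes uncorrelated, standardized factors and the penalized (rather than constrained) form of each method, where the solutions have simple closed forms up to how the penalty is scaled. The function names, the penalty weight `lam`, and the starting coefficients are hypothetical.

```python
def lasso_shrink(b, lam):
    """Soft-thresholding: move b toward zero by lam, stopping at exactly zero."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


def ridge_shrink(b, lam):
    """Proportional shrinkage: scale b toward zero without ever reaching it."""
    return b / (1.0 + lam)


# Hypothetical unpenalized coefficients: one strong factor, two weak ones.
unpenalized = [3.0, 0.4, -0.2]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in unpenalized]
ridge = [ridge_shrink(b, lam) for b in unpenalized]

print(lasso)  # [2.5, 0.0, 0.0]: the weak factors are dropped
print([round(b, 4) for b in ridge])  # all three shrink, none reaches zero
```

In this special case the absolute-value penalty removes the two weak factors outright, while the squared penalty keeps every factor and only makes its coefficient smaller.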
What are the pros and cons of greedy algorithms (forward selection, backward elimination,
stepwise regression)?
Good for initial analysis, but they often don't perform as well on other data: they fit more to
random effects than you'd like, so they appear to have a better fit than they really do
What are the pros and cons of LASSO and elastic net?
They are slower to compute but help build models that make better predictions
Which two methods does elastic net look like it combines, and what are its advantages and
downsides?
Ridge Regression and LASSO.
Advantages: variable selection from LASSO and predictive benefits of Ridge Regression.
Disadvantages: arbitrarily rules out some correlated variables, like LASSO (we don't know which
of them should be the one left out); underestimates coefficients of very predictive variables,
like Ridge Regression
What are some downsides of surveys?
Even if you have what appears to be a representative sample in simple ways, it may not be
representative in more complex ways.
If we're testing to see whether red cars sell for higher prices than blue cars, we need to
account for the type and age of the cars in our data set. This is called:
Controlling
what is a blocking factor
a source of variability that is not of primary interest to the experimenter
What is an example of a blocking factor?
The type of car (sports car or family car) is a blocking factor: it could account for some of the
difference between red cars and blue cars, because sports cars are more likely to be red. If we
account for that difference, we can reduce the variability in our estimates
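A toy illustration of the car example, using made-up prices. The data, the `color_gap` helper, and the choice to average the per-block gaps with equal weight are all assumptions for the sketch.

```python
from statistics import mean

# Hypothetical sales: (car type, color, price in $1000s).
# Sports cars are mostly red and family cars are mostly blue.
sales = [
    ("sports", "red", 40), ("sports", "red", 42), ("sports", "red", 44),
    ("sports", "blue", 39),
    ("family", "red", 21),
    ("family", "blue", 20), ("family", "blue", 22), ("family", "blue", 18),
]


def color_gap(rows):
    """Mean red price minus mean blue price for the given rows."""
    red = [price for _, color, price in rows if color == "red"]
    blue = [price for _, color, price in rows if color == "blue"]
    return mean(red) - mean(blue)


# Ignoring car type mixes the color effect with the sports/family effect.
naive_gap = color_gap(sales)

# Blocking on car type: compare colors within each type, then average.
car_types = sorted({car_type for car_type, _, _ in sales})
blocked_gap = mean(color_gap([r for r in sales if r[0] == t]) for t in car_types)

print(naive_gap)    # 12.0
print(blocked_gap)  # 2
```

Most of the apparent red-versus-blue difference in this toy data comes from car type, which is exactly what blocking is meant to separate out.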
Under what conditions should you run A/B tests?
When you can collect data quickly, when the data is representative, and when the amount of data
is small compared to the whole population
Do you have to decide the sample size ahead of time for A/B tests
no, and we can run the hypothesis test anytime we want
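As one concrete example of such a hypothesis test, here is a sketch of a two-proportion z-test on click counts. The choice of test, the function name, and the counts are assumptions for illustration; the text above does not specify which test is used.

```python
from math import erf, sqrt


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two success rates."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))  # standard normal CDF at |z|
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical counts: version A got 100 clicks in 1000 views, B got 130 in 1000.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(round(z, 2), round(p_value, 3))  # z is about 2.10, p is about 0.035
```

The same function can be rerun as more data arrives, keeping in mind that every rerun of a fixed-threshold test is another chance of a false positive.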
What is full factorial design
you test every combination and then use ANOVA to determine importance of each factor
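Enumerating every combination is a one-liner with `itertools.product`. The factor names and levels below are hypothetical, and the ANOVA step is not shown.

```python
from itertools import product

# Hypothetical web-page experiment: three factors with 2, 3, and 2 levels.
factors = {
    "font": ["serif", "sans"],
    "color": ["red", "blue", "green"],
    "button": ["left", "right"],
}

# One run per combination of levels: 2 * 3 * 2 = 12 runs.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

print(len(runs))  # 12
print(runs[0])    # {'font': 'serif', 'color': 'red', 'button': 'left'}
```

The number of runs is the product of the level counts, which is why full factorial designs grow quickly as factors are added.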
What is fractional factorial design
when you test a subset of the entire set of combinations
What is a balanced design?
You test each choice the same # of times and each pair of choices the same # of times
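The two ideas above can be combined in a small sketch: a half-fraction of a three-factor, two-level design that is still balanced. The construction used here (keep the runs where the product of the coded levels is +1) is one standard choice, and the `is_balanced` helper is hypothetical.

```python
from collections import Counter
from itertools import combinations, product

# Full factorial for three two-level factors, coded as -1 / +1: 8 runs.
full = list(product([-1, 1], repeat=3))

# Half-fraction: keep only the runs whose three levels multiply to +1.
half = [run for run in full if run[0] * run[1] * run[2] == 1]


def is_balanced(runs):
    """True if every level, and every pair of levels, appears equally often."""
    k = len(runs[0])
    levels = [{run[i] for run in runs} for i in range(k)]
    for i in range(k):
        counts = Counter(run[i] for run in runs)
        if len(set(counts.values())) != 1:
            return False
    for i, j in combinations(range(k), 2):
        counts = Counter((run[i], run[j]) for run in runs)
        all_pairs_seen = len(counts) == len(levels[i]) * len(levels[j])
        if not all_pairs_seen or len(set(counts.values())) != 1:
            return False
    return True


print(half)               # 4 of the 8 runs
print(is_balanced(half))  # True
```

Here each level of each factor is tested twice and each pair of levels once, so the design is balanced with half the runs of the full factorial.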
When does regression work well to determine important factors?
If there aren't significant interactions between the factors.
what is exploration?
focusing on getting more information; in this case, to determine with more certainty which ad is
really the best
What is exploitation?
Focusing on getting immediate value; in this example, showing the ad that seems to be
doing best so far, because it seems most likely to be clicked.
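To make the two terms concrete, here is a minimal epsilon-greedy sketch for the ad example. It is one simple strategy for mixing exploration and exploitation, not necessarily the one this guide has in mind; the click rates, `epsilon`, and function name are hypothetical.

```python
import random


def epsilon_greedy(true_click_rates, rounds=5000, epsilon=0.1, seed=0):
    """Show ads for `rounds` rounds, exploring with probability epsilon."""
    rng = random.Random(seed)
    n_ads = len(true_click_rates)
    shows = [0] * n_ads
    clicks = [0] * n_ads
    for _ in range(rounds):
        if rng.random() < epsilon or 0 in shows:
            # Exploration: show a random ad to learn more about it.
            ad = rng.randrange(n_ads)
        else:
            # Exploitation: show the ad with the best observed click rate.
            ad = max(range(n_ads), key=lambda i: clicks[i] / shows[i])
        shows[ad] += 1
        if rng.random() < true_click_rates[ad]:
            clicks[ad] += 1
    return shows, clicks


# Hypothetical true click rates, unknown to the algorithm.
shows, clicks = epsilon_greedy([0.05, 0.10, 0.30])
print(shows)
print(clicks)
```

A larger `epsilon` spends more rounds learning which ad is best; a smaller one spends more rounds showing the current favorite.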
what is the multi-armed bandit approach and how does it balance exploration and
exploitation.