ISYE 6501 – QUIZ 2: KEY CONCEPTS, QUESTIONS, AND ANSWERS
Elastic Net
Constrains a combination of the absolute values of the coefficients and their squares.
Choose tau and upsilon.
Ridge Regression
Elastic Net with the absolute value term taken out.
Doesn't do variable selection, but can lead to better predictive models.
LASSO
Constraint added to the standard regression equation.
Budget t applied to the model: the sum of the absolute values of the coefficients cannot exceed t.
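As a sketch, the three constraints described above can be written side by side. The notation is an assumption, not taken from these notes: a_0 is the intercept, a_j are the coefficients, x_ij and y_i are the data, t is the budget, and lambda (between 0 and 1) is a mixing weight for Elastic Net.

```latex
\begin{aligned}
\text{All three minimize}\quad & \sum_{i=1}^{n}\Bigl(y_i - a_0 - \sum_{j=1}^{m} a_j x_{ij}\Bigr)^2 \quad \text{subject to:} \\
\text{LASSO:}\quad & \sum_{j=1}^{m} |a_j| \le t \\
\text{Ridge Regression:}\quad & \sum_{j=1}^{m} a_j^2 \le t \\
\text{Elastic Net:}\quad & \lambda \sum_{j=1}^{m} |a_j| + (1-\lambda) \sum_{j=1}^{m} a_j^2 \le t
\end{aligned}
```

Because the constraints depend on the size of the coefficients, the data is normally scaled before fitting any of the three.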
Greedy Variable Selection
Forward Selection
Backward Selection
Step-wise Regression
Global Variable Selection
LASSO
Elastic Net
LASSO: How to choose t?
Depends on:
1) Number of variables you want
2) Quality of the model as you add more variables
Pros and Cons:
1) Forward Selection
2) Backward Selection
3) Step-wise Regression
Pros:
Good for quick initial analysis to identify variables of potential importance.
Cons:
Often don't perform as well on additional (new) data.
Pros and Cons:
1) LASSO
2) Elastic Net
Pros:
Better at prediction than greedy selection
Cons:
Slower
What's the difference between LASSO, Ridge Regression, and Elastic Net?
What difference does the sum of squared coefficients make?
The quadratic term/constraint in ridge regression tends to shrink (regularize) the coefficients.
Shrinking adds bias but reduces variance, which can result in a better model.
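The shrinking behavior can be seen in a minimal sketch with a single predictor and no intercept. This is an illustration under assumptions, not the course's formulation: it uses the penalized form (a hypothetical penalty weight lam added to the squared error) instead of the budget form, and the function names and data are made up.

```python
import math

def _sums(x, y):
    """Return (sum of x*y, sum of x*x) for a single predictor."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy, sxx

def ols_coef(x, y):
    """Ordinary least squares slope: minimizes sum (y - b*x)^2."""
    sxy, sxx = _sums(x, y)
    return sxy / sxx

def ridge_coef(x, y, lam):
    """Minimizes sum (y - b*x)^2 + lam * b^2: shrinks b but never to exactly 0."""
    sxy, sxx = _sums(x, y)
    return sxy / (sxx + lam)

def lasso_coef(x, y, lam):
    """Minimizes sum (y - b*x)^2 + lam * |b|: soft-thresholds b, possibly to 0."""
    sxy, sxx = _sums(x, y)
    shrunk = max(abs(sxy) - lam / 2.0, 0.0)
    return math.copysign(shrunk, sxy) / sxx

# Hypothetical data where y = 2x exactly.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.0, -2.0, 0.0, 2.0, 4.0]

for lam in (0.0, 10.0, 40.0):
    print(lam, ridge_coef(x, y, lam), lasso_coef(x, y, lam))
```

With a large enough penalty the LASSO coefficient becomes exactly zero (the variable is dropped), while the ridge coefficient only gets smaller. Elastic Net combines both penalties, so it can do both.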
Pros and Cons of Elastic Net (Only)
Pros:
1) Variable selection benefits of LASSO
2) Predictive benefits of Ridge Regression
Cons:
1) Like LASSO, arbitrarily rules out some correlated variables.
2) Like Ridge Regression, underestimates coefficients of very predictive variables.
Design of Experiments (DOE)
Choosing a set of tests to be made to find the effect of input variables on an outcome.
Blocking Factor
Something that creates variation in an experiment which is not the subject of study.
A/B Testing Requirements
1) Collect data quickly
2) Data must be representative
3) Amount of data is small compared to the whole population
Bernoulli Distribution
Full Factorial Design
Test of all possible combinations of factor values over multiple factors to find each one's effect, and
interaction effects, on the outcome.
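A minimal sketch of enumerating a full factorial design. The factor names and levels are hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical factors, each with two levels.
factors = {
    "font": ["serif", "sans"],
    "color": ["blue", "green"],
    "size": ["small", "large"],
}

def full_factorial(factors):
    """Return one run (a dict of factor -> level) per combination of levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]

runs = full_factorial(factors)
for run in runs:
    print(run)
```

Three factors with two levels each give 2 x 2 x 2 = 8 runs. The count multiplies with every added factor, which is why a fractional design is often used instead.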
Binomial Distribution
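The probability distribution of a binomial random variable: the number of successes in n independent
Bernoulli trials, each with the same success probability p.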
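A minimal sketch of the binomial probability mass function. The function name and the parameter names n, k, and p are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))
```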
Fractional Factorial Design
Test of a subset of all possible combinations of factor values over multiple factors. If chosen well, the
desired effects of factors and factor interaction effects can be obtained.
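One common way to choose the subset is a half fraction of a two-level design. The sketch below is an assumption-laden example, not a method from these notes: factors A, B, and C are coded as -1/+1, and the level of C is set to the product A*B, which cuts 8 runs down to 4.

```python
from itertools import product

def half_fraction():
    """Half fraction of a 3-factor, 2-level design: C is generated as A * B."""
    return [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

design = half_fraction()
for a, b, c in design:
    print(a, b, c)
```

Each factor still appears at each level equally often, and every pair of factors sees all four level combinations. The price is that each main effect can no longer be separated from the interaction of the other two factors.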
Geometric Distribution
The probability distribution of a geometric random variable X, the number of failed Bernoulli trials before
the first success is seen: all possible values of X and their associated probabilities.
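A minimal sketch of the geometric probabilities, assuming X counts the failures before the first success and each trial succeeds with probability p. The function and parameter names are illustrative.

```python
def geometric_pmf(k, p):
    """Probability of k failures followed by the first success."""
    return (1 - p) ** k * p

# Probability that the first success comes after exactly 2 failures when p = 0.5.
print(geometric_pmf(2, 0.5))
```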
Control
1. A variable whose value remains constant for all runs of an experiment, so changes in this variable
don't affect the experiment.
2. Design an experiment where some factors ("controls" by definition (1)) are held constant to avoid
them affecting the outcome.
Poisson Distribution
Probability distribution for the number of arrivals during each time period
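A minimal sketch of the Poisson probability mass function, assuming lam is the average number of arrivals per time period. The names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k arrivals in a period when the average is lam."""
    return lam**k * exp(-lam) / factorial(k)

# Probability of seeing no arrivals in a period that averages 2 arrivals.
print(poisson_pmf(0, 2.0))
```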
Multi-Armed Bandit
Start with K alternatives, each with an equal probability of being selected.
Model that allows the tradeoff between exploration of unknown resources and exploitation of known
resources to optimize output.
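The exploration/exploitation tradeoff can be sketched with an epsilon-greedy rule. This is a simple stand-in chosen for illustration, not necessarily the approach the course uses, and the success rates, epsilon, and function name are hypothetical.

```python
import random

def epsilon_greedy(true_rates, rounds=2000, epsilon=0.1, seed=0):
    """Simulate K alternatives; return (pulls, wins) per alternative."""
    rng = random.Random(seed)
    k = len(true_rates)
    pulls = [0] * k
    wins = [0] * k
    for _ in range(rounds):
        if 0 in pulls or rng.random() < epsilon:
            # Explore: try an alternative at random (always, until each has been tried).
            arm = rng.randrange(k)
        else:
            # Exploit: choose the alternative with the best observed success rate.
            arm = max(range(k), key=lambda a: wins[a] / pulls[a])
        pulls[arm] += 1
        if rng.random() < true_rates[arm]:
            wins[arm] += 1
    return pulls, wins

# Hypothetical true success rates, unknown to the algorithm.
pulls, wins = epsilon_greedy([0.2, 0.5, 0.8])
print(pulls, wins)
```

A larger epsilon spends more trials learning about every alternative. A smaller one commits sooner to whichever alternative currently looks best.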
Exponential Distribution
A probability distribution associated with the time between arrivals
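A minimal sketch of the exponential distribution, assuming rate is the average number of arrivals per unit time. If arrivals are Poisson with that rate, the time between arrivals is exponential with mean 1/rate. The names are illustrative.

```python
from math import exp

def exponential_cdf(t, rate):
    """Probability that the next arrival occurs within time t."""
    if t < 0:
        return 0.0
    return 1.0 - exp(-rate * t)

# With 1 arrival per unit time on average, chance the next arrival comes within 1 unit.
print(exponential_cdf(1.0, 1.0))
```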
Response Surface
Sequential experimentation strategy to understand the relationship between response and input
factors, and/or optimize the response.
Weibull Distribution
A mathematical distribution showing the probability of failure or survival of a material as a function of
the stress
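A minimal sketch of the Weibull cumulative distribution, read here as the probability of failure at or below a given stress x. The shape and scale parameter names are assumptions, not from these notes.

```python
from math import exp

def weibull_cdf(x, shape, scale):
    """Probability of failure at or below x for a Weibull(shape, scale) distribution."""
    if x <= 0:
        return 0.0
    return 1.0 - exp(-((x / scale) ** shape))

def weibull_survival(x, shape, scale):
    """Probability of surviving beyond x."""
    return 1.0 - weibull_cdf(x, shape, scale)

print(weibull_cdf(1.0, 1.0, 1.0))
```

With shape equal to 1 this reduces to the exponential distribution. A shape above 1 means failure becomes more likely as x grows, and a shape below 1 means it becomes less likely.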
Q-Q Plot
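Plot of the observed quantiles against the quantiles expected under a reference distribution (or another
data set). Points close to a straight line suggest the two distributions are similar.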
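A minimal sketch of computing the points of a normal Q-Q plot with the Python standard library. The data values, the plotting positions (i + 0.5) / n, and the function name are assumptions for illustration.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(data):
    """Pair each sorted observation with the quantile expected under a fitted normal."""
    ordered = sorted(data)
    n = len(ordered)
    reference = NormalDist(mean(ordered), stdev(ordered))
    expected = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, ordered))

# Hypothetical sample.
sample = [2.1, 3.4, 1.9, 2.8, 3.0]
points = normal_qq_points(sample)
for expected_q, observed_q in points:
    print(round(expected_q, 3), observed_q)
```

Plotting these pairs and comparing them with the 45-degree line gives the Q-Q plot.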