Regression Final
1. Interpret interaction effect
The effect of X1 on Y can be interpreted, but it depends on the value X2 is fixed at.
The effect of a unit change in X1 on the predicted value of Y, when X2 is fixed at some value x2, is given by B1 + B3*x2.
B1: the normal slope for the first variable (before the colon)
B3: the interaction slope
x2: the value of the second variable (after the colon)
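The formula B1 + B3*x2 can be checked numerically. A minimal numpy sketch on made-up synthetic data (the true coefficients and x2_fixed below are illustrative assumptions, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# True model: Y = 1 + 2*X1 + 0.5*X2 + 1.5*X1*X2 + noise
y = 1 + 2 * x1 + 0.5 * x2 + 1.5 * x1 * x2 + rng.normal(scale=0.1, size=n)

# Design matrix: intercept, X1, X2, and the interaction X1*X2
X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

# Effect of a unit change in X1 when X2 is fixed at x2_fixed:
x2_fixed = 1.0
effect = b1 + b3 * x2_fixed  # estimates 2 + 1.5*1.0 = 3.5
```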
2. Inclusion of Main Effects
Models with interaction terms should include the individual effects of the interacting variables.
Ex: we do NOT fit models like Y = B0 + B1*X1 + B2*X1*X2, because it is missing a slope for X2 itself -- the main effect.
3. Parallel Lines Assumption
Models without interaction terms imply the effects of predictors on the outcome do not depend on one another.
I.e., the slope (effect) of flipper_length on mass is the same for both male and female penguins.
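The parallel-lines idea can be seen by fitting a no-interaction model to two groups. A sketch with penguin-like synthetic numbers (the data below is made up for illustration, not the real penguins data):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
flipper = rng.uniform(170, 230, size=n)   # flipper_length in mm
male = rng.integers(0, 2, size=n)         # 0 = female, 1 = male
mass = 20 * flipper + 500 * male + rng.normal(scale=50, size=n)

# No-interaction model: mass ~ flipper + male
X = np.column_stack([np.ones(n), flipper, male])
b0, b1, b2 = np.linalg.lstsq(X, mass, rcond=None)[0]

# One shared slope b1 applies to both sexes: the two fitted lines are
# parallel, differing only by the vertical shift b2 for males.
```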
4. Higher Order Interactions (pros and cons)
Pros:
- More complex models can sometimes be warranted
- More fine-tuned understanding of the relationships between variables, if that is needed for your application
- Sometimes better predictions
Cons:
- Risk of overfitting
- Models difficult to interpret
- Difficult to visualize
- Often, there is little practical significance, even
if there is statistical significance
5. How do you test for interactions?
You can run an ANOVA test comparing the models with and without the interaction term. If p < 0.05, we say it is good to include the interaction.
If the interaction is significant, keep the main effects in the model as well.
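One way to carry out this comparison by hand is a partial F-test on the nested models; in practice you would use your software's anova function. A numpy sketch on synthetic data with a deliberately strong true interaction:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + x1 + x2 + 2 * x1 * x2 + rng.normal(size=n)

def rss(X, y):
    """Residual sum of squares of the least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ beta) ** 2))

ones = np.ones(n)
rss_main = rss(np.column_stack([ones, x1, x2]), y)           # main effects only
rss_full = rss(np.column_stack([ones, x1, x2, x1 * x2]), y)  # + interaction

# Partial F statistic for the single added interaction term:
F = (rss_main - rss_full) / (rss_full / (n - 4))
# Compare F to an F(1, n - 4) distribution; here the true interaction is
# strong, so F lands far above the usual ~3.9 cutoff for p < 0.05.
```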
6. Causes of Overfitting
Overfitting occurs when the model used is too complex relative to the amount of data you have. For us, this usually means too many predictors. Rule of thumb: at most 1 predictor for every 10 rows of data.
7. Detecting Overfitting
You can use cross-validation with training and test data. Overfitting shows up as much lower performance on the test data.
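A minimal train/test sketch of how overfitting shows up, using a deliberately over-flexible polynomial on made-up data (the degree and sample sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(-1, 1, size=40)
y = x + rng.normal(scale=0.3, size=40)

x_tr, y_tr = x[:20], y[:20]   # training half
x_te, y_te = x[20:], y[20:]   # held-out test half

def poly_design(x, degree):
    """Design matrix with columns 1, x, x^2, ..., x^degree."""
    return np.column_stack([x ** d for d in range(degree + 1)])

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Too complex: a degree-12 polynomial fit to only 20 points
deg = 12
beta, *_ = np.linalg.lstsq(poly_design(x_tr, deg), y_tr, rcond=None)
train_rmse = rmse(poly_design(x_tr, deg) @ beta, y_tr)
test_rmse = rmse(poly_design(x_te, deg) @ beta, y_te)
# Overfitting: error is far higher on the test data than on the training data
```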
8. Interpret LOO R^2 vs R^2
A LOO R^2 (or LOO RMSE) much worse than the in-sample value also indicates overfitting.
R^2: the squared correlation between the outcome and the predicted values.
LOO R^2: the squared correlation between Yi and Yhat_i(-i), the prediction for observation i from the model fit without observation i.
- LOO R^2 gives a more robust estimate of model performance on unseen data compared to R^2.
- LOO R^2 is often lower than R^2, but is a better estimate of how well the model will perform on new data.
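For linear models the LOO predictions Yhat_i(-i) do not require refitting n times: the hat-matrix shortcut e_i / (1 - h_ii) gives the leave-one-out residuals directly. A numpy sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
resid = y - fitted

# R^2: squared correlation between outcome and fitted values
r2 = np.corrcoef(y, fitted)[0, 1] ** 2

# LOO shortcut for linear models: e_i(-i) = e_i / (1 - h_ii),
# where h_ii are the diagonal entries of the hat matrix
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
loo_pred = y - resid / (1 - h)

# LOO R^2: squared correlation between Yi and Yhat_i(-i)
loo_r2 = np.corrcoef(y, loo_pred)[0, 1] ** 2
# loo_r2 typically sits a bit below r2; a large gap would signal overfitting
```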