ISYE 6414 MIDTERM PREP EXAM: QUESTIONS WITH CORRECT ANSWERS
The mean sum of square errors in ANOVA measures variability within groups. - Answer-
True
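A minimal Python sketch of the statement above, assuming a small made-up data set: the mean squared error is the sum of squared deviations of each observation from its own group mean, divided by N - k. The function name anova_mse and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean

def anova_mse(groups):
    """Return (SSE, degrees of freedom, MSE) for a one-way ANOVA layout."""
    n_total = sum(len(g) for g in groups)  # N: total number of observations
    k = len(groups)                        # k: number of groups
    # Squared deviations of each observation from its OWN group mean,
    # so only within-group variability enters the sum.
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df = n_total - k
    return sse, df, sse / df

# Illustrative (made-up) data: three groups of three observations each.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0, 11.0], [1.0, 2.0, 3.0]]
sse, df, mse = anova_mse(groups)

# Shifting one whole group changes the between-group spread only,
# so the MSE is unchanged.
shifted = [[4.0, 5.0, 6.0], [107.0, 109.0, 111.0], [1.0, 2.0, 3.0]]
_, _, mse_shifted = anova_mse(shifted)
print(mse, mse_shifted)
```

Moving an entire group up by a constant leaves the MSE untouched, which is one way to see that it reflects only within-group variability.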
Only the log-transformation of the response variable should be used when the normality
assumption does not hold. - Answer-False
If one confidence interval in the pairwise comparison includes only positive values, we
conclude that the difference in means is positive, and statistically significant. - Answer-
True
The number of degrees of freedom of the χ^2 (chi-square) distribution for the pooled
variance estimator is N − k + 1 where k is the number of samples. - Answer-False
For assessing the normality assumption of the ANOVA model, we can use the quantile-
quantile normal plot and the histogram of the residuals. - Answer-True
One-way ANOVA is a linear regression model with more than one qualitative predicting
variables. - Answer-False
The sampling distribution for the variance estimator in ANOVA is χ^2 (chi-square) with N
- k degrees of freedom. - Answer-False
In simple linear regression, we can diagnose the assumption of constant-variance by
plotting the residuals against fitted values. - Answer-True
If response variable Y has a quadratic relationship with a predictor variable X, it is
possible to model the relationship using multiple linear regression. - Answer-True
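A minimal Python sketch of why this works, assuming noise-free made-up data: adding x^2 as a second predictor keeps the model linear in its coefficients, so ordinary least squares applies. The helper names solve and fit_quadratic are hypothetical, and the normal equations are solved by hand only to keep the example free of third-party libraries.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    # Design matrix columns: 1, x, x^2. The model is linear in b0, b1, b2.
    design = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(3)]
    return solve(xtx, xty)

# Illustrative (made-up) data generated from y = 1 + 2x + 3x^2 with no noise.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
b0, b1, b2 = fit_quadratic(xs, ys)
print(b0, b1, b2)
```

In R the same idea is usually written with an extra term in the formula, for example lm(y ~ x + I(x^2)).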
The R^2 value represents the percentage of variability in the response that can be
explained by the linear regression on the predictors. Models with higher R^2 are always
preferred over models with lower R^2 . - Answer-False
For the model y = β_0 + β_1 x_1 + ... + β_p x_p + ϵ, where ϵ ∼ N(0, σ^2), there are p+1
parameters to be estimated. - Answer-False
The F-test can be used to evaluate the relationship between two qualitative variables. -
Answer-False
The Partial F-Test can test whether a subset of regression coefficients are all equal to
zero. - Answer-True
In multiple linear regression, controlling variables are used to control for sample bias. -
Answer-True
In a multiple regression model with 7 predicting variables, the sampling distribution of
the estimated variance of the error terms is a chi-squared distribution with n-8 degrees
of freedom. - Answer-True
There are four assumptions needed for estimation with multiple linear regression: mean
zero, constant variance, independence, and normality. - Answer-False (why?)
Let Y^∗ be the predicted response at x^∗ . The variance of Y^∗ given x^∗ depends on
both the value of x^∗ and the design matrix. - Answer-True (but the wording was
confusing, so everyone got credit no matter what on this question)
Suppose x1 was not found to be significant in the model specified with lm(y ~ x1 + x2 +
x3). Then x1 will also not be significant in the model lm(y ~ x1 + x2). - Answer-False
When estimating confidence values for the mean response for all instances of the
predicting variables, we should use a critical point based on the F-distribution to correct
for the simultaneous inference. - Answer-True
For estimating confidence intervals for the regression coefficients, the sampling
distribution used is a normal distribution. - Answer-False
In a multiple linear regression model with quantitative predictors, the coefficient
corresponding to one predictor is interpreted as the estimated expected change in the
response variable when there is a one unit change in that predictor. - Answer-False
It is possible to produce a model where the overall F-statistic is significant but all the
regression coefficients have insignificant t-statistics. - Answer-True. (explanation: This
can happen when two or more of the predictors are multicollinear. In that case, the
overall model is significant, but the individual predictors may not be, because any one
of the collinear predictors could stand in for the others.)
Analysis of Variance (ANOVA) is an example of a multiple regression model. - Answer-
True
For a multiple regression model, both the true errors ϵ and the estimated residuals ϵ-hat
have a constant mean and a constant variance. - Answer-False
If the p-value of the overall F-test is close to 0, we can conclude all the predicting
variable coefficients are significantly nonzero. - Answer-False