Questions and Answers | 2026 Update
MGT 6203 – DATA ANALYTICS IN BUSINESS
Midterm Exam (Part 1 & Part 2)
Total Questions: 200
Format: Multiple Choice
Topics: Regression, Hypothesis Testing, Machine Learning, Time Series,
Visualization, Business Applications
Time Limit: 4 hours
Section 1: Regression Analysis (Questions 1–35)
1. A company wants to predict monthly sales based on advertising spend.
The regression output shows: Intercept = 50, Advertising coefficient = 4, and
R² = 0.85. What is the correct interpretation of the advertising coefficient?
A) For every extra unit of advertising spend, sales increase by 50 units
B) For every extra unit of advertising spend, sales increase by 4 units
C) Sales are 85% correlated with advertising
D) Advertising explains 4% of sales variation
Answer: B
Rationale: The coefficient of 4 means that each additional unit of
advertising spend is associated with a 4-unit increase in predicted sales.
The intercept of 50 is the predicted sales when advertising spend is zero,
which is the value option A confuses with the slope. The R² of 0.85
indicates that 85% of the variation in sales is explained by the model.
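As a minimal sketch of how the fitted line in Question 1 would be used, the snippet below plugs advertising values into Sales = 50 + 4 × Advertising. The function name `predict_sales` and the spend levels of 10 and 11 are illustrative assumptions, not part of the exam.

```python
# Fitted values taken from Question 1; the function name is hypothetical.
INTERCEPT = 50.0
AD_COEFFICIENT = 4.0


def predict_sales(ad_spend: float) -> float:
    """Predicted sales from the fitted line: 50 + 4 * ad_spend."""
    return INTERCEPT + AD_COEFFICIENT * ad_spend


# Raising spend by one unit changes the prediction by exactly the coefficient.
difference = predict_sales(11) - predict_sales(10)
print(difference)  # 4.0
```

The intercept only shifts the whole line up or down; it cancels out of the difference, which is why the per-unit effect equals the slope.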
2. In a simple linear regression model Y = β₀ + β₁X + ε, what does β₁
represent?
A) The value of Y when X is zero
B) The change in Y for a one-unit change in X
C) The error term
D) The correlation between X and Y
Answer: B
Rationale: β₁ is the slope coefficient, representing the expected change in
the dependent variable Y for each one-unit increase in the independent
variable X. β₀ is the intercept (the expected value of Y when X = 0).
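To make the roles of β₀ and β₁ concrete, here is a small sketch that estimates both with the closed-form least-squares formulas. The helper name `fit_simple_ols` and the five data points are made up for illustration; the data lie exactly on Y = 3 + 2X, so the fit should recover those values.

```python
def fit_simple_ols(x, y):
    """Return (beta0, beta1) for Y = beta0 + beta1 * X by least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta1 = s_xy / s_xx
    # Intercept: forces the line through the point (mean_x, mean_y).
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Hypothetical data generated from Y = 3 + 2X with no noise.
x = [1, 2, 3, 4, 5]
y = [5, 7, 9, 11, 13]
beta0, beta1 = fit_simple_ols(x, y)
print(beta0, beta1)  # 3.0 2.0
```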
3. A high R² value (e.g., 0.95) in a regression model indicates:
A) The model has no bias
B) The model explains 95% of the variance in the dependent variable
C) The independent variables are all statistically significant
D) The model will predict perfectly on new data
Answer: B
Rationale: R² measures the proportion of variance in the dependent variable
that is explained by the independent variables. However, a high R² does not
guarantee that the model is correctly specified or that it will generalize to
new data; a very high value can also be a symptom of overfitting.
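The definition of R² can be sketched directly as 1 minus the ratio of residual to total sum of squares. The function name and the four observations below are illustrative assumptions, chosen so the result comes out near the 0.95 mentioned in the question.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)              # total
    return 1 - ss_res / ss_tot


# Hypothetical observed values and in-sample fitted values.
y = [2.0, 4.0, 6.0, 8.0]
y_hat = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(y, y_hat)
print(round(r2, 2))  # 0.95
```

Note that this is an in-sample figure; it says nothing about how the same model would score on data it has not seen.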
4. A residual plot showing a funnel shape (widening spread as fitted values
increase) suggests:
A) Homoscedasticity – constant variance of errors
B) Heteroscedasticity – non-constant variance of errors
C) Normality of residuals
D) No autocorrelation
Answer: B
Rationale: Heteroscedasticity occurs when the variance of residuals is not
constant across all levels of the independent variable. A funnel shape is a
classic visual indicator of heteroscedasticity, which violates a key ordinary
least squares (OLS) assumption.
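A rough numerical counterpart to the funnel-shaped plot is to compare the residual variance at low fitted values with the variance at high fitted values. The sketch below is only a crude illustration of that idea, not a formal test; the function name `spread_ratio` and the data are assumptions.

```python
from statistics import pvariance


def spread_ratio(fitted, residuals):
    """Variance of residuals in the upper half of fitted values divided by
    the variance in the lower half. Values far above 1 hint at a funnel."""
    pairs = sorted(zip(fitted, residuals))  # order by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    return pvariance(high) / pvariance(low)


# Hypothetical residuals whose spread widens as the fitted value grows.
fitted = [1, 2, 3, 4, 5, 6, 7, 8]
residuals = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]
ratio = spread_ratio(fitted, residuals)
print(ratio > 1)  # True: the spread is larger at high fitted values
```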
5. A company predicts revenue using advertising spend and number of
salespeople. The output shows: Advertising coefficient = 3 (p = 0.02),
Salespeople coefficient = 5 (p = 0.10). Which statement is correct?
A) Both variables significantly increase revenue at α = 0.05
B) Advertising significantly increases revenue; salespeople effect is not
statistically significant
C) Neither variable is significant
D) The model explains 100% of revenue variation
Answer: B
Rationale: The p-value for advertising (0.02) is less than α = 0.05, so its
coefficient is statistically significant. The p-value for salespeople (0.10)
exceeds 0.05, so we fail to reject the null hypothesis that its coefficient
equals zero.
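The decision rule in Question 5 amounts to comparing each p-value with α. A minimal sketch, using the p-values from the question (the dictionary layout and variable names are illustrative assumptions):

```python
ALPHA = 0.05

# p-values as reported in Question 5.
p_values = {"advertising": 0.02, "salespeople": 0.10}

# A coefficient is flagged as significant when its p-value is below alpha.
significant = {name: p < ALPHA for name, p in p_values.items()}

for name, flag in significant.items():
    verdict = "reject H0" if flag else "fail to reject H0"
    print(f"{name}: p = {p_values[name]:.2f} -> {verdict}")
```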
6. In multiple regression, the adjusted R² differs from R² because:
A) Adjusted R² always increases when adding variables
B) Adjusted R² penalizes the addition of irrelevant predictors
C) Adjusted R² measures only linear relationships
D) Adjusted R² cannot be negative
Answer: B
Rationale: Adjusted R² includes a penalty for each additional predictor
variable, preventing the artificial inflation of R² that occurs when adding
variables that do not improve the model. It helps prevent overfitting by
favoring more parsimonious models.
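The penalty can be seen in the usual formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of observations and k the number of predictors. The sketch below applies it to two hypothetical models; the sample size of 50 and the R² values are assumed for illustration only.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof


# Hypothetical: three extra predictors raise R^2 only from 0.850 to 0.851.
small_model = adjusted_r_squared(0.850, n_obs=50, n_predictors=2)
large_model = adjusted_r_squared(0.851, n_obs=50, n_predictors=5)
print(round(small_model, 4), round(large_model, 4))
print(large_model < small_model)  # True: the tiny R^2 gain does not pay for the penalty
```

Because the penalty factor exceeds 1 whenever k ≥ 1, adjusted R² can fall below zero for a model with a very low R², which is why option D is false.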
7. What is the correct interpretation of a 95% confidence interval for a
regression coefficient of [2.5, 5.5]?
A) There is a 95% chance the true coefficient lies between 2.5 and 5.5
B) We are 95% confident the true population coefficient is between 2.5 and
5.5
C) The coefficient is significant at α = 0.01
D) 95% of the data falls within this range
Answer: B
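Rationale: A confidence interval provides a range of plausible values for the
population parameter. Under the frequentist interpretation the true
coefficient is fixed, so the 95% describes the procedure (across repeated
samples, 95% of intervals built this way would contain the true value), not a
probability for this one interval, which is why A is rejected. If the interval
does not contain zero, the coefficient is statistically significant at the
corresponding α level (here, α = 0.05, since 95% confidence corresponds to 5%
significance).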
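The link between a confidence interval and significance reduces to checking whether zero lies inside the interval. A minimal sketch, with a hypothetical helper name and a second interval invented for contrast:

```python
def excludes_zero(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


# Interval from Question 7: zero is outside, so significant at alpha = 0.05.
print(excludes_zero(2.5, 5.5))   # True

# Hypothetical interval that straddles zero: not significant at that level.
print(excludes_zero(-1.0, 3.0))  # False
```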
8. Which of the following is NOT an assumption of ordinary least squares
(OLS) regression?
A) Linearity between predictors and outcome
B) Independence of errors
C) Errors are normally distributed
D) The dependent variable must be normally distributed
Answer: D
Rationale: OLS does not require the dependent variable to be normally
distributed. The classical assumptions concern the errors: they should be
independent, have constant variance (homoscedasticity), and, for exact t- and
F-based inference, be normally distributed.
9. A residual plot with no discernible pattern and points randomly scattered
around zero suggests:
A) The model is likely appropriate (homoscedasticity and linearity)
B) The model has severe heteroscedasticity
C) The model is overfitted
D) The independent variables are multicollinear
Answer: A
Rationale: A random scatter of residuals around zero with constant spread
indicates that the regression assumptions of linearity, homoscedasticity, and
independence are reasonably satisfied. Systematic patterns would indicate
violations.
10. Variance Inflation Factor (VIF) is used to detect:
A) Heteroscedasticity
B) Autocorrelation
C) Multicollinearity among independent variables
D) Non-normality of residuals
Answer: C
Rationale: VIF quantifies how much the variance of a regression coefficient is
inflated due to correlation with other predictors. A VIF above 5 (or above 10,
under a more lenient rule of thumb) is commonly taken to indicate problematic
multicollinearity, which can make coefficient estimates unstable and difficult
to interpret.
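For predictor j, VIF is defined as 1 / (1 − R²ⱼ), where R²ⱼ is the R² from regressing that predictor on all the other predictors. The sketch below evaluates the formula for a few assumed R²ⱼ values to show where the 5 and 10 thresholds fall; the function name is illustrative.

```python
def variance_inflation_factor(r_squared_j: float) -> float:
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing predictor j
    on the remaining predictors."""
    if not 0.0 <= r_squared_j < 1.0:
        raise ValueError("R_j^2 must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


# Assumed auxiliary R^2 values, from no collinearity to strong collinearity.
for r2 in (0.0, 0.5, 0.8, 0.9):
    print(r2, round(variance_inflation_factor(r2), 2))
```

An R²ⱼ of 0.8 corresponds to a VIF of about 5, and 0.9 to about 10, which is where the two common cut-offs come from.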