2026/2027: Predictive Modeling, Machine
Learning & Analytics Exam Prep with
Solutions
Description:
Struggling to prepare for your ISYE 6501 Midterm 2 exam in 2027? This comprehensive
study guide includes 85+ practice questions and detailed answers covering everything
from predictive modeling, machine learning, and regression analysis to optimization,
simulation, and advanced analytics topics.
Designed specifically for Georgia Tech ISYE 6501 students, this resource aligns with 2027
course standards and includes real exam-style questions, step-by-step explanations, and
keyword-focused content to help you master key concepts like LASSO, Ridge, A/B testing,
missing data, probability distributions, and more. Whether you’re reviewing model selection,
experimental design, or prescriptive analytics, this guide provides the targeted, high-yield
review you need to succeed.
ISYE 6501 Midterm 2 Study Guide 2027: Exam Questions, Answers
& Predictive Modeling Review
1. In predictive modeling, under what circumstances is a model most likely to become overfit?
A) When the dataset is extremely large.
B) When the number of predictors is far smaller than the number of observations.
C) When the number of predictors is comparable to or exceeds the number of observations.
D) When all predictors are highly correlated.
Answer: C) When the number of predictors is comparable to or exceeds the number of
observations.
Explanation: Overfitting tends to occur when model complexity is high relative to the amount
of data available. If the number of factors or features is close to or larger than the number of data
points, the model may capture not only the underlying signal but also noise and random
fluctuations in the training data, leading to poor generalization on new data.
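The effect described above can be seen in a minimal Python sketch. The data, the fixed "noise" values, and the helper names (fit_line, lagrange_predict, mse) are assumptions made for illustration only. A straight line (2 parameters) is compared with an interpolating polynomial that has as many coefficients as there are training points (6).

```python
# Toy data (assumed): the true relationship is y = 2x, plus fixed "noise".
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
train_y = [2 * x + e for x, e in zip(train_x, noise)]
test_x = [0.5, 1.5, 2.5, 3.5, 4.5]
test_y = [2 * x for x in test_x]  # noise-free targets at held-out points


def fit_line(xs, ys):
    """Least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lagrange_predict(xs, ys, x_new):
    """Evaluate the degree len(xs)-1 polynomial passing through every point."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x_new - xj) / (xi - xj)
        total += yi * weight
    return total


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


slope, intercept = fit_line(train_x, train_y)
line_train_mse = mse([slope * x + intercept for x in train_x], train_y)
line_test_mse = mse([slope * x + intercept for x in test_x], test_y)

interp_train_mse = mse([lagrange_predict(train_x, train_y, x) for x in train_x], train_y)
interp_test_mse = mse([lagrange_predict(train_x, train_y, x) for x in test_x], test_y)

print("line:         train MSE", line_train_mse, " test MSE", line_test_mse)
print("interpolant:  train MSE", interp_train_mse, " test MSE", interp_test_mse)
```

The interpolating polynomial reproduces the training points exactly, noise included, so its training error is essentially zero while its error on the held-out points is larger than that of the simple line.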
2. Why are simpler models often preferred over highly complex models in practice?
A) They are guaranteed to have higher predictive accuracy.
B) They require less data, reduce the risk of including irrelevant predictors, and are easier to
interpret.
C) They always include interaction terms.
D) They are immune to multicollinearity.
Answer: B) They require less data, reduce the risk of including irrelevant predictors, and are
easier to interpret.
Explanation: Simpler models are generally more robust because they are less prone to
overfitting, easier to communicate to stakeholders, and require fewer observations to estimate
reliably. They also minimize the inclusion of spurious factors that do not contribute meaningfully
to prediction.
3. What best describes the forward selection method for variable selection?
A) Starting with no predictors, sequentially add the most significant predictor until no further
improvements meet a threshold.
B) Starting with all predictors, remove the least significant one iteratively.
C) A combination of adding and removing predictors at each step.
D) Using a penalty term to shrink coefficients toward zero.
Answer: A) Starting with no predictors, sequentially add the most significant predictor until no
further improvements meet a threshold.
Explanation: Forward selection begins with an empty model. At each step, the predictor that
provides the greatest improvement in model fit (e.g., the largest increase in R², the largest
drop in AIC, or the smallest p-value) is added. The process continues until no remaining
predictor meets a predefined significance threshold.
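The procedure described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the toy data come from a 2^4 factorial design, the helper names (ols_rss, forward_selection) are hypothetical, and a minimum relative drop in the residual sum of squares (RSS) stands in for a p-value or AIC threshold.

```python
from itertools import product

# Toy data (assumed): 16 rows, four +/-1 predictors; only x0 and x2 matter.
rows = [list(r) for r in product([-1.0, 1.0], repeat=4)]
y = [3 * r[0] - 2 * r[2] + 0.1 * r[0] * r[1] * r[3] for r in rows]


def ols_rss(rows, y, cols):
    """RSS of a least-squares fit using an intercept plus the given columns."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, yi in zip(X, y))


def forward_selection(rows, y, min_rel_gain=0.05):
    """Greedily add the predictor that lowers RSS most, while the gain is large enough."""
    remaining = list(range(len(rows[0])))
    selected = []
    current_rss = ols_rss(rows, y, selected)  # intercept-only model
    while remaining and current_rss > 0:
        best = min(remaining, key=lambda c: ols_rss(rows, y, selected + [c]))
        best_rss = ols_rss(rows, y, selected + [best])
        if (current_rss - best_rss) / current_rss < min_rel_gain:
            break  # no candidate improves the fit enough
        selected.append(best)
        remaining.remove(best)
        current_rss = best_rss
    return selected


selected = forward_selection(rows, y)
print(selected)  # [0, 2]
```

On this toy data the sketch adds x0 first (the largest drop in RSS), then x2, and stops because neither remaining predictor reduces the RSS by the assumed 5% minimum.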
4. How does backward elimination differ from forward selection?
A) It starts with all predictors and removes the least significant one at each step.
B) It only considers interaction terms.
C) It requires standardized data.
D) It is a non-greedy algorithm.
Answer: A) It starts with all predictors and removes the least significant one at each step.
Explanation: Backward elimination begins with a model containing all candidate predictors.
At each step, the predictor with the highest p-value (or smallest contribution) is removed,
provided that its p-value exceeds a removal threshold (e.g., p > 0.15). The process stops
when every remaining predictor falls below that threshold.
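A minimal Python sketch of this loop follows. It rests on several assumptions: the toy data come from a 2^4 factorial design, the helper names (ols_rss, backward_elimination) are hypothetical, and a cutoff on the partial F-statistic replaces the p-value threshold. For a single coefficient the partial F-statistic is the square of its t-statistic, so a small F corresponds to a large p-value. The cutoff of 4.0 is an arbitrary illustrative value.

```python
from itertools import product

# Toy data (assumed): 16 rows, four +/-1 predictors; only x0 and x2 matter.
rows = [list(r) for r in product([-1.0, 1.0], repeat=4)]
y = [3 * r[0] - 2 * r[2] + 0.1 * r[0] * r[1] * r[3] for r in rows]


def ols_rss(rows, y, cols):
    """RSS of a least-squares fit using an intercept plus the given columns."""
    X = [[1.0] + [r[c] for c in cols] for r in rows]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                factor = A[r][p] / A[p][p]
                A[r] = [a - factor * b for a, b in zip(A[r], A[p])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, yi in zip(X, y))


def backward_elimination(rows, y, f_to_stay=4.0):
    """Start with all predictors; repeatedly drop the weakest one below the cutoff.

    Assumes more rows than coefficients and a fit that is not perfect,
    so the residual variance estimate is positive.
    """
    kept = list(range(len(rows[0])))
    n = len(rows)
    while kept:
        full_rss = ols_rss(rows, y, kept)
        sigma2 = full_rss / (n - len(kept) - 1)  # residual variance of current model

        def partial_f(c):
            reduced = [k for k in kept if k != c]
            return (ols_rss(rows, y, reduced) - full_rss) / sigma2

        weakest = min(kept, key=partial_f)
        if partial_f(weakest) >= f_to_stay:
            break  # every remaining predictor is above the cutoff
        kept.remove(weakest)
    return kept


kept = backward_elimination(rows, y)
print(kept)  # [0, 2]
```

On this toy data x1 and x3 are dropped (in either order, since both contribute essentially nothing), after which x0 and x2 both have very large partial F-statistics and the loop stops.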
5. What characterizes stepwise regression?
A) It uses only forward selection.
B) It uses only backward elimination.
C) It alternates between forward addition and backward removal of predictors.
D) It is a non-iterative method.
Answer: C) It alternates between forward addition and backward removal of predictors.
Explanation: Stepwise regression is a hybrid approach. After adding a new predictor, the
method re-evaluates all included predictors and may remove any that no longer meet significance
criteria. This allows the model to adapt dynamically as new variables enter.
6. Which algorithmic category do forward selection, backward elimination, and stepwise
regression belong to?
A) Global optimization algorithms.
B) Greedy algorithms.
C) Bayesian algorithms.
D) Ensemble algorithms.
Answer: B) Greedy algorithms.
Explanation: These methods are considered greedy because they make the locally optimal
choice at each step (e.g., adding or removing the single best predictor at that moment) without
considering the broader combination of predictors, which may lead to suboptimal global
solutions.
7. What is the primary mechanism of LASSO regression?
A) It minimizes squared error while limiting the sum of squared coefficients.
B) It minimizes squared error while limiting the sum of absolute coefficients.
C) It uses only categorical predictors.
D) It is unaffected by predictor scaling.
Answer: B) It minimizes squared error while limiting the sum of absolute coefficients.
Explanation: LASSO (Least Absolute Shrinkage and Selection Operator) performs both
variable selection and regularization by minimizing the residual sum of squares subject to a
constraint on the sum of the absolute values of the coefficients. This can force some coefficients
to exactly zero, effectively removing them from the model.
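How the absolute-value penalty produces exact zeros can be seen in a small coordinate-descent sketch in Python. It is an illustration under assumptions, not the course's method: it uses the penalized form (squared error plus lam times the sum of absolute coefficients), which corresponds to the constrained form for a suitable penalty weight. The toy data, the value lam = 0.5, and the helper names (soft_threshold, lasso_coordinate_descent) are hypothetical, and the predictors are assumed to be centered and on a common scale already.

```python
from itertools import product

# Toy data (assumed): 16 rows, four centered +/-1 predictors; only x0 and x2 matter.
X = [list(r) for r in product([-1.0, 1.0], repeat=4)]
y = [3 * r[0] - 2 * r[2] + 0.1 * r[0] * r[1] * r[3] for r in X]


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=50):
    """Minimize (1/(2n)) * sum of squared errors + lam * sum(|beta_j|).

    No intercept is fitted: y and the columns of X are assumed to be centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / scale
    return beta


beta = lasso_coordinate_descent(X, y, lam=0.5)
print([round(b, 6) for b in beta])  # [2.5, 0.0, -1.5, 0.0]
```

The unpenalized least-squares coefficients for this data are 3, 0, -2, and 0. With the penalty, the two useful coefficients shrink by 0.5 toward zero, and the two irrelevant ones are set to exactly zero rather than merely made small.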
8. Why is it essential to scale predictors before applying LASSO?
A) To ensure the intercept is zero.
B) Because the absolute value penalty is sensitive to the magnitude of predictors.