ANALYTICS MODELING GEORGIA INSTITUTE OF TECHNOLOGY | 200 QUESTIONS | 2026 VERIFIED ANSWERS & RATIONALES
Exam Blueprint:
Advanced Classification & Regression (25%) – 50 Qs
Advanced Clustering & Unsupervised Learning (15%) – 30 Qs
Time Series & Forecasting (15%) – 30 Qs
Optimization (Linear, Integer, Non-linear) (20%) – 40 Qs
Simulation & Risk Analysis (15%) – 30 Qs
Data Preparation, Model Evaluation & Special Topics (10%) – 20 Qs
SECTION 1: ADVANCED CLASSIFICATION & REGRESSION – Questions 1–50
1. In logistic regression, the interpretation of the coefficient β₁ for a continuous predictor
x₁ is:
A) A one-unit change in x₁ changes the log-odds of the outcome by β₁
B) A one-unit change in x₁ changes the probability of the outcome by β₁
C) A one-unit change in x₁ changes the odds ratio by β₁
D) A one-unit change in x₁ changes the predicted class by β₁
Answer: A
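Rationale: Logistic regression models logit(p) = β₀ + β₁x₁, so a one-unit increase in x₁
adds β₁ to the log-odds. Equivalently, it multiplies the odds by exp(β₁), which is the odds
ratio; the change in probability is not constant because it depends on the starting value of x₁.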
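The interpretation can be checked numerically. The sketch below is only an illustration: the coefficient values β₀ = -1.0 and β₁ = 0.7 are hypothetical, chosen to show that the log-odds shift by β₁ per unit of x₁ while the probability shift varies.

```python
import math

def sigmoid(z):
    """Inverse of the logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

# Hypothetical coefficients, for illustration only.
beta0, beta1 = -1.0, 0.7

def predict_proba(x1):
    return sigmoid(beta0 + beta1 * x1)

p_at_2, p_at_3 = predict_proba(2.0), predict_proba(3.0)

# A one-unit increase in x1 adds beta1 to the log-odds ...
log_odds_change = math.log(odds(p_at_3)) - math.log(odds(p_at_2))
# ... which multiplies the odds by exp(beta1), the odds ratio.
odds_ratio = odds(p_at_3) / odds(p_at_2)

# The change in probability depends on where x1 starts.
prob_change_low = predict_proba(1.0) - predict_proba(0.0)
prob_change_high = predict_proba(6.0) - predict_proba(5.0)

print(log_odds_change, odds_ratio, prob_change_low, prob_change_high)
```

The two probability changes differ, which is why options B and C do not describe β₁.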
2. Which of the following is a limitation of linear discriminant analysis (LDA)?
A) Assumes that predictors are normally distributed within each class
B) Assumes equal covariance matrices across classes
C) Cannot handle categorical predictors directly (needs transformation)
D) All of the above
Answer: D
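Rationale: LDA makes strong assumptions: predictors are multivariate normal within each
class and all classes share one covariance matrix. Categorical predictors must be encoded
numerically first. QDA is an extension that relaxes the equal-covariance assumption.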
3. Quadratic discriminant analysis (QDA) differs from LDA because QDA:
A) Allows each class to have its own covariance matrix (quadratic decision boundary)
B) Assumes linear decision boundary
C) Is a special case of logistic regression
D) Is non-parametric
Answer: A
Rationale: QDA estimates separate covariance matrices per class, leading to quadratic
boundaries; more flexible but with more parameters.
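The "more parameters" point can be made concrete by counting covariance entries. This is a rough sketch: the function names are hypothetical, and it counts only covariance parameters (class means and priors are the same for both methods).

```python
def covariance_params(n_features):
    """Free parameters in one symmetric covariance matrix."""
    return n_features * (n_features + 1) // 2

def lda_covariance_params(n_features, n_classes):
    # LDA pools one covariance matrix across all classes.
    return covariance_params(n_features)

def qda_covariance_params(n_features, n_classes):
    # QDA estimates a separate covariance matrix for every class.
    return n_classes * covariance_params(n_features)

# Example: 10 predictors, 3 classes.
lda_count = lda_covariance_params(10, 3)
qda_count = qda_covariance_params(10, 3)
print(lda_count, qda_count)
```

With 10 predictors and 3 classes, QDA estimates three times as many covariance parameters as LDA, which is the cost of its extra flexibility.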
4. Which of the following is a non-parametric classification method?
A) k-Nearest Neighbors (k-NN)
B) Logistic regression
C) LDA
D) Linear SVM
Answer: A
Rationale: k-NN makes no distributional assumptions; it uses the training data directly.
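A minimal sketch of what "uses the training data directly" means: there is no fitting step, and prediction is a majority vote among the nearest stored points. The toy data and the function name `knn_predict` are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy training set: (features, label) pairs, invented for this example.
train = [
    ((1.0, 1.0), "a"), ((1.0, 2.0), "a"), ((2.0, 1.0), "a"),
    ((6.0, 6.0), "b"), ((7.0, 6.0), "b"), ((6.0, 7.0), "b"),
]

label_near_a = knn_predict(train, (1.5, 1.5))
label_near_b = knn_predict(train, (6.5, 6.5))
print(label_near_a, label_near_b)
```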
5. The “kernel trick” in SVM allows:
A) Efficient computation of dot products in a high-dimensional feature space without
explicitly transforming data
B) Only linear classification
C) Faster training on large datasets
D) Removal of support vectors
Answer: A
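Rationale: A kernel K(x, z) (e.g., RBF, polynomial) returns the dot product of the
transformed points without ever computing the transformation, so the data are mapped
implicitly and the decision boundary can be non-linear in the original space.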
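One way to see the trick is to compare a kernel value with the dot product in the explicit feature space. The sketch below assumes the homogeneous degree-2 polynomial kernel K(x, z) = (x·z)² on 2-D inputs, whose feature map is φ(x) = (x₁², √2·x₁x₂, x₂²); the sample points are arbitrary.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit degree-2 feature map for a 2-D input."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Same quantity, computed in the original 2-D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # transform first, then take the dot product
implicit = poly_kernel(x, z)     # kernel trick: no transformation needed
print(explicit, implicit)
```

Both routes give the same number, but the kernel never builds φ(x). For the RBF kernel the corresponding feature space is infinite-dimensional, so the implicit route is the only practical one.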
6. Which kernel is most commonly used for non-linear SVM?
A) Radial Basis Function (RBF) kernel
B) Linear kernel
C) Polynomial kernel of degree 2
D) Sigmoid kernel
Answer: A
Rationale: RBF (Gaussian) kernel is a general-purpose non-linear kernel; it has one
parameter γ.
7. The parameter γ (gamma) in an RBF kernel controls:
A) The influence of a single training example (small γ → large radius, smooth boundary;
large γ → local fit)
B) The regularization strength
C) The number of support vectors
D) The learning rate
Answer: A
Rationale: Large γ produces wiggly decision boundaries that can overfit; small γ produces
smoother boundaries.
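The effect of γ can be sketched with the RBF kernel itself, K(x, z) = exp(-γ‖x - z‖²). The points and γ values below are arbitrary illustrations.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF (Gaussian) kernel: similarity decays with squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x, z = (0.0, 0.0), (1.0, 1.0)  # squared distance = 2

similarity_small_gamma = rbf_kernel(x, z, gamma=0.01)   # far points still look similar
similarity_large_gamma = rbf_kernel(x, z, gamma=100.0)  # only very close points matter

print(similarity_small_gamma, similarity_large_gamma)
```

With small γ, each training example influences a wide region, giving a smooth boundary. With large γ, the influence vanishes almost immediately, so the boundary bends around individual points.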
8. In a decision tree, the Gini impurity at a node is maximized when:
A) The classes are evenly distributed (maximum uncertainty)
B) The node is pure (all one class)
C) The node has only one sample
D) There is no split
Answer: A
Rationale: Gini = 1 – Σ pₖ², where pₖ is the proportion of class k in the node; it is
highest when all pₖ are equal (e.g., 0.5/0.5 gives 0.5) and zero for a pure node.
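A short sketch of the formula, evaluated on a few made-up class distributions:

```python
def gini(proportions):
    """Gini impurity for a list of class proportions that sum to 1."""
    return 1.0 - sum(p * p for p in proportions)

gini_even = gini([0.5, 0.5])     # evenly split node
gini_skewed = gini([0.9, 0.1])   # mostly one class
gini_pure = gini([1.0, 0.0])     # pure node

print(gini_even, gini_skewed, gini_pure)
```

The evenly split node has the largest impurity of the three, and the pure node has none.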
9. The reduction in impurity (information gain) is used to decide splits in:
A) Decision trees (ID3, C4.5)
B) Linear regression
C) k-NN
D) SVM
Answer: A
Rationale: Information gain = entropy(parent) – weighted average entropy(children);
higher gain indicates better split.
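The information-gain formula can be sketched as follows. The class counts are invented for illustration, and entropy is measured in bits (log base 2).

```python
import math

def entropy(counts):
    """Shannon entropy (bits) of a node given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """entropy(parent) minus the size-weighted average entropy of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

parent = [5, 5]                                        # 5 positives, 5 negatives
gain_good = information_gain(parent, [[4, 1], [1, 4]]) # mostly separates classes
gain_perfect = information_gain(parent, [[5, 0], [0, 5]])
gain_useless = information_gain(parent, [[3, 3], [2, 2]])

print(gain_good, gain_perfect, gain_useless)
```

A split that leaves the class mix unchanged gains nothing, while a split into pure children recovers the full parent entropy.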
10. Pruning a decision tree aims to:
A) Reduce overfitting by removing branches that provide little predictive power
B) Increase the depth of the tree
C) Increase the number of leaves
D) Add more features
Answer: A
Rationale: Pruning simplifies the tree and improves generalization; cost-complexity
pruning is a common approach.
11. Random Forest reduces overfitting compared to a single decision tree by:
A) Bagging (bootstrap aggregating) and random feature selection
B) Using only one tree
C) Increasing tree depth
D) Removing all features
Answer: A
Rationale: Bootstrapping creates diverse trees; random feature selection reduces the
correlation between trees, so averaging their predictions lowers variance.
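The two sources of randomness can be sketched with the standard library. Everything here is illustrative: the row count, the feature names, and the choice of √(number of features) candidates per split (a common default for classification, assumed here rather than taken from the source).

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

n_rows = 1000
feature_names = ["f1", "f2", "f3", "f4", "f5", "f6", "f7", "f8", "f9"]  # hypothetical

# Bagging: each tree trains on a bootstrap sample (rows drawn with replacement).
bootstrap_rows = [rng.randrange(n_rows) for _ in range(n_rows)]
unique_fraction = len(set(bootstrap_rows)) / n_rows  # about 1 - 1/e for large n

# Random feature selection: each split considers only a random subset of features.
max_features = int(math.sqrt(len(feature_names)))
candidate_features = rng.sample(feature_names, max_features)

print(unique_fraction, candidate_features)
```

Because each tree sees a different sample of rows and each split a different subset of features, the trees make different errors, which is what makes averaging effective.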
12. Gradient Boosting builds trees sequentially; each new tree tries to:
A) Correct the errors (residuals) of the previous ensemble
B) Fit the original targets directly
C) Randomly sample features
D) Reduce the number of trees
Answer: A
Rationale: In gradient boosting, each tree fits the negative gradient of the loss function
with respect to the current predictions; for squared-error loss this is exactly the residual.
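A minimal sketch of the sequential idea, assuming squared-error loss (where the negative gradient is the residual y - prediction) and one-split regression stumps as the trees. The data, learning rate, and function names are hypothetical, and this is not how any particular library implements boosting.

```python
def fit_stump(xs, targets):
    """Fit a one-split regression stump that minimizes squared error."""
    best = None
    cut_points = sorted(set(xs))
    for lo, hi in zip(cut_points, cut_points[1:]):
        threshold = (lo + hi) / 2.0
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

def boost(xs, ys, n_rounds=10, learning_rate=0.5):
    """Return the training MSE after each boosting round."""
    preds = [sum(ys) / len(ys)] * len(ys)  # start from the mean prediction
    history = [mean_squared_error(ys, preds)]
    for _ in range(n_rounds):
        # Negative gradient of 0.5 * (y - pred)^2 is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)   # the new tree targets the current errors
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
        history.append(mean_squared_error(ys, preds))
    return history

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]
mse_history = boost(xs, ys)
print(mse_history[0], mse_history[-1])
```

Each round fits the residuals left by the ensemble so far, so the training error shrinks round by round. Fitting the original targets every time (option B) would just reproduce the same stump.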
13. Which boosting algorithm explicitly uses the “AdaBoost” approach with exponential
loss?
A) AdaBoost
B) Gradient Boosting Machine (GBM)