Exam (elaborations)

ISYE 6501 – Midterm 1 Examination 300 Questions | Georgia Tech 2026 | Grade A – Verified Solutions

Rating
-
Sold
-
Pages
65
Grade
A+
Uploaded on
06-05-2026
Written in
2025/2026

Ace your ISYE 6501 (Introduction to Analytics Modeling) midterm with this comprehensive study guide containing 300 practice questions, each with a verified answer and a detailed rationale explaining the underlying concepts. This resource covers every major topic tested in the Georgia Tech OMSA / OMSCS course, including regression, classification, clustering, dimensionality reduction, time series, optimization, and model validation.

What's inside?
Analytics Modeling Fundamentals: supervised vs. unsupervised learning, bias-variance tradeoff, overfitting/underfitting, cross-validation, holdout, k-fold CV
Regression Models: linear regression, OLS assumptions, heteroscedasticity, multicollinearity, R², adjusted R², polynomial terms, interactions, regularization (ridge, lasso, elastic net), logistic regression, odds ratios, confusion matrix, precision, recall, F1, ROC/AUC
Classification Models: k-NN, decision trees, random forest, boosting (AdaBoost, gradient boosting), SVM (kernel trick, support vectors, C parameter), Naïve Bayes, LDA, QDA
Clustering & Dimensionality Reduction: k-means (elbow method, silhouette), hierarchical clustering (dendrogram, linkage methods), DBSCAN, PCA (variance explained, loadings, scree plot), t-SNE, UMAP, MDS, NMF, autoencoders
Time Series Forecasting: trend, seasonality, naive forecast, moving average, exponential smoothing (SES, Holt, Holt-Winters), ARIMA (ACF, PACF, unit root, differencing), stationarity, Box-Jenkins, forecast evaluation (MAPE, sMAPE, MASE), Ljung-Box test
Optimization & Linear Programming: LP formulation, feasible region, simplex method, duality (shadow prices), integer programming (IP), branch & bound, transportation problem, assignment problem (Hungarian algorithm), knapsack, sensitivity analysis, reduced cost
Model Selection & Validation: AIC, BIC, Mallows' Cp, cross-validation variants, information criteria, learning curves, Occam's razor, bootstrap, one-standard-error rule

Why this resource works: Every answer is directly supported by a rationale, so you understand the "why" behind the correct choice rather than just memorizing answers. It covers all topics from the ISYE 6501 syllabus (Georgia Tech OMSA) and common analytics modeling exam blueprints. It is suited to self-testing, spaced repetition, or last-minute review before your midterm or final.

Ideal for: Georgia Tech OMSA / OMSCS students taking ISYE 6501, analytics modeling students, data science professionals refreshing core concepts, and anyone preparing for a comprehensive exam in regression, classification, clustering, time series, or optimization.

Institution
ISYE 6501
Course
ISYE 6501

Content preview


SECTION 1: INTRODUCTION & MODELING FRAMEWORK
(Questions 1–20)
1. Which of the following best defines “analytics modeling”?
A. The process of storing large amounts of data
B. The use of mathematical and statistical methods to extract insights and support
decision-making from data
C. The design of efficient database schemas
D. The creation of data visualization dashboards
Rationale: Analytics modeling focuses on building models (statistical, machine learning,
optimization) to understand patterns, predict outcomes, or prescribe actions. Storage,
schemas, and dashboards are related but not definitions of modeling.

2. In the analytics modeling framework, what is the first step after defining the business
problem?
A. Build a complex model immediately
B. Collect and prepare relevant data
C. Deploy the model into production
D. Validate model assumptions
Rationale: Once the problem is defined, the next critical step is obtaining and cleaning
the data that will be used to build the model. Data preparation is usually the most
time-consuming part.

3. A model that is too complex for the amount of available data is most likely to suffer
from:
A. Underfitting
B. Overfitting
C. Bias-variance tradeoff balance
D. Cold start problem
Rationale: Overfitting occurs when a model learns noise and random fluctuations in the
training data rather than the underlying pattern. This happens when model complexity is
high relative to data size.
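
A small illustration of this effect, using made-up numbers rather than anything from the course: a 1-nearest-neighbor "memorizer" reproduces the training data exactly, while a straight line does not, yet the line does better on new points. The data values and the function names (`fit_line`, `fit_memorizer`) are assumptions chosen for this sketch.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns a prediction function."""
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x

def fit_memorizer(xs, ys):
    """1-nearest-neighbor: predict the y of the closest training x."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

# Made-up data: y is roughly 2x, with more noise in the training set.
x_train = [0, 1, 2, 3, 4, 5]
y_train = [0.8, 1.1, 4.7, 5.2, 8.9, 9.3]
x_test = [0.4, 1.4, 2.4, 3.4, 4.4]
y_test = [0.9, 2.7, 4.9, 6.7, 8.9]

results = {}
for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    model = fit(x_train, y_train)
    results[name] = {
        "train_mse": mse(y_train, [model(x) for x in x_train]),
        "test_mse": mse(y_test, [model(x) for x in x_test]),
    }

for name, scores in results.items():
    print(name, scores)
```

On this toy data the memorizer has zero training error but a larger test error than the line, which is the signature of overfitting: it has fit the noise in the six training points.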

4. Which of the following tasks is an example of supervised learning?
A. Customer segmentation using k-means
B. Predicting house prices using historical sales data
C. Dimensionality reduction with PCA
D. Association rule mining for market basket analysis
Rationale: Supervised learning requires a labeled output (target) variable. House price
prediction uses past sales, whose known prices serve as the labels. Clustering, PCA, and
association rules are unsupervised.

5. Which of the following tasks is an example of unsupervised learning?
A. Classification of emails as spam or not spam
B. Clustering customers into groups based on purchasing behavior
C. Predicting temperature from weather features
D. Estimating the probability of loan default
Rationale: Unsupervised learning finds hidden structures without labeled outcomes.
Clustering is a classic unsupervised task.

6. The “bias-variance tradeoff” implies that:
A. Increasing model complexity always reduces test error
B. As model complexity increases, bias typically decreases and variance increases
C. Bias and variance move in the same direction
D. Simple models always have high variance
Rationale: The tradeoff: simple models (high bias, low variance) underfit; complex
models (low bias, high variance) overfit. The goal is to find a balance that minimizes
total error.
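
For squared-error loss the tradeoff can be written out explicitly. The notation below is introduced here for illustration and is not taken from the question: assume $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, where $\varepsilon$ is independent of the training set, and let $\hat{f}$ be the model fitted on a random training set. The expected test error at a point $x$ then splits into three terms:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

More flexible models tend to shrink the first term and grow the second; the third term does not depend on the model.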

7. Holdout validation splits the data into:
A. Only training and test sets
B. Training, validation, and test sets (often training+validation vs test)
C. Only training set
D. K equal sized folds
Rationale: Holdout validation makes a single split of the data, either into training and
test sets (e.g., 70% training, 30% testing) or into training, validation, and test sets.
K-fold cross-validation instead rotates through multiple folds.
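
A minimal sketch of a single holdout split in plain Python. The function name `holdout_split`, the 30% default, and the seed are assumptions made for this example, not part of the question.

```python
import random

def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle once, then cut the data into one training part and one test part."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))
train, test = holdout_split(data, test_fraction=0.3, seed=42)
print(len(train), len(test))  # 7 3
```

A three-way split (training, validation, test) can be obtained by applying the same function again to the training part.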

8. In k-fold cross-validation, what is the purpose of the validation folds?
A. To train the final model
B. To estimate the model’s performance on unseen data
C. To increase the training set size
D. To select features
Rationale: Each fold is used once as validation to compute an out-of-sample error
estimate; the model is trained on the other k-1 folds. This provides a more robust
performance estimate than a single holdout.
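
The fold rotation described above can be sketched as an index generator. This is an illustrative version with a hypothetical name (`k_fold_indices`); it assumes the rows have already been shuffled, and it omits the model fitting itself.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is in validation exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

The cross-validated error is then the average of the k validation errors, one per pair.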

9. Which statement about the confusion matrix is correct?
A. It is only used for regression problems
B. It shows the counts of true positives, false positives, true negatives, and false
negatives
C. It cannot be used for multi-class classification
D. It does not depend on the chosen threshold
Rationale: A confusion matrix is a table for classification. For binary classification it
contains TP, FP, TN, FN. Multi-class extensions exist. It depends on the decision
threshold.
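
To make the threshold dependence concrete, the sketch below counts the four cells for the same scores at two different cutoffs. The labels, scores, and the function name `confusion_counts` are made up for illustration.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Count TP, FP, TN, FN when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 0:
            tn += 1
        else:
            fn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]

at_half = confusion_counts(y_true, scores, threshold=0.5)
at_low = confusion_counts(y_true, scores, threshold=0.35)
print(at_half)  # {'TP': 2, 'FP': 1, 'TN': 2, 'FN': 1}
print(at_low)   # {'TP': 3, 'FP': 1, 'TN': 2, 'FN': 0}
```

Lowering the cutoff from 0.5 to 0.35 turns one false negative into a true positive here; the model's scores are unchanged, only the decision rule moved.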

10. Sensitivity (recall) is defined as:
A. TP / (TP + FP)
B. TP / (TP + FN)
C. TN / (TN + FP)
D. (TP + TN) / (TP + TN + FP + FN)
Rationale: Sensitivity measures the proportion of actual positives correctly identified.
Formula: TP / (TP + FN). Precision is TP/(TP+FP); specificity is TN/(TN+FP).

11. Precision is defined as:
A. TP / (TP + FP)
B. TP / (TP + FN)
C. TN / (TN + FP)
D. (TP + TN) / total
Rationale: Precision answers: "Of all predicted positives, how many were actually
positive?" High precision means few of the predicted positives are false positives.

12. The F1 score is the harmonic mean of:
A. Accuracy and recall
B. Precision and recall
C. Sensitivity and specificity
D. Precision and accuracy
Rationale: F1 = 2 × (precision × recall) / (precision + recall). It balances precision and
recall, useful for imbalanced classes.
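
The definitions from questions 10 to 12 can be computed directly from the four confusion-matrix counts. The counts below are invented for the example, and returning 0.0 for an empty denominator is a convention assumed here, not something the questions specify.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return recall, precision, specificity, accuracy, and F1 from the four counts."""
    def ratio(num, den):
        # Assumed convention: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "recall": recall,                      # TP / (TP + FN)
        "precision": precision,                # TP / (TP + FP)
        "specificity": ratio(tn, tn + fp),     # TN / (TN + FP)
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "f1": ratio(2 * precision * recall, precision + recall),
    }

metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts precision is 40/50 and recall is 40/45, and F1 falls between the two, closer to the smaller value, as a harmonic mean does.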

13. Underfitting is characterized by:
A. High variance on test data
B. High bias and poor performance on both training and test data
C. Zero training error
D. Complex decision boundaries
Rationale: Underfitting occurs when a model is too simple to capture the pattern,
leading to high bias and poor fit on training data, which extends to test data.

14. The ROC curve plots:
A. Precision vs recall
B. True positive rate (TPR) vs false positive rate (FPR)
C. Accuracy vs model complexity
D. Sensitivity vs specificity
Rationale: The ROC (Receiver Operating Characteristic) curve shows TPR (sensitivity) on
the y-axis and FPR (1-specificity) on the x-axis. The AUC (area under the curve)
summarizes performance in a single number.
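
One way to trace the curve is to sweep the decision threshold across the observed scores and record (FPR, TPR) at each step. The sketch below uses invented labels and scores, hypothetical function names, and assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the threshold over the scores."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(y_true, scores)
area = auc_trapezoid(curve)
print(curve)
print(round(area, 4))  # 0.8889
```

The curve runs from (0, 0) to (1, 1); here one negative is scored above one positive, so the area is 8/9 rather than 1.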

15. A model with AUC = 0.5 indicates:
A. Perfect classification
B. Random guessing (no discriminative power)
C. Slightly better than random
D. The model always predicts the majority class
Rationale: An AUC of 0.5 means the classifier’s performance is equivalent to flipping a coin.
As a rough rule of thumb, values above 0.7 are often considered acceptable; 1.0 is perfect.
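
AUC can also be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that pairwise reading on made-up scores; the function name `auc_pairwise` is an assumption for this example.

```python
def auc_pairwise(y_true, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

y_true = [1, 0, 1, 0, 1, 0]
constant = auc_pairwise(y_true, [0.5] * 6)                      # no ranking information
perfect = auc_pairwise(y_true, [0.9, 0.1, 0.8, 0.2, 0.7, 0.3])  # every positive on top
inverted = auc_pairwise(y_true, [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]) # every positive at bottom
print(constant, perfect, inverted)  # 0.5 1.0 0.0
```

Scores that carry no ranking information give 0.5, and an AUC well below 0.5 indicates a ranking that is informative but inverted.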

16. Which is a common way to handle missing numeric data?
A. Delete any row with a missing value regardless of context
B. Imputation using mean, median, or model-based methods
C. Replace missing values with zero
D. Ignore missing values
Rationale: Imputation is a standard technique to retain data while handling
missingness. Mean/median imputation is simple; more advanced methods (k-NN,
regression) can be used.
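
A minimal sketch of mean and median imputation for a single numeric column, using only the standard library. Missing values are represented as `None`, and the `ages` data and the `impute` helper are made up for illustration.

```python
from statistics import mean, median

def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    else:
        raise ValueError("strategy must be 'mean' or 'median'")
    return [fill if v is None else v for v in values]

ages = [22, None, 30, 26, None, 90]
mean_filled = impute(ages, "mean")      # fills with 42
median_filled = impute(ages, "median")  # fills with 28.0
print(mean_filled)
print(median_filled)
```

The single large value (90) pulls the mean fill up to 42 while the median fill stays at 28, which is one reason the median is often preferred for skewed data.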

17. Outliers in a dataset can be detected using:
A. Correlation matrix
B. Boxplots (IQR method) or z-scores
C. Linear regression coefficients
D. Confusion matrix
Rationale: Outliers are often identified as values more than 1.5×IQR below the first
quartile or above the third quartile, or as values with |z-score| > 3. Boxplots show
such points visually.
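
Both rules can be sketched with the standard library. The data are invented, the function names are hypothetical, and the quartile method (`method="inclusive"`) and the use of the population standard deviation are choices assumed for this example; other conventions give slightly different fences.

```python
from statistics import mean, pstdev, quantiles

def iqr_outliers(values, k=1.5):
    """Flag values more than k*IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, cutoff=3.0):
    """Flag values whose absolute z-score exceeds the cutoff."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing can be an outlier
    return [v for v in values if abs(v - mu) / sigma > cutoff]

data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 50]
print(iqr_outliers(data))     # [50]
print(zscore_outliers(data))  # [50]
```

Note that the extreme value itself inflates the mean and standard deviation, so in small samples the z-score rule is less sensitive than the IQR rule, which relies on quartiles.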

18. Normalization (min-max scaling) transforms features to the range:
A. [-1, 1]
B. Typically [0, 1]
C. (-∞, ∞)

Contains
Questions and answers

Seller: PremiumExamBank (Chamberlain College of Nursing)