GEORGIA TECH ISYE 6501 END OF COURSE
QUIZ SCRIPT 2026 FULL ANSWERS
GRADED A+
⩥ Random Forest Pros/Cons. Answer: Pros: Tends to give better
estimates overall because, while each tree might overfit in one
place or another, the trees don't necessarily overfit in the same way.
Averaging over all trees reduces the effect of that overfitting.
Cons: Harder to explain the output of these models.
Doesn't help explain how the variables interact, or how a certain sequence
of branches is helpful or meaningful, the way a single tree can.
The best we can do is some aggregate measure rather than specific insights.
Can't give us a single regression or classification model.
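A minimal sketch of the aggregation step only, assuming the individual trees have already been trained. The function names and the toy predictions are hypothetical, chosen so that the trees err in different directions.

```python
from collections import Counter
from statistics import mean


def forest_regress(tree_predictions):
    """Regression forest output: the average of the trees' predictions."""
    return mean(tree_predictions)


def forest_classify(tree_votes):
    """Classification forest output: the most common class among the trees."""
    return Counter(tree_votes).most_common(1)[0][0]


# Hypothetical example: four trees that each miss the true value,
# but in different directions.
true_value = 10.0
tree_predictions = [12.0, 8.0, 11.0, 9.0]

individual_errors = [abs(p - true_value) for p in tree_predictions]
forest_error = abs(forest_regress(tree_predictions) - true_value)

# Hypothetical class votes from five trees.
majority_class = forest_classify(["yes", "no", "yes", "yes", "no"])
```

In this toy case every tree is off by 1 or 2, while the average is exact. Real forests rarely cancel errors this cleanly, but the mechanism is the same.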
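⩥ Logistic Regression. Answer: a regression model that relates a set of
explanatory variables to the probability of a positive (versus negative)
response. The probability is a nonlinear (logistic) function of a linear
combination of the variables, so it always lies between 0 and 1.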
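A minimal sketch of how a fitted logistic regression model turns explanatory variables into a probability. The coefficient values below are hypothetical, and fitting them is not shown.

```python
import math


def predict_probability(coefficients, intercept, x):
    """Return P(positive response) for one observation x.

    z is the linear combination of the explanatory variables;
    the logistic function maps z to a value strictly between 0 and 1.
    """
    z = intercept + sum(a * xi for a, xi in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted model with two explanatory variables.
coefficients = [1.0, -2.0]
intercept = 0.0

p_neutral = predict_probability(coefficients, intercept, [2.0, 1.0])  # z = 0
p_high = predict_probability(coefficients, intercept, [4.0, 1.0])     # z = 2
p_low = predict_probability(coefficients, intercept, [0.0, 1.0])      # z = -2
```

A linear combination of zero gives a probability of exactly 0.5. Larger values push the probability toward 1 and smaller values push it toward 0.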
⩥ AUC (Area under the ROC Curve). Answer: The Area Under the ROC
curve is the probability that a classifier will be more confident that a
randomly chosen positive example is actually positive than that a
randomly chosen negative example is positive.
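The probability reading of AUC can be checked directly by comparing every positive example with every negative example. This is a sketch under two assumptions: the scores are hypothetical, and a tied pair counts as half a win, which is a common convention.

```python
def auc_pairwise(positive_scores, negative_scores):
    """Fraction of (positive, negative) pairs where the positive example
    receives the higher score. Ties count as half a win (an assumption)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores (higher means "more likely positive").
positive_scores = [0.9, 0.8, 0.4]
negative_scores = [0.7, 0.3, 0.2]

auc = auc_pairwise(positive_scores, negative_scores)  # 8 of 9 pairs ranked correctly
```

The only misranked pair is the positive example scored 0.4 against the negative example scored 0.7. The double loop is quadratic in the number of examples, so it suits small illustrations rather than large data sets.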
⩥ Confusion matrices. Answer: A summary of how well a
classification model works, breaking down its predictions into true
positives (TP), true negatives (TN), false positives (FP), and false
negatives (FN).
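A minimal sketch of tallying the four confusion matrix cells from actual and predicted labels, assuming 1 marks the positive class and 0 the negative class. The label lists are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["TP"] += 1   # positive, correctly predicted positive
        elif a == 0 and p == 0:
            counts["TN"] += 1   # negative, correctly predicted negative
        elif a == 0 and p == 1:
            counts["FP"] += 1   # negative, wrongly predicted positive
        else:
            counts["FN"] += 1   # positive, wrongly predicted negative
    return counts


# Hypothetical labels for eight observations.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
```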
⩥ Sensitivity. Answer: Sensitivity = TP / (TP + FN)
This is the fraction of category members that are correctly classified
(the true positive rate).
⩥ Specificity. Answer: Specificity = TN / (TN + FP)
This is the fraction of non-category members that are correctly classified
(the true negative rate).
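A short worked example of both formulas, using hypothetical confusion matrix counts.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that the model classifies as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that the model classifies as negative."""
    return tn / (tn + fp)


# Hypothetical counts: 50 actual positives and 50 actual negatives.
tp, fn = 40, 10
tn, fp = 45, 5

sens = sensitivity(tp, fn)  # 40 / 50
spec = specificity(tn, fp)  # 45 / 50
```

Here the model catches 80% of the category members and correctly rejects 90% of the non-members.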
⩥ Forward Selection. Answer: Start with a model that has no factors. At
each step we find the best new factor to add to the model
and put it in as long as it's a good enough improvement. When there's no
factor that's good enough to add, or if we've added as many factors as we
want to have, we stop.
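The procedure above can be sketched as a greedy loop. This is an illustration under stated assumptions: `score` is a hypothetical callable where higher is better (for example adjusted R-squared from refitting the model), `min_improvement` stands in for "good enough", and the toy scoring function simply adds up made-up contributions.

```python
def forward_selection(candidates, score, min_improvement=0.01, max_factors=None):
    """Greedy forward selection.

    candidates: list of factor names
    score: callable taking a list of factors, higher is better (assumption)
    """
    selected = []
    remaining = list(candidates)
    best_score = score(selected)
    while remaining and (max_factors is None or len(selected) < max_factors):
        # Try adding each remaining factor to the current model.
        trial_scores = {f: score(selected + [f]) for f in remaining}
        best_factor = max(trial_scores, key=trial_scores.get)
        # Stop when the best available factor is not a good enough improvement.
        if trial_scores[best_factor] - best_score < min_improvement:
            break
        selected.append(best_factor)
        remaining.remove(best_factor)
        best_score = trial_scores[best_factor]
    return selected


# Hypothetical scoring: each factor adds a fixed amount to model quality.
contribution = {"x1": 0.50, "x2": 0.20, "x3": 0.005}


def toy_score(factors):
    return sum(contribution[f] for f in factors)


chosen = forward_selection(["x1", "x2", "x3"], toy_score)
chosen_capped = forward_selection(["x1", "x2", "x3"], toy_score, max_factors=1)
```

With these made-up numbers, x1 and x2 are added and x3 is rejected because it improves the score by less than the threshold. In practice the score would come from refitting the model at every step, and the threshold might be a p-value or an information criterion.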
⩥ Backward Elimination. Answer: We start with a model that includes
all factors and at each step, we find the worst factor and remove it from
the model. We continue until there's no factor bad enough to remove,
and the model doesn't have any more factors than we want.
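Backward elimination can be sketched the same way, in reverse. As an assumption, `score` is a hypothetical callable where higher is better, and a factor counts as "bad enough to remove" when dropping it costs at most `max_loss`. The toy scoring function adds up made-up contributions.

```python
def backward_elimination(factors, score, max_loss=0.01, max_factors=None):
    """Greedy backward elimination.

    factors: list of factor names to start from
    score: callable taking a list of factors, higher is better (assumption)
    """
    selected = list(factors)
    current_score = score(selected)
    while selected:
        # Score the model with each factor removed in turn.
        trial_scores = {f: score([g for g in selected if g != f]) for f in selected}
        # The worst factor is the one whose removal hurts the score least.
        worst_factor = max(trial_scores, key=trial_scores.get)
        loss = current_score - trial_scores[worst_factor]
        too_many = max_factors is not None and len(selected) > max_factors
        # Stop only when no factor is bad enough AND the model is small enough.
        if loss > max_loss and not too_many:
            break
        selected.remove(worst_factor)
        current_score = trial_scores[worst_factor]
    return selected


# Hypothetical scoring: each factor adds a fixed amount to model quality.
contribution = {"x1": 0.50, "x2": 0.20, "x3": 0.005}


def toy_score(factors):
    return sum(contribution[f] for f in factors)


kept = backward_elimination(["x1", "x2", "x3"], toy_score)
kept_capped = backward_elimination(["x1", "x2", "x3"], toy_score, max_factors=1)
```

With these made-up numbers, x3 is dropped because removing it barely changes the score, and x1 and x2 stay. Capping the model at one factor forces x2 out as well.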