DATA MINING AND STAT LEARN EXAM PREP
2026 TESTED QUESTIONS WITH RATIONALE
◉ Causation. Answer: One thing causes another thing
◉ Correlation. Answer: Two things tend to happen or not happen
together
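◉ P-value. Answer: Probability of seeing an estimate at least as
extreme as the one observed if the true coefficient were 0. It is
not the probability that the coefficient = 0.
◉ Confidence Interval (CI). Answer: Range of values likely to
contain the true coefficient at a stated confidence level (e.g. 95%)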
◉ T-Statistic. Answer: The coefficient divided by its standard error
◉ R-squared. Answer: Proportion of the variability in the response
that the model accounts for
◉ Adjusted R-Squared. Answer: R-squared adjusted for the number
of attributes, so that adding unhelpful attributes lowers it
◉ Overfitting. Answer: A modeling error where a model fits the
training data too closely, including its noise, and so performs
poorly on new data
◉ Heteroskedasticity. Answer: Unequal variance of the errors
across observations
◉ Trend. Answer: Increase or decrease of data over time
◉ Explainability/Interpretability. Answer: Helps us toward an
understanding of 'why'
◉ AUC (Area Under Curve). Answer: Probability that the model
scores a random 'yes' point higher than a random 'no' point
◉ True positive (TP). Answer: Point in the category, correctly
classified
◉ False positive (FP). Answer: Point not in category, model says it is
◉ True negative (TN). Answer: Point not in category, correctly
classified
◉ False negative (FN). Answer: Point in category, model says it is not
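Several of the definitions above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not a reference implementation: the function names, the 0.5 cutoff, and the toy labels and scores are all made up for illustration, and the adjusted R-squared formula used is the common textbook form 1 - (1 - R²)(n - 1)/(n - p - 1), assumed here since the cards do not state one.

```python
# Illustrative helpers; names and data are hypothetical, not from the exam material.

def t_statistic(coef, std_err):
    """T-statistic: the coefficient divided by its standard error."""
    return coef / std_err


def adjusted_r_squared(r2, n_obs, n_attrs):
    """Textbook adjustment (assumed form): penalizes R-squared for extra attributes."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_attrs - 1)


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for labels coded 1 ('yes') and 0 ('no')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # in category, model says yes
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not in category, model says yes
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not in category, model says no
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # in category, model says no
    return tp, fp, tn, fn


def auc_by_pairs(y_true, scores):
    """AUC as the share of ('yes', 'no') pairs where the 'yes' point scores higher.

    Ties count as half a win. Requires at least one 'yes' and one 'no' point.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): three 'yes' points and three 'no' points.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
auc = auc_by_pairs(y_true, scores)

print("TP, FP, TN, FN:", tp, fp, tn, fn)
print("AUC:", round(auc, 3))
print("t-statistic:", t_statistic(2.0, 0.5))
print("adjusted R-squared:", round(adjusted_r_squared(0.8, 21, 4), 3))
```

On this toy data the 'yes' point scored 0.4 falls below the cutoff (a false negative) and the 'no' point scored 0.7 falls above it (a false positive). The same pair is the only one of the nine 'yes'/'no' pairs that the model ranks in the wrong order, which is why the AUC comes out as 8/9.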