ISYE 6501 MIDTERM EXAM
QUESTIONS WITH COMPLETE
SOLUTIONS
Prisoner's dilemma - Answer-A situation in game theory where each participant would
benefit if all participants act in a certain way, but each participant individually has
incentive to not act that way.
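The dilemma above can be sketched with a small payoff table. The numbers below are hypothetical (years in prison, so lower is better) and are chosen only to illustrate the structure; `PAYOFF`, `ACTIONS`, and `best_response` are illustrative names, not part of the course material.

```python
# Hypothetical payoffs, for illustration only.
# PAYOFF[(my_action, other_action)] = my sentence in years (lower is better).
PAYOFF = {
    ("cooperate", "cooperate"): 1,
    ("cooperate", "defect"): 10,
    ("defect", "cooperate"): 0,
    ("defect", "defect"): 5,
}
ACTIONS = ("cooperate", "defect")


def best_response(other_action):
    """Return the action that minimizes my sentence given the other's action."""
    return min(ACTIONS, key=lambda mine: PAYOFF[(mine, other_action)])


# Whatever the other participant does, defecting is individually better...
individual_choice = {other: best_response(other) for other in ACTIONS}
# ...yet both participants are better off if both cooperate.
both_cooperate = PAYOFF[("cooperate", "cooperate")]
both_defect = PAYOFF[("defect", "defect")]
```

With these assumed payoffs, each participant's best response is always to defect, even though mutual cooperation gives both a shorter sentence than mutual defection.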
Pruning - Answer-Removing a branch from a tree.
Pseudo-R-squared/Pseudo-R2 - Answer-Measure similar to R2 used for nonlinear
regression models where R2 cannot be calculated.
Pure strategy - Answer-A strategy where a participant's action is deterministic (known
with probability 1) - for example, in "rock, paper, scissors", someone who always
chooses "rock" is using a pure strategy.
p-value (hypothesis testing) - Answer-Probability that results at least as extreme as
those in the data would be observed if the null hypothesis is true.
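As a rough illustration of the definition above, the sketch below computes a one-sided p-value for a coin-flip experiment. The scenario (null hypothesis: the coin is fair) and the function name `binomial_p_value` are assumptions made for this example.

```python
from math import comb


def binomial_p_value(n, k, p=0.5):
    """One-sided p-value: P(X >= k) when X ~ Binomial(n, p) under the null."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Null hypothesis: the coin is fair (p = 0.5).
# Observed data: 9 heads in 10 flips.
p_val = binomial_p_value(10, 9)
```

Here "at least as extreme" means 9 or 10 heads, so the p-value is (10 + 1) / 1024, a little over 1%.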
p-value (regression) - Answer-Probability that results at least as extreme as those in the
data would be observed if the coefficient of a variable is zero.
p-value fishing - Answer-Testing many different hypotheses hoping to find one with a
low p-value. (Bad practice.)
Q-Q plot (quantile-quantile plot) - Answer-A plot comparing the quantiles of two data sets,
or one data set and a distribution, to see whether they might have a common
distribution.
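The points of a Q-Q plot for two data sets could be computed along these lines. This is a minimal sketch using the standard library's `statistics.quantiles`; the helper name `qq_pairs` and the choice of cut points are assumptions, and the plotting itself is left out.

```python
import statistics


def qq_pairs(data_a, data_b, n=10):
    """Pair up matching quantiles of two data sets (n - 1 cut points each)."""
    qa = statistics.quantiles(data_a, n=n)
    qb = statistics.quantiles(data_b, n=n)
    return list(zip(qa, qb))


a = list(range(1, 101))
b = [x + 5 for x in a]          # same shape as a, shifted by 5
pairs = qq_pairs(a, b, n=4)     # quartile cut points of each data set
```

If the two data sets share a distribution, the pairs fall close to the line y = x; a constant shift like the one above shows up as points on a line parallel to it.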
Quantitative data - Answer-Data that describes numerical amounts of something - for
example, height and weight.
Queue - Answer-A line of people, things, etc. waiting to go through or be
processed/served by a resource.
Queuing - Answer-The mathematical study of queues.
Random effects - Answer-Patterns that appear to occur in a subset of data, but only
exist due to random variability in the data and are not part of the system.
Random forest - Answer-Machine learning model that creates many different trees and
returns their mean output. Can be used with classification trees, regression trees,
decision trees.
Real effects - Answer-Actual patterns in the system being modeled. Ideally, good
models will reveal real effects.
Recall - Answer-Fraction of data points in a certain category that are correctly classified
by a model; equal to TP/(TP+FN); also called sensitivity, hit rate, and true positive rate.
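A minimal sketch of the calculation, taking recall as true positives divided by the sum of true positives and false negatives. The function name `recall` and the 0/1 label encoding are assumptions for this example.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the model labeled positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no points in the positive category")
    return tp / (tp + fn)


actual = [1, 1, 1, 0]
predicted = [1, 0, 1, 0]
value = recall(actual, predicted)  # 2 of the 3 actual positives are found
```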
Receiver operating characteristic curve (ROC curve) - Answer-Graph that plots the true
positive rate against the false positive rate for different classification cutoff thresholds.
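The points of an ROC curve could be computed as sketched below. The function name `roc_points`, the 0/1 labels, and the rule "score >= threshold means predicted positive" are assumptions for this example.

```python
def roc_points(y_true, scores, thresholds):
    """Return (false positive rate, true positive rate) for each threshold."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= t)
        points.append((fp / neg, tp / pos))
    return points


labels = [1, 1, 0, 0]
model_scores = [0.9, 0.6, 0.7, 0.2]
curve = roc_points(labels, model_scores, [0.0, 0.5, 0.8, 1.0])
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1): more points are classified as positive, so both rates can only increase.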
Rectilinear distance - Answer-The sum of the lengths in each dimension between two
points. Also called the Manhattan or 1-norm distance.
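For instance, the definition above translates directly into a sum of absolute coordinate differences. The function name `rectilinear_distance` is an assumption for this sketch.

```python
def rectilinear_distance(a, b):
    """Sum of absolute differences in each dimension (Manhattan / 1-norm)."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of dimensions")
    return sum(abs(x - y) for x, y in zip(a, b))


d = rectilinear_distance((1, 2), (4, 6))  # |1 - 4| + |2 - 6|
```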
Regression - Answer-Statistical model that describes relationships between variables,
and/or predicts future values of a response.
Regression splines - Answer-Regression model where different functions are used for
different ranges of the data. Also called spline regression.
Regression tree - Answer-Tree-based method for regression. After branching to split the
data, each subset is analyzed with its own regression model.
Regularization - Answer-Addition of term(s) to the model to reduce model complexity or
overfitting. For example, adding a penalty to the objective function in regression can
help reduce overfitting (see ridge regression).
Replication - Answer-Running a stochastic simulation multiple times to sample the
distribution of possible simulation results. "A replication" also refers to a single one of
many runs of the simulation.
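A small sketch of replication, assuming a toy simulation in which each run adds up exponentially distributed service times for a fixed number of customers. The model, its parameters, and the names `one_replication` and `replicate` are all hypothetical.

```python
import random
import statistics


def one_replication(seed, customers=10, mean_service=2.0):
    """A single run: total service time for a fixed number of customers."""
    rng = random.Random(seed)  # separate seed per run keeps runs reproducible
    return sum(rng.expovariate(1.0 / mean_service) for _ in range(customers))


def replicate(n_runs, base_seed=0):
    """Run the simulation n_runs times, one seed per replication."""
    return [one_replication(base_seed + i) for i in range(n_runs)]


results = replicate(100)
mean_total = statistics.mean(results)   # estimate of the expected total time
spread = statistics.stdev(results)      # variability across replications
```

Each entry of `results` is one replication; together they sample the distribution of possible outcomes rather than giving a single number.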
Resource - Answer-In ARENA, the "doers" - for example, a call center worker at a
queue.
Response - Answer-A variable of interest that a model tries to estimate or predict.
Response surface - Answer-Sequential experimentation strategy to understand the
relationship between response and input factors, and/or optimize the response.
Ridge regression - Answer-Method of regularization by limiting the sum of the squares
of the coefficients. Will reduce the magnitude of coefficients, not the number of variables
chosen.
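To see the shrinking effect, consider the simplest case: one predictor and no intercept. The sketch below uses the penalized form (squared error plus lam times the squared coefficient), which is a common way of expressing the limit on the sum of squares; the name `ridge_slope` and the data are assumptions for this example.

```python
def ridge_slope(x, y, lam):
    """Minimize sum((y_i - b * x_i)**2) + lam * b**2 for a single coefficient b.

    Setting the derivative to zero gives b = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


x = [1, 2, 3]
y = [2, 4, 6]
no_penalty = ridge_slope(x, y, 0)      # ordinary least squares slope
with_penalty = ridge_slope(x, y, 14)   # coefficient shrinks toward zero
```

A larger penalty gives a smaller coefficient, but the coefficient never becomes exactly zero for a finite penalty, which matches the point that ridge regression does not reduce the number of variables.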
Robust solution - Answer-A solution whose worst-case outcome over all possible
scenarios is least bad.
ROC curve - Answer-Receiver operating characteristic curve.
Root - Answer-The first, complete data set in a tree model.
R-squared/R2 - Answer-Measure of linear regression model quality, the fraction of
variance in the response that is explained by the model.
Coefficient of Determination - Answer-R-squared value. The fraction of variance in the
response that is explained by the model.
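The fraction of variance explained can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch follows; the function name `r_squared` is an assumption.

```python
def r_squared(y, y_hat):
    """1 - SS_residual / SS_total for observed y and model predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    if ss_tot == 0:
        raise ValueError("response has no variance")
    return 1 - ss_res / ss_tot


observed = [1, 2, 3]
perfect = r_squared(observed, [1, 2, 3])    # model explains all the variance
mean_only = r_squared(observed, [2, 2, 2])  # model predicts the mean every time
```

A model that always predicts the mean of the response explains none of the variance, so its value is 0; a perfect fit gives 1.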
Scaling - Answer-Shrinking or expanding, and moving, the range of data to fit exactly
into a specific interval.
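One common way to do this is min-max scaling, sketched below under the assumption that the data are not all equal. The function name `scale_to_interval` and its default interval [0, 1] are choices made for this example.

```python
def scale_to_interval(values, low=0.0, high=1.0):
    """Linearly map values so the minimum lands on low and the maximum on high."""
    vmin, vmax = min(values), max(values)
    if vmin == vmax:
        raise ValueError("cannot scale data with zero range")
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


unit = scale_to_interval([2, 4, 6])               # fits into [0, 1]
symmetric = scale_to_interval([2, 4, 6], -1, 1)   # fits into [-1, 1]
```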
Scenario - Answer-Specific case/instance of an uncertain outcome; one approach to
stochastic optimization is to optimize over a number of scenarios simultaneously.
Seasonality/cycles - Answer-Repeating pattern in data values over time, often at
consistent intervals.
Seasonality length/cycle length - Answer-Fixed time period at which
cycles/seasonalities repeat themselves.
Sensitivity - Answer-Fraction of data points in a certain category that are correctly
classified by a model; equal to TP/(TP+FN); also called the true positive rate, hit rate, and
recall.
Sequential game - Answer-A game in which participants choose their actions one after
another, so participants who choose later have knowledge of the earlier actions.
Service rate - Answer-Rate at which entities are processed.
Shortest path problem - Answer-Network optimization model that finds the shortest
route in a network from one specific node to another.
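One well-known way to solve this problem when all arc lengths are nonnegative is Dijkstra's algorithm. The sketch below is one possible implementation, not necessarily the method used in the course; the graph, the node names, and the function name `shortest_path_length` are hypothetical.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; graph[u][v] is the nonnegative length of arc u -> v.

    Returns the shortest distance from source to target, or None if the
    target cannot be reached.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return None


network = {"A": {"B": 4, "C": 1}, "C": {"B": 2}, "B": {"D": 5}}
length = shortest_path_length(network, "A", "D")  # route A -> C -> B -> D
```

In this example the direct arc A -> B (length 4) is longer than going through C (1 + 2), so the shortest route to D passes through C.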
Simplicity (of a model) - Answer-Having fewer parameters; opposite of complexity of a
model. Often helpful for avoiding overfitting and increasing interpretability.
Simulation - Answer-A model that imitates the operation or behavior of a real system.
Simultaneous game - Answer-A game in which all participants choose their actions at
the same time.