Exam (elaborations)

ISYE 6501 Midterm 1 Introduction to Analytics Modeling / ISYE 6501 Midterm Exam 1 2025 Actual Questions and Answers

Pages
19
Grade
A
Uploaded on
04-04-2026
Written in
2025/2026


Content preview

ISYE 6501 Midterm 1 Introduction to
Analytics Modeling / ISYE 6501
Midterm Exam 1 2025 Actual
Questions and Answers
split the rest between validation and test
What are two methods of splitting data? - ANSWER//Random and rotation.
What is the rotation method of splitting data? - ANSWER//You take turns assigning points to each set. A 5-data-point rotation sequence: Training - Validation - Training - Test - Training.
What is the advantage of rotation over randomness? - ANSWER//It makes sure each part of the data is spread evenly across the training, validation, and test sets.
What is the disadvantage of using rotation? - ANSWER//We have to make sure we aren't introducing some other type of bias when we assign points.
What is k-fold cross-validation? - ANSWER//Split the training/validation data into k parts; for each part in turn, train on the other k-1 parts and validate on the remaining part.
What metric do you use for k-fold cross-validation when comparing models? - ANSWER//The average of all k evaluations.
What do we use when important data only appears in the validation or test sets? - ANSWER//Cross-validation.
What do we do after we've performed cross-validation? - ANSWER//We train the model again using all the data.
What are the benefits of k-fold cross-validation? - ANSWER//Better use of data, a better estimate of model quality, and more effective model selection.
What can clustering be used for? - ANSWER//Grouping data points (e.g., market segmentation) and discovering groups in data points (e.g., personalized medicine).
Which should we use most of the data for: training, validation, or test? - ANSWER//Training.
In k-fold cross-validation, how many times is each part of the data used for training, and for validation? - ANSWER//k-1 times for training, and 1 time for validation.
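The counting in the last answer can be checked with a small sketch. This is only an illustration under assumptions: the function names `k_fold_splits` and `usage_counts` are hypothetical, and the parts are formed by dealing indices out in order (in practice the data would usually be shuffled first).

```python
def k_fold_splits(n_points, k):
    """Yield (training_indices, validation_indices) once per fold."""
    # Assumption: deal indices into k parts in order, so part sizes differ by at most one.
    parts = [list(range(i, n_points, k)) for i in range(k)]
    for held_out in range(k):
        validation = parts[held_out]
        # Train on the other k-1 parts.
        training = [idx for j, part in enumerate(parts) if j != held_out for idx in part]
        yield training, validation


def usage_counts(n_points, k):
    """Count how often each point is used for training and for validation."""
    train_count = [0] * n_points
    val_count = [0] * n_points
    for training, validation in k_fold_splits(n_points, k):
        for idx in training:
            train_count[idx] += 1
        for idx in validation:
            val_count[idx] += 1
    return train_count, val_count


if __name__ == "__main__":
    train_count, val_count = usage_counts(10, 5)
    print(train_count)  # every entry is k-1
    print(val_count)    # every entry is 1
```

To compare models, each model would be evaluated on every validation part and the k scores averaged.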
What is rectangular distance useful for? - ANSWER//Calculating driving distance when the city is laid out in a grid.
What is the value of p for Euclidean distance? - ANSWER//2
What is the general equation for p-norm distance? - ANSWER//(\sum_i |x_i - y_i|^p)^{1/p}
Straight-line distance corresponds to which distance metric? - ANSWER//2-norm
How do you find the distance of an infinity norm? - ANSWER//You find the largest |x_i - y_i|.
What is a centroid? - ANSWER//The center of a cluster.
What are the steps of k-means? - ANSWER//0. Pick k cluster centers within the range of the data. 1. Assign each data point to the nearest cluster center. 2. Recalculate the cluster centers (centroids). 3. Repeat 1 and 2 until there are no changes.
How do we find the cluster centers? - ANSWER//We take the mean of all the data points in the cluster.
Why is k-means an expectation-maximization algorithm? - ANSWER//Finding the mean of all the points in a cluster is similar to finding an expectation. Assigning data points to cluster centers is the maximization step.
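The two alternating steps can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `assign`, `recompute`, and `k_means` are hypothetical, the starting centers are supplied by the caller, and keeping an empty cluster's old center is an assumption of this sketch.

```python
import math


def assign(points, centers):
    """Step 1: index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def recompute(points, labels, centers):
    """Step 2: each center becomes the mean of the points assigned to it."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centers.append(old)  # assumption: an empty cluster keeps its old center
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers


def k_means(points, centers, max_iter=100):
    """Step 3: repeat steps 1 and 2 until the assignments stop changing."""
    labels = None
    for _ in range(max_iter):
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
        centers = recompute(points, labels, centers)
    return centers, labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(k_means(pts, [(0, 0), (10, 10)]))
```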
Really we are minimizing, but we could think of it as maximizing the negative of the distance to a cluster center.
What are some of the consequences of outliers in k-means? - ANSWER//An outlier will drag the cluster center artificially to one side.
Because k-means is a heuristic and thus fast, what can we do? - ANSWER//Run it several times with different initial cluster centers and choose the best result; we can also try different values of k.
How does bias/variance change as k changes in KNN? - ANSWER//The higher the k, the higher the bias; the lower the k, the higher the variance. When k = 1, that is the most complex model and thus likely to overfit the data.
How do we find the best value of k in k-means? - ANSWER//Elbow method: for each value of k, we calculate the total distance of each data point to its cluster center and plot that total against k. We look for the kink (elbow) in the graph.
When clustering for prediction, how do we choose the prediction? - ANSWER//When we see a new point, we just choose whichever cluster center is closest.
What is the difference between classification and clustering? - ANSWER//With classification models, we know each data point's attributes and we already know the right classification for the data points (supervised). In clustering (unsupervised), we know the attributes but we don't know what group any of the data points are in.
What is the difference between supervised learning and unsupervised learning? - ANSWER//Supervised: the response is known. Unsupervised: the response is not known.
The k-means algorithm for clustering is a "heuristic" because... - ANSWER//...it isn't guaranteed to get the best answer, but it will get to a solution quickly.
A group of astronomers has a set of long-exposure CCD images of various distant objects. They do not know yet which type of object each one is, and would like your help using analytics to determine which ones look similar. Which is more appropriate: classification or clustering? - ANSWER//Clustering.
Suppose one astronomer has categorized hundreds of the images by hand, and now wants your help using analytics to automatically determine which category each new image belongs to. Which is more appropriate: classification or clustering? - ANSWER//Classification.
Which of these is generally a good reason to remove an outlier from your data set? A. The outlier is incorrectly entered data, not real data. B. Outliers like this only happen occasionally. - ANSWER//A. If the data point isn't a true one, you should remove it from your data set.
What is an outlier? - ANSWER//A data point that is very different from the rest.
What graph or plot can we use to find outliers? - ANSWER//Box-and-whisker plot.
What are the parts of a box-and-whisker plot? - ANSWER//The bottom and top of the box are the 25th and 75th percentiles. The middle value is the median. The whiskers stretch up and down to a reasonable range of values (e.g., the 10th and 90th or the 5th and 95th percentiles).
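The parts of the plot can be computed directly. The sketch below is an illustration under assumptions: `percentile` and `box_plot_summary` are hypothetical names, percentiles are found by linear interpolation between neighboring ranks (software packages use several slightly different conventions), and the whiskers default to the 5th and 95th percentiles.

```python
def percentile(sorted_values, pct):
    """Percentile by linear interpolation between the two closest ranks."""
    position = (len(sorted_values) - 1) * pct / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def box_plot_summary(values, whisker_pcts=(5, 95)):
    """Box edges, median, whisker ends, and the points beyond the whiskers."""
    ordered = sorted(values)
    low_pct, high_pct = whisker_pcts
    summary = {
        "whisker_low": percentile(ordered, low_pct),
        "box_bottom": percentile(ordered, 25),
        "median": percentile(ordered, 50),
        "box_top": percentile(ordered, 75),
        "whisker_high": percentile(ordered, high_pct),
    }
    # Points beyond the whiskers are the candidates to inspect as outliers.
    summary["outside_whiskers"] = [
        v for v in ordered
        if v < summary["whisker_low"] or v > summary["whisker_high"]
    ]
    return summary


if __name__ == "__main__":
    print(box_plot_summary(list(range(1, 22))))
```

Passing `whisker_pcts=(10, 90)` gives the other whisker choice mentioned in the answer.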
Where would outliers exist in a box-and-whisker plot? - ANSWER//Outside of the whiskers.
What are some ways to deal with outliers that are bad data? - ANSWER//Omit them or use imputation.
What can change detection be used for? - ANSWER//Determining whether action might be needed, determining the impact of past action, and determining changes to help plan.
What is cumulative sum (CUSUM) used for? - ANSWER//Detecting an increase, a decrease, or both.
What is C used for in the CUSUM formula? - ANSWER//Since we expect there to be some randomness, we include a value C to pull the running total down.
If we have a larger C... - ANSWER//...the harder it is for S_t to get large, and the less sensitive the method will be.
If we have a smaller C... - ANSWER//...the more sensitive the method is, because S_t can get larger faster.
What factors go into finding the right values of C and T? - ANSWER//How costly it is if the model takes a long time to notice a change, and how costly it is if the model thinks it has found a change that really isn't there.
Why are hypothesis tests often not sufficient for change detection? - ANSWER//They often are slow to detect changes. Hypothesis tests generally have high threshold levels, which makes them slow to detect changes.
In the CUSUM model, having a higher threshold T makes it... - ANSWER//...detect changes slower, and less likely to falsely detect changes.
In the exponential smoothing equation S_t = \alpha \times x_t + (1-\alpha) \times S_{t-1}, a value of \alpha closer to 1 is chosen if... - ANSWER//There's less randomness, so we're more willing to trust the observation. We put more weight on the observation x_t than on the previous estimate S_{t-1}.
A multiplicative seasonality, like in the Holt-Winters method, means that the seasonal effect is... - ANSWER//Proportional to the baseline value. A multiplicative seasonality is larger when the baseline value is larger, because its effect is a multiple of the baseline.
In the exponential smoothing equation S_t = \alpha \times x_t + (1-\alpha) \times S_{t-1}, only the current observation x_t is considered in calculating the estimate S_t. - ANSWER//False. We consider all previous observations, because S_{t-1} itself was built from x_{t-1} and S_{t-2}, and so on.
Is exponential smoothing better for short-term forecasting or long-term forecasting? - ANSWER//Short-term. Exponential smoothing bases its forecast primarily on the most recent data points. For forecasts of the longer-term future, there aren't data points close to the time being forecasted.
In simple forecasting with basic exponential smoothing, what is the value of F_{t+i}? - ANSWER//S_t
What does autoregression mean? - ANSWER//Previous values of the thing being estimated are used to calculate the estimate.
Why would we want to estimate the variance? - ANSWER//Knowing the variance can help us estimate the amount of error.
Why is GARCH different from ARIMA and exponential smoothing? - ANSWER//GARCH estimates variance. ARIMA and exponential smoothing both estimate the value of an attribute; GARCH estimates the variance.
When would regression be used instead of a time series model? - ANSWER//When there are other factors or predictors that affect the response. Regression helps show the relationships between factors and a response.
If two models are approximately equally good, measures like AIC and BIC will favor the simpler model. Simpler models are often better because... - ANSWER//Simpler models are less likely to be over-fit, easier to understand, and easier to explain.
What is not a common use of regression? - ANSWER//Prescriptive analytics: determining the best course of action. Regression is often good for describing and predicting, but is not as helpful for suggesting a course of action.
True or false: regression is a way to determine whether one thing causes another. - ANSWER//False. Regression can show relationships between observations, but it doesn't show whether one thing causes another.
Suppose our regression model to estimate how tall a 2-year-old will be as an adult has the following coefficients: 0.56 x FatherHeight + 0.51 x MotherHeight - 0.02 x FatherHeight x MotherHeight. The negative sign on the coefficient of FatherHeight x MotherHeight means: - ANSWER//People with two taller-than-average parents won't be as tall as the individual effects of father's height and mother's height add up to. The negative coefficient for the interaction term brings down the overall estimate.
What does "heteroscedasticity" mean? - ANSWER//The variance is different in different ranges of the data.
You might want to de-trend data before... - ANSWER//...using time-series data in a regression model. Factor-based models like regression generally don't account for time-based effects like trend.
Which of the following does principal component analysis (PCA) do? - ANSWER//Transform data so there's no correlation between dimensions and

Document information

Contains
Questions & answers

Seller
Evahwanjimasha
