What is the difference between supervised learning and unsupervised
learning? - ANSWER-- Supervised: the response is known for the training data.
Unsupervised: the response is not known.
The k-means algorithm for clustering is a "heuristic" because... -
ANSWER-- ...it isn't guaranteed to find the best answer, but it will get
to a solution quickly.
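To see why k-means is only a heuristic, here is a minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, the four sample points, and both sets of starting centers are made up for illustration. Each run stops within a couple of iterations, but the second starting position settles on a clearly worse clustering.

```python
def kmeans(points, centers, max_iter=100):
    """Alternate assignment and center updates until the centers stop moving."""
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    # Total squared distance from each point to its nearest center.
    sse = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )
    return centers, sse


# Hypothetical data: two tight pairs of points, far apart horizontally.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

good_centers, good_sse = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])
bad_centers, bad_sse = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])

print(good_sse)  # 1.0  (left pair vs. right pair)
print(bad_sse)   # 16.0 (bottom row vs. top row: stable, but not the best)
```

A common remedy is to run the algorithm from several starting positions and keep the best result.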
A group of astronomers has a set of long-exposure CCD images of
various distant objects. They do not know yet which types of object each
one is, and would like your help using analytics to determine which ones
look similar. Which is more appropriate: classification or clustering? -
ANSWER-- clustering
Suppose one astronomer has categorized hundreds of the images by
hand, and now wants your help using analytics to automatically
determine which category each new image belongs to. Which is more
appropriate: classification or clustering? - ANSWER-- classification
Which of these is generally a good reason to remove an outlier from
your data set?
A. The outlier is incorrectly entered data, not real data.
B. Outliers like this only happen occasionally. - ANSWER-- A.
If the data point isn't a true one, you should remove it from your data set.
What is an outlier? - ANSWER-- A data point that is very different from
the rest.
What graph or plot can we use to find outliers? - ANSWER-- A
box-and-whisker plot.
What are the parts of a box-and-whisker plot? - ANSWER-- The bottom
and top of the box are the 25th and 75th percentiles. The middle value is
the median. The whiskers stretch up and down to a reasonable range of
values (for example, the 10th and 90th or the 5th and 95th percentiles).
Where would outliers exist in a box-and-whisker plot? - ANSWER--
Outside of the whiskers.
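The box-and-whisker idea can be sketched numerically with Python's standard `statistics.quantiles`. This sketch assumes whiskers at the 5th and 95th percentiles, one of the conventions mentioned above. Many plotting tools instead place whiskers at 1.5 times the interquartile range, so treat the cutoffs as a choice rather than a rule. The function name and the sample values are made up for illustration.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the box/whisker positions and the points outside the whiskers."""
    # n=20 gives cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(values, n=20, method="inclusive")
    summary = {
        "lower_whisker": cuts[0],   # 5th percentile
        "q1": cuts[4],              # 25th percentile (bottom of the box)
        "median": cuts[9],          # 50th percentile (line inside the box)
        "q3": cuts[14],             # 75th percentile (top of the box)
        "upper_whisker": cuts[18],  # 95th percentile
    }
    outliers = [
        v for v in values
        if v < summary["lower_whisker"] or v > summary["upper_whisker"]
    ]
    return summary, outliers


# Hypothetical measurements: mostly in the 48-59 range, plus two odd values.
values = [2, 48, 49, 50, 50, 51, 51, 52, 52, 53, 53,
          54, 54, 55, 55, 56, 56, 57, 58, 59, 120]

summary, outliers = box_plot_summary(values)
print(summary["median"])  # 53.0
print(outliers)           # [2, 120]
```

Note that percentile-based whiskers always leave a few points outside, so a flagged point still needs a judgment call about whether it is bad data.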
What are some ways to deal with outliers that are bad data? -
ANSWER-- Omit them or use imputation.
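Both options can be sketched in a few lines. This example assumes the bad values are marked by a sentinel (-999.0 here, a made-up convention) and imputes with the median of the good values. The median is only one possible imputation choice, and the helper name `impute_with_median` is made up for illustration.

```python
from statistics import median


def impute_with_median(values, is_bad):
    """Replace every bad value with the median of the remaining good values."""
    good = [v for v in values if not is_bad(v)]
    fill = median(good)
    return [fill if is_bad(v) else v for v in values]


# Hypothetical sensor readings; -999.0 marks a failed reading.
readings = [21.5, 22.0, -999.0, 21.8, 22.4, 22.1]


def is_bad(v):
    return v == -999.0


omitted = [v for v in readings if not is_bad(v)]  # option 1: drop the point
imputed = impute_with_median(readings, is_bad)    # option 2: fill it in

print(omitted)  # [21.5, 22.0, 21.8, 22.4, 22.1]
print(imputed)  # [21.5, 22.0, 22.0, 21.8, 22.4, 22.1]
```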
What can change detection be used for? - ANSWER-- Determining
whether action might be needed, determining impact of past action,
determining changes to help plan.
What is cumulative sum (CUSUM) used for? - ANSWER-- Detecting an
increase, a decrease, or both.
What is C used for in the CUSUM formula? - ANSWER-- Since we expect
there to be some randomness, we include a value C to pull the running
total down.
If we have a larger C ... - ANSWER-- the harder it is for S_t to get large and
the less sensitive the method will be
If we have a smaller C ... - ANSWER-- the more sensitive the method is
because S_t can get larger faster
What factors go into finding the right values of C and T? - ANSWER--
How costly it is if the model takes a long time to notice a change, and how
costly it is if the model thinks it has found a change that really isn't there.
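The effect of C and T is easier to see in code. This is a minimal sketch of CUSUM for detecting an increase, assuming the common form S_t = max(0, S_{t-1} + (x_t - mu - C)) with a change flagged once S_t >= T. The baseline mean `mu`, the function name, and the sample data are assumptions made for illustration and do not come from the cards above.

```python
def cusum_increase(xs, mu, C, T):
    """Return (index of first detection or None, list of running totals S_t)."""
    s = 0.0
    history = []
    detected_at = None
    for t, x in enumerate(xs):
        # C is subtracted every step, pulling the running total down;
        # max(0, ...) keeps the total from going negative.
        s = max(0.0, s + (x - mu - C))
        history.append(s)
        if detected_at is None and s >= T:
            detected_at = t
    return detected_at, history


# Hypothetical series: hovers around 10, then jumps to 13 at index 5.
data = [10, 11, 9, 10, 10, 13, 13, 13, 13, 13]

print(cusum_increase(data, mu=10, C=0, T=5)[0])  # 6
print(cusum_increase(data, mu=10, C=1, T=5)[0])  # 7
print(cusum_increase(data, mu=10, C=2, T=5)[0])  # 9
print(cusum_increase(data, mu=10, C=1, T=8)[0])  # 8
```

Raising either C or T delays detection of the real change at index 5, and in exchange makes it harder for ordinary noise to trigger a false alarm.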
Why are hypothesis tests often not sufficient for change detection? -
ANSWER-- They often are slow to detect changes.
Hypothesis tests generally have high threshold levels, which makes them
slow to detect changes.
In the CUSUM model, having a higher threshold T makes it... -
ANSWER-- detect changes more slowly, and less likely to falsely detect
changes.
In the exponential smoothing equation S_t = \alpha \times x_t +
(1 - \alpha) \times S_{t-1}, a value of \alpha closer to 1 is chosen if... -
ANSWER-- There's less randomness, so we're more willing to trust the observation.
We put more weight on the observation x_t than on the previous estimate
S_{t-1}.
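A short sketch of the smoothing recursion shows how \alpha shifts weight between the new observation and the previous estimate. It assumes the first estimate is initialized to the first observation (S_1 = x_1), which is a common choice but not the only one. The function name and the data are made up for illustration.

```python
def exponential_smoothing(xs, alpha):
    """Return the list of estimates S_t for observations xs."""
    s = xs[0]  # assumption: initialize S_1 = x_1
    estimates = [s]
    for x in xs[1:]:
        # Blend the new observation with the previous estimate.
        s = alpha * x + (1 - alpha) * s
        estimates.append(s)
    return estimates


# Hypothetical noisy series that bounces between 10 and 20.
xs = [10, 20, 10, 20]

print(exponential_smoothing(xs, 0.75))  # [10, 17.5, 11.875, 17.96875]
print(exponential_smoothing(xs, 0.25))  # [10, 12.5, 11.875, 13.90625]
```

With \alpha = 0.75 the estimate swings almost as much as the data. With \alpha = 0.25 it moves slowly, which is preferable when most of the swing is random noise.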
A multiplicative seasonality, like in the Holt-Winters method, means
that the seasonal effect is... - ANSWER-- Proportional to the baseline
value.
A multiplicative seasonality is larger when the baseline value is larger,
because its effect is a multiple of the baseline
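A tiny numeric sketch contrasts the two kinds of seasonality. The function names, the factor 1.25, and the offset 25 are made-up illustrative values, and this is not the full Holt-Winters method, only its seasonal term.

```python
def multiplicative_seasonal(baseline, seasonal_factor):
    """Seasonal effect scales with the baseline (baseline times factor)."""
    return baseline * seasonal_factor


def additive_seasonal(baseline, seasonal_offset):
    """Seasonal effect is a fixed amount added to the baseline."""
    return baseline + seasonal_offset


for baseline in (100, 400):
    mult = multiplicative_seasonal(baseline, 1.25)
    add = additive_seasonal(baseline, 25)
    # Print how far each seasonal value sits above the baseline.
    print(baseline, mult - baseline, add - baseline)
# 100 25.0 25
# 400 100.0 25
```

The same factor of 1.25 adds 25 to a baseline of 100 but adds 100 to a baseline of 400, while the additive effect stays at 25.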
In the exponential smoothing equation S_t = \alpha \times x_t +
(1 - \alpha) \times S_{t-1}, only the current observation x_t is considered in
calculating the estimate S_t. - ANSWER-- False. All previous observations
are considered, because S_{t-1} is itself built from them.
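Substituting the equation into itself makes the dependence on older observations explicit. The expansion below assumes the recursion starts from an initial estimate S_1 (often taken to be x_1).

```latex
\begin{align*}
S_t &= \alpha x_t + (1 - \alpha) S_{t-1} \\
    &= \alpha x_t + (1 - \alpha)\bigl[\alpha x_{t-1} + (1 - \alpha) S_{t-2}\bigr] \\
    &= \alpha x_t + \alpha (1 - \alpha) x_{t-1} + (1 - \alpha)^2 S_{t-2} \\
    &\;\;\vdots \\
    &= \alpha \sum_{k=0}^{t-2} (1 - \alpha)^k x_{t-k} + (1 - \alpha)^{t-1} S_1
\end{align*}
```

Every past observation appears, with a weight that shrinks by a factor of (1 - \alpha) for each step back in time.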
Is exponential smoothing better for short-term forecasting or long-term
forecasting? - ANSWER-- Short-term
Exponential smoothing bases its forecast primarily on the most-recent
data points. For forecasts of the longer-term future, there aren't data
points close to the time being forecasted
In simple forecasting with basic exponential smoothing, what is the value
of F_{t+i}? - ANSWER-- S_t
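In code, this means the forecast is flat: every future period gets the latest estimate. The sketch below assumes S_1 = x_1 as the starting estimate. The function name `flat_forecast` and the data are made up for illustration.

```python
def flat_forecast(xs, alpha, horizon):
    """Forecast F_{t+1}, ..., F_{t+horizon}, all equal to the final S_t."""
    s = xs[0]  # assumption: initialize S_1 = x_1
    for x in xs[1:]:
        s = alpha * x + (1 - alpha) * s
    # Basic exponential smoothing has no trend or seasonal term,
    # so the forecast does not change with the horizon.
    return [s] * horizon


forecast = flat_forecast([10, 20, 10, 20], alpha=0.5, horizon=3)
print(forecast)  # [16.25, 16.25, 16.25]
```

The flat line is one reason the method suits short-term forecasts: nothing in it can anticipate how the series will move further out.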
What does autoregression mean? - ANSWER-- Previous values of the
thing being estimated are used to calculate the estimate
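As a concrete case, a first-order autoregressive model predicts the next value from the current one: x_{t+1} is estimated as c + phi * x_t. The sketch below fits c and phi by ordinary least squares on lagged pairs. The names `fit_ar1`, `c`, and `phi` and the sample series are made up for illustration, and models such as ARIMA use more lags and more machinery than this.

```python
def fit_ar1(series):
    """Least-squares fit of x_{t+1} = c + phi * x_t; returns (c, phi)."""
    prev = series[:-1]  # x_t
    nxt = series[1:]    # x_{t+1}
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_nxt = sum(nxt) / n
    phi = (
        sum((p - mean_prev) * (q - mean_nxt) for p, q in zip(prev, nxt))
        / sum((p - mean_prev) ** 2 for p in prev)
    )
    c = mean_nxt - phi * mean_prev
    return c, phi


# Hypothetical series generated exactly by x_{t+1} = 2 + 0.5 * x_t.
series = [10.0, 7.0, 5.5, 4.75, 4.375]

c, phi = fit_ar1(series)
next_value = c + phi * series[-1]
print(round(c, 6), round(phi, 6), round(next_value, 6))  # 2.0 0.5 4.1875
```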
Why would we want to estimate the variance? - ANSWER-- Knowing
the variance can help us estimate the amount of error
Why is GARCH different from ARIMA and exponential smoothing? -
ANSWER-- GARCH estimates variance
ARIMA and exponential smoothing both estimate the value of an
attribute; GARCH estimates the variance
When would regression be used instead of a time series model? -
ANSWER-- When there are other factors or predictors that affect the
response.
Regression helps show the relationships between factors and a response.