1. For each of the 13 models/methods, select the category of question it
is commonly used for. For models/methods that have more than one
correct category, select the one it is most commonly used for; for
models/methods that have no correct category listed, select "None".
a. ARIMA is used for response prediction.
b. CART is used for classification and response prediction.
c. Cross validation is used for validation.
d. CUSUM is used for none of these.
e. Exponential smoothing is used for response prediction.
f. GARCH is used for variance estimation.
g. K-means is used for clustering.
h. K-nearest-neighbor is used for classification and response prediction.
i. Linear regression is used for response prediction.
j. Logistic regression is used for classification and response prediction.
k. Principal component analysis is used for none of these.
l. Random forest is used for classification and response prediction.
m. Support vector machine is used for classification.
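CUSUM is usually described as a change-detection method, which may be why it fits none of the listed categories. The sketch below is a minimal illustration of an upward-shift CUSUM, not part of the original answer key; the function name, the data, and the values chosen for the mean, slack, and threshold are all hypothetical.

```python
def cusum_first_alarm(values, mean, slack, threshold):
    """Return the index of the first upward-shift alarm, or None.

    Running statistic: S_t = max(0, S_{t-1} + (x_t - mean - slack)).
    An alarm is raised the first time S_t reaches the threshold.
    """
    s = 0.0
    for t, x in enumerate(values):
        s = max(0.0, s + (x - mean - slack))
        if s >= threshold:
            return t
    return None


# Hypothetical data: the level is about 10, then shifts to about 12 at index 6.
series = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 12.1, 11.9, 12.2, 12.0]
alarm_index = cusum_first_alarm(series, mean=10.0, slack=0.5, threshold=2.5)
print(alarm_index)  # index at which the shift is flagged
```

The output is a point in time at which the process appears to have changed, rather than a class label, a cluster, or a predicted response.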
2. For each of the following models, specify whether it is designed for use
with attribute/feature data or time-series data:
a. CUSUM is used with time-series data.
b. Logistic regression is used with attribute/feature data.
c. Support vector machines are used with attribute/feature data.
d. GARCH is used with time-series data.
e. Random forest is used with attribute/feature data.
f. K-means is used with attribute/feature data.
g. Linear regression is used with attribute/feature data.
h. K-nearest-neighbor is used with attribute/feature data.
i. ARIMA is used with time-series data.
j. Principal component analysis is used with attribute/feature data.
k. Exponential smoothing is used with time-series data.
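One way to see the distinction is that time-series methods depend on the order of the observations, while an order-free summary does not. The sketch below uses simple exponential smoothing as an illustration; the data and the choice of alpha are assumptions made for this example only.

```python
def exponential_smoothing(values, alpha):
    """Simple exponential smoothing.

    S_0 = x_0 and S_t = alpha * x_t + (1 - alpha) * S_{t-1}.
    """
    smoothed = [values[0]]
    for x in values[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical series and the same values in reverse order.
series = [1.0, 2.0, 3.0, 4.0]
reordered = [4.0, 3.0, 2.0, 1.0]

last_original = exponential_smoothing(series, alpha=0.5)[-1]
last_reordered = exponential_smoothing(reordered, alpha=0.5)[-1]

# The mean ignores order, so it is identical for both orderings.
mean_original = sum(series) / len(series)
mean_reordered = sum(reordered) / len(reordered)

print(last_original, last_reordered)   # smoothed values differ
print(mean_original, mean_reordered)   # means are the same
```

Reordering the rows changes the smoothed value but not the mean, which is the sense in which exponential smoothing needs time-series data.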
Figures A and B show the training data for a soft classification problem,
using two predictors (x1 and x2) to separate black points from white
points. The dashed lines are the classifiers found using SVM. Figure A
uses a linear kernel, and Figure B uses a nonlinear kernel that required
fitting 16 parameter values.
3. Figure A's classifier IS NOT based only on the value of x2.
4. Figure A's classifier WOULD probably perform worse on test data than
on training data.
5. Figure A's classifier has a WIDER margin than Figure B's classifier in the
training data.
6. Figure A's classifier incorrectly classifies EXACTLY 4 white points as
black in the training data.
7. Figure A DOES NOT SHOW that the black point (7.2,1.4) is an outlier.
8. Figure B's classifier IS NOT NECESSARILY better than Figure A's
classifier, even though Figure B's classifier classifies more of the
training data correctly.
9. Figure B's classifier would probably perform WORSE on test data than
on training data.
10. Figure B's classifier incorrectly classifies MORE OR FEWER THAN
5 white points in the training data.
11. Figure B DOES NOT SHOW that the black point (7.2,1.4) is
colored incorrectly; it should actually be white.
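The reasoning behind answers 4, 8, and 9 can be sketched by comparing training and test accuracy for a simple rule and a very flexible one. This is an illustration only: the points are invented and are not the data in Figures A and B, a 1-nearest-neighbor rule stands in for the flexible nonlinear classifier (it is not an SVM), and the cutoff of 4.5 on x1 is an assumed simple boundary.

```python
def nearest_neighbor_predict(train, point):
    """Flexible stand-in: return the label of the closest training point."""
    def squared_distance(item):
        (x1, x2), _ = item
        return (x1 - point[0]) ** 2 + (x2 - point[1]) ** 2
    return min(train, key=squared_distance)[1]


def threshold_predict(point, cutoff=4.5):
    """Simple stand-in: black if x1 exceeds the cutoff, otherwise white."""
    return "black" if point[0] > cutoff else "white"


def accuracy(predict, data):
    correct = sum(1 for point, label in data if predict(point) == label)
    return correct / len(data)


# Hypothetical training data with one black point inside the white region.
train = [
    ((1.0, 1.0), "white"), ((2.0, 2.0), "white"), ((3.0, 1.0), "white"),
    ((2.0, 1.0), "black"),
    ((6.0, 1.0), "black"), ((7.0, 2.0), "black"), ((8.0, 1.0), "black"),
]
# Hypothetical test data drawn from the same two regions.
test = [
    ((2.2, 1.1), "white"), ((4.0, 1.0), "black"), ((6.5, 1.5), "black"),
    ((1.0, 1.5), "white"), ((7.5, 1.0), "black"),
]

flexible_train = accuracy(lambda p: nearest_neighbor_predict(train, p), train)
flexible_test = accuracy(lambda p: nearest_neighbor_predict(train, p), test)
simple_train = accuracy(threshold_predict, train)
simple_test = accuracy(threshold_predict, test)

print(flexible_train, flexible_test)
print(simple_train, simple_test)
```

On this invented data the flexible rule classifies every training point correctly but loses more accuracy on the test points than the simple rule does, so a better training fit does not by itself make a classifier better.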