QUESTIONS WITH 100%
CORRECT ANSWERS, GRADED A+.
Reducing the number of predictors to the smallest set that will still provide accurate predictions is a
concept called ____________________.
regeneration
parsimony
shrinkage
Gillette's razor - Precise Answer ✔✔parsimony
You have been given a dataset with 15 predictors and a binary outcome that denotes whether a customer
has left the company (yes or no). As an absolute minimum, you'll need _____________ samples to
achieve a minimally accurate prediction.
150
180
190
200 - Precise Answer ✔✔180
The marketing department of ACME Corporation needs to identify potential high-value customers for
their new Kitchen Robot. These robots are expensive, so we are looking to identify customers that can
afford such a machine. It's been determined that households with a net income greater than $50,000
USD are of interest in the marketing campaign, so you will choose a(n) ___________ algorithm to model
these customers.
classification
regression
affinity analysis
recommender system - Precise Answer ✔✔classification
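A minimal sketch of why this is a classification problem: the income cutoff turns a numeric field into a two-class label. The income values and the name `is_target` are hypothetical, chosen only for illustration.

```python
# Hypothetical household net incomes in USD (illustrative values, not from the source).
incomes = [32_000, 51_500, 78_000, 49_999, 50_000, 120_000]

THRESHOLD = 50_000  # cutoff stated in the question

# Strictly greater than the threshold -> class 1 (of interest), otherwise class 0.
is_target = [1 if income > THRESHOLD else 0 for income in incomes]
print(is_target)
```

A classifier would then be trained to predict this 0/1 label from other customer attributes.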
The process of identifying outliers is best performed by someone with domain knowledge as opposed to
someone with statistical knowledge.
True
False - Precise Answer ✔✔True
If you impute a missing value with its column mean, then you will ___________________.
maximize the variability of the dataset
overweight the variability of the dataset
understate the variability of the dataset
normalize the variability of the dataset - Precise Answer ✔✔understate the variability of the dataset
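The effect can be checked numerically. The sketch below uses a small hypothetical column (the values are invented for illustration) and compares the sample variance of the observed values with the variance after mean imputation.

```python
import statistics

# Hypothetical column with two missing entries (None); values are illustrative.
column = [12.0, 15.0, None, 9.0, 20.0, None, 14.0]

observed = [v for v in column if v is not None]
mean_observed = statistics.mean(observed)

# Mean imputation: every missing entry becomes the observed mean.
imputed = [mean_observed if v is None else v for v in column]

var_before = statistics.variance(observed)  # sample variance, observed values only
var_after = statistics.variance(imputed)    # sample variance after imputation

print(var_before, var_after)
```

Each imputed value sits exactly on the mean, so it adds nothing to the sum of squared deviations while increasing the record count, and the variance shrinks.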
Standardization uses the following formula:
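For reference, standardization is conventionally written as the z-score below. The symbols are the usual textbook notation and are an assumption here, not taken from the source: x is a raw value, \mu is the column mean, and \sigma is the column standard deviation.

```latex
z = \frac{x - \mu}{\sigma}
```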
Using the rule-of-thumb method, one can assume that all extreme values (outliers) will be greater than
____________ or less than ____________.
0, 1
1, 0
+3, -3
+1, -1 - Precise Answer ✔✔+3, -3
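A minimal sketch of the rule of thumb, assuming it refers to standardized values (z-scores): standardize the column, then flag anything above +3 or below -3. The data values are hypothetical.

```python
import statistics

# Hypothetical measurements with one extreme value (illustrative, not from the source).
values = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
          50, 48, 52, 50, 49, 51, 50, 50, 49, 200]

mean = statistics.mean(values)
std = statistics.pstdev(values)  # population standard deviation

# Standardize each value, then flag anything beyond +3 or -3.
z_scores = [(v - mean) / std for v in values]
outliers = [v for v, z in zip(values, z_scores) if abs(z) > 3]

print(outliers)
```

The rule needs a reasonably sized sample: with ten or fewer values, no single point can exceed |z| = 3 when the population standard deviation is used.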
Overfitting occurs when ___________________ is low, which makes ______________ higher.
variance, bias
bias, variance
irreducible error, reducible error
sampling, accuracy - Precise Answer ✔✔bias, variance
In contrast to standardization, normalization (i.e., MinMaxScaler in scikit-learn) fits all values between
__________________.
-3 and +3
0 and 1
a lower and an upper boundary selected by the data analyst
-infinity, +infinity - Precise Answer ✔✔a lower and an upper boundary selected by the data analyst
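The min-max rescaling behind this answer can be sketched in plain Python. The function name `min_max_scale` and the income values are hypothetical; scikit-learn's `MinMaxScaler` exposes the same choice through its `feature_range` parameter, which defaults to (0, 1).

```python
def min_max_scale(values, lower=0.0, upper=1.0):
    """Rescale values linearly so the minimum maps to `lower` and the maximum to `upper`."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: no spread to rescale, so map everything to `lower`.
        return [lower for _ in values]
    scale = (upper - lower) / (hi - lo)
    return [lower + (v - lo) * scale for v in values]

incomes = [20_000, 35_000, 50_000, 80_000]  # hypothetical values

print(min_max_scale(incomes))             # default range 0 to 1
print(min_max_scale(incomes, -1.0, 1.0))  # analyst-chosen range
```

The default gives the familiar 0-to-1 range, but the boundaries are whatever the analyst passes in.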
You have been given a dataset with 15 predictors and a numeric outcome that denotes the income that a
household has obtained. As an absolute minimum, you'll need _____________ samples to achieve a
minimally accurate prediction.
200
180
300
150 - Precise Answer ✔✔150
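Both sample-size answers in this guide (180 for the earlier binary-outcome question, 150 here) are consistent with two common rules of thumb: at least 6 x m x p records for classification with m outcome classes and p predictors, and at least 10 x p records for a numeric outcome. That the answer key relies on these rules is an assumption, and the function names below are hypothetical.

```python
def min_records_classification(n_classes, n_predictors):
    # Rule of thumb: 6 records per outcome class per predictor.
    return 6 * n_classes * n_predictors

def min_records_numeric(n_predictors):
    # Rule of thumb: 10 records per predictor.
    return 10 * n_predictors

print(min_records_classification(2, 15))  # binary outcome, 15 predictors
print(min_records_numeric(15))            # numeric outcome, 15 predictors
```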
When dealing with a class imbalance in a classification model, the data analyst can _____________ the
minority class or ________________ the majority class.
underweight, overweight
underweight, oversample
subsample, oversample
overweight, underweight - Precise Answer ✔✔overweight, underweight
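One way to overweight the minority class and underweight the majority class is with per-class weights. The sketch below uses the "balanced" heuristic, weight = n_samples / (n_classes * class_count), which is also what scikit-learn applies for `class_weight="balanced"`. The churn labels are hypothetical.

```python
from collections import Counter

# Hypothetical churn labels: 90 customers stayed, 10 left.
labels = ["no"] * 90 + ["yes"] * 10

counts = Counter(labels)
n_samples = len(labels)
n_classes = len(counts)

# "Balanced" heuristic: weight = n_samples / (n_classes * class_count).
weights = {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}

print(weights)
```

The minority class ends up with a weight above 1 and the majority class with a weight below 1, so each class contributes equally to the total weight.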
As a means to control excessive bias, we can use ______________.
data partitions
dimension reduction
standardization
normalization - Precise Answer ✔✔data partitions
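Partitioning sets records aside so that a model is evaluated on data it was not fitted to. A minimal sketch, assuming a 60/20/20 train/validation/test split; the fractions, the seed, and the function name `partition` are illustrative choices, not from the source.

```python
import random

def partition(records, train=0.6, valid=0.2, seed=0):
    """Shuffle a copy of the records and cut it into train/validation/test lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

train_set, valid_set, test_set = partition(range(100))
print(len(train_set), len(valid_set), len(test_set))
```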
When working with linear or logistic regression, categorical variables must have one subtype removed
when one-hot encoding (dummy coding) or else the model will fail.
True
False - Precise Answer ✔✔True
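A minimal sketch of dummy coding with one level dropped. If every level is kept, the indicator columns sum to 1 in each row and duplicate the intercept, which is the perfect collinearity this item refers to. The helper `dummy_code` and the `regions` column are hypothetical; pandas offers the same option through `get_dummies(..., drop_first=True)`.

```python
def dummy_code(values, drop_first=True):
    """One-hot encode a list of category labels, optionally dropping the first level."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    rows = [[1 if v == level else 0 for level in kept] for v in values]
    return kept, rows

regions = ["east", "west", "north", "east"]  # hypothetical categorical column

columns, encoded = dummy_code(regions)
print(columns)
print(encoded)
```

The dropped level ("east" here) becomes the baseline: it is represented by a row of all zeros.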
Because machine learning is automated, there is not any human bias or discrimination in the results.
True
False - Precise Answer ✔✔False
You are given a dataset that has many duplicate entries, that is, customers who appear multiple times in
the data because of address changes, marriages, and mis-entries (boulevard instead of blvd., etc.).
Because of this situation, you will choose a(n) __________ algorithm to clean up the dataset.
dimension reduction
data reduction
adaptive filtering
collaborative filtering - Precise Answer ✔✔data reduction
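As a rough illustration of reducing duplicate records, the sketch below normalizes spelling variants and keeps the first record per normalized key. The abbreviation map, helper names, and customer records are all hypothetical. This simple approach handles mis-entries only; name changes and address changes would need fuzzier matching.

```python
import re

# Hypothetical abbreviation map; a real cleanup would need a much fuller list.
ABBREVIATIONS = {"boulevard": "blvd", "street": "st", "avenue": "ave"}

def normalize(text):
    """Lowercase, drop punctuation, and map long street words to abbreviations."""
    words = re.sub(r"[^\w\s]", "", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)

def deduplicate(records):
    """Keep the first record seen for each normalized (name, address) key."""
    seen, unique = set(), []
    for name, address in records:
        key = (normalize(name), normalize(address))
        if key not in seen:
            seen.add(key)
            unique.append((name, address))
    return unique

customers = [
    ("Ann Lee", "12 Sunset Boulevard"),
    ("ann lee", "12 Sunset Blvd."),
    ("Bob Ray", "4 Oak Street"),
]
print(deduplicate(customers))
```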
Fake Facebook and _____________ accounts, most notably under Russian control, have helped create and
spread divisive and destabilizing messaging in Western democracies with a goal of affecting election
outcomes.
TikTok
Etsy - Precise Answer ✔✔Twitter