ML CS7641 MIDTERM QUESTIONS WITH VERIFIED
ACCURATE ANSWERS
Supervised Learning - Answers - branch of machine learning that trains a model on
human-labeled examples (inputs paired with the desired outputs)
Basic assumption of supervised learning - Answers - there exists some well-behaved,
consistent function behind the data we're seeing
Classification - Answers - mapping complex inputs to labels/classes/discrete values
Regression - Answers - mapping complex inputs to continuous numeric values
Source of data errors - Answers - hardware, malicious intent, human element,
unmodeled influences
Graph of Fit - Answers –
Where is a good fit of the data? - Answers - where the errors on the training data and the
cross-validation data are both low and relatively similar.
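Cross validation - Answers - a method for detecting and reducing overfitting: hold out part
of the training data, train on the rest, and measure the error on the held-out part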
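To make the cross-validation card concrete, here is a minimal k-fold sketch in plain Python. The function names (`k_fold_indices`, `cross_val_error`), the choice of k, and the toy majority-label learner are all illustrative assumptions, not something taken from the course material.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split the indices 0..n-1 into k shuffled folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(xs, ys, fit, predict, k=5):
    """Average misclassification rate on the held-out fold, over k folds."""
    errors = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Train only on the instances outside the held-out fold.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        wrong = sum(predict(model, xs[i]) != ys[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k

# Hypothetical toy learner: always predict the most common training label.
def fit_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model
```

Comparing this held-out error with the error on the training data is one way to check whether the two are "relatively similar", as the good-fit card describes.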
Instances - Answers - the input examples from which the overall model will
"learn"
Concept - Answers - Abstract idea that the data represents, expressed as a function
mapping inputs to outputs
Candidate - Answers - potential target concept
Testing set - Answers - instances that our candidate concept has not yet seen, used
to evaluate how close it is to the ideal target concept
Decision Trees - Answers - Map various choices to diverging paths that end with some
decision
Order in which features are best applied to decision trees - Answers - correlated with
each feature's ability to reduce our VC space
ID3 Algorithm - Answers - - A <- best attribute
- Assign A as the decision attribute for the current node
- For each value of A, create a branch
- Sort the training examples into their respective branches
- If the examples are perfectly classified: stop; else: repeat on each branch
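Information Gain Equation - Answers - $Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)$, where $S_v$ is the subset of examples in $S$ for which attribute $A$ has value $v$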
Information Gain - Answers - How much an attribute can reduce overall entropy
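Entropy Equation - Answers - $Entropy(S) = -\sum_{v} p(v) \log_2 p(v)$, where $p(v)$ is the fraction of examples in $S$ with label $v$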
Entropy - Answers - Measure of the randomness (uncertainty) in a set of examples; the
more an attribute lowers it, the more information that attribute gives about the system
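The entropy and information-gain cards, together with the ID3 steps, can be sketched in a few lines of Python. This is a minimal illustration for discrete attributes only; the dictionary-based row format, the function names, and the toy "outlook/windy" data are assumptions made for the example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy of the labels minus the weighted entropy after splitting on attr."""
    n = len(labels)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

def best_attribute(rows, labels, attrs):
    """The 'best' attribute is the one with maximum information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

def id3(rows, labels, attrs):
    """Build a tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:          # perfectly classified: stop
        return labels[0]
    if not attrs:                      # nothing left to split on: majority label
        return Counter(labels).most_common(1)[0][0]
    a = best_attribute(rows, labels, attrs)
    tree = {"attr": a, "branches": {}}
    remaining = [x for x in attrs if x != a]
    for value in set(row[a] for row in rows):
        keep = [i for i, row in enumerate(rows) if row[a] == value]
        tree["branches"][value] = id3(
            [rows[i] for i in keep], [labels[i] for i in keep], remaining
        )
    return tree

# Toy data (assumed for illustration): the label depends only on "outlook".
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rain", "windy": False},
    {"outlook": "rain", "windy": True},
]
labels = ["yes", "yes", "no", "no"]
tree = id3(rows, labels, ["outlook", "windy"])
```

On this toy data, splitting on "outlook" removes all uncertainty while "windy" removes none, so ID3 places "outlook" at the root.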
"Best" Attribute - Answers - One with maximum information gain
Restriction Bias - Answers - bias that automatically occurs when we decide our hypothesis set,
H: only hypotheses in H are ever considered.
Preference Bias - Answers - which hypotheses from our hypothesis set, h ∈ H, we prefer
Preference Bias of ID3 - Answers - - It will prefer trees with good splits at the top.
- It will prefer trees that correctly label the data
- It will prefer shallower trees due to the top-heavy split preference
Does it make sense to repeat an attribute more than once along a branch of a decision
tree? - Answers - Only if it is continuous (it can be split again at a different threshold;
repeating a discrete attribute adds no new information)
Prune - Answers - Method of reducing overfitting on decision trees, potentially by
bubbling the decisions down a branch back up to the parent node
Attribute importance for regression problems - Answers - Rely on purely statistical
methods (like variance and correlation) to determine how important an attribute is.
Ensemble Learning - Answers - Combining multiple learners that work together to produce
a single prediction
Weak Learner - Answers - A learner that performs better than random guessing for any
distribution of data
Bootstrap Aggregation/Bagging - Answers - choosing data uniformly at random (with
replacement) to form each subset, training a learner on each subset, and averaging the
learners' outputs
Why does bagging work? - Answers - Taking the average of a set of weak learners
trained on subsets of the data can outperform a single learner trained on the entire
dataset because of overfitting, our mortal fear in machine learning. A learner that overfits
its subset does not overfit the overall dataset, and the average will "smooth out" the
specifics of each individual learner
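A minimal sketch of bagging, assuming a regression-style setting where predictions can be averaged. The helper names (`bootstrap_sample`, `bagged_predict`), the number of learners, and the toy mean-predicting learner are illustrative assumptions; any `fit`/`predict` pair could be substituted.

```python
import random
from statistics import mean

def bootstrap_sample(data, size, rng):
    """Draw `size` items uniformly at random from data, with replacement."""
    return [rng.choice(data) for _ in range(size)]

def bagged_predict(data, fit, predict, x, n_learners=25, subset_size=None, seed=0):
    """Train one learner per bootstrap sample and average their predictions at x."""
    rng = random.Random(seed)
    size = subset_size or len(data)
    models = [fit(bootstrap_sample(data, size, rng)) for _ in range(n_learners)]
    return mean(predict(m, x) for m in models)

# Hypothetical toy learner: predict the mean target of its sample, ignoring x.
def fit_mean(sample):
    return mean(y for _, y in sample)

def predict_mean(model, x):
    return model
```

Each learner sees a slightly different resampling of the data, so the quirks it picks up from its own sample tend to cancel out in the average.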
Fundamental Idea of Boosting - Answers - Prefer data that we're not good at analyzing
Boosting - Answers - Craft learners that are specifically tailored to the data that
previous learners struggled with in order to form a cohesive picture of the entire dataset
Compute training error based on outcome and likelihood:
x1, incorrect, P = 1/2
x2, correct, P = 1/20
x3, incorrect, P = 2/5
x4, correct, P = 1/20 - Answers - 9/10 (the sum of the probabilities of the misclassified
examples: 1/2 + 2/5)
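The arithmetic behind this card can be checked with exact fractions. The tuple layout and the function name `weighted_error` are assumptions made for the example; the probabilities are the ones given in the card.

```python
from fractions import Fraction

# (example, classified correctly?, probability under the distribution)
examples = [
    ("x1", False, Fraction(1, 2)),
    ("x2", True, Fraction(1, 20)),
    ("x3", False, Fraction(2, 5)),
    ("x4", True, Fraction(1, 20)),
]

def weighted_error(examples):
    """Sum the probability mass of the misclassified examples."""
    return sum((p for _, correct, p in examples if not correct), Fraction(0))

error = weighted_error(examples)
print(error)  # 9/10
```

The error is weighted by how likely each example is, which is why two mistakes out of four give 9/10 rather than 1/2.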
Boosting High Level Algorithm - Answers - For each round t = 1, ..., T:
- construct a distribution D_t
- find a weak classifier H_t(x) that minimizes the error over D_t.
Then after the loop, combine the weak classifiers into a stronger one
AdaBoost Algorithm - Answers –
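As a hedged illustration of the boosting loop (construct D_t, find a weak classifier, combine), here is a minimal AdaBoost sketch for labels in {-1, +1}. It uses the standard AdaBoost weighting, alpha_t = 1/2 ln((1 - e_t) / e_t), and multiplicative reweighting of the examples. The 1-D threshold "stump" weak learner, the function names, and the toy data are assumptions chosen to keep the example short.

```python
from math import exp, log

def stump_predict(x, threshold, sign):
    """1-D decision stump: predicts `sign` when x >= threshold, else -sign."""
    return sign if x >= threshold else -sign

def best_stump(xs, ys, weights):
    """Return (weighted error, threshold, sign) of the lowest-error stump."""
    best = None
    for threshold in sorted(set(xs)):
        for sign in (1, -1):
            err = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, sign) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best

def adaboost(xs, ys, rounds=10):
    """Return a list of (alpha, threshold, sign) weighted stumps."""
    n = len(xs)
    weights = [1.0 / n] * n                      # D_1 is uniform
    ensemble = []
    for _ in range(rounds):
        err, threshold, sign = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * log((1 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [
            w * exp(-alpha * y * stump_predict(x, threshold, sign))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]   # renormalize to get D_{t+1}
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, s) for a, t, s in ensemble)
    return 1 if score >= 0 else -1

# Toy data (assumed): no single stump separates it, but a weighted vote can.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = adaboost(xs, ys, rounds=3)
```

The design choice worth noting is that each round only needs a learner that beats random guessing on the current distribution; the reweighting is what forces later stumps to concentrate on the examples earlier stumps got wrong.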