Induction - Answers the process that moves from a given series of specifics to a generalization
Deduction - Answers the process of moving from a general rule to a specific example
Supervised Learning - Answers Use labeled training data to generalize labels to new instances
(function approximation)
Unsupervised Learning - Answers Make sense out of unlabeled data (data description)
Reinforcement Learning - Answers Learning from delayed reward
Classification versus Regression - Answers Classification is the process of mapping an input x to a
discrete label (e.g., T/F, M/F, 0/1, red/blue/green); regression maps x to a continuous value in ℝ
Instances - Answers Vectors of attributes to describe input
Concept - Answers Function that maps inputs to outputs
Target Concept - Answers The concept that we are trying to find
Hypothesis Class - Answers All functions I'm willing to consider
Candidate - Answers Concept that might be the target concept
Decision Tree Algorithm - Answers 1. Pick "Best" Attribute
2. Ask question
3. Follow the answer path
4. Repeat from step 1 until an answer is reached
ID3 algorithm - Answers Loop:
A ← best attribute (the one that maximizes information gain)
Gain(S,A)=Entropy(S)-∑ᵥ(|S_v|/|S|)Entropy(S_v), where S_v is the subset of S with A=v
Assign A as decision attribute for node
For each value of A, create a descendant of node
Sort Training Examples to Leaves
If examples perfectly classified, stop.
Else, iterate over leaves
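The "best attribute" step can be sketched in Python as below. This is a minimal illustration, not a full ID3 implementation: the function names, the dictionary-per-row layout, and the toy data are all assumptions made up for the example, and entropy is taken in bits (log base 2).

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v)."""
    subsets = {}
    for row, label in zip(rows, labels):
        # Group the labels by the value this row takes on the attribute.
        subsets.setdefault(row[attr], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Pick the attribute with the largest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Toy data, invented for illustration only.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["yes", "yes", "no", "no"]

print(best_attribute(rows, labels, ["outlook", "windy"]))  # outlook
```

Here "outlook" splits the examples into two pure subsets (gain of 1 bit), while "windy" leaves both subsets evenly mixed (gain of 0), so "outlook" is chosen.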
Entropy - Answers A measure of disorder or randomness. Entropy(S)=-∑ᵥp(v)log₂p(v), where p(v) is
the probability of observing the value v.
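As a worked example of the entropy formula, assume a sample S with 9 positive and 5 negative examples (numbers chosen purely for illustration) and logarithms taken base 2:

```latex
\begin{align*}
\mathrm{Entropy}(S) &= -\tfrac{9}{14}\log_2\tfrac{9}{14} - \tfrac{5}{14}\log_2\tfrac{5}{14} \\
                    &\approx 0.410 + 0.530 \approx 0.940 \text{ bits}
\end{align*}
```

For comparison, an even 50/50 split gives exactly 1 bit, and a sample containing a single class gives 0.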
ID3 Inductive bias - Answers - Good splits near the top rather than the bottom
- Correct trees over incorrect ones
- Shorter trees over longer trees (follows from the first two)
Inductive bias - Answers Set of assumptions that the learner uses to predict outputs given inputs that
it has not encountered.
Restriction bias is where the set of hypotheses considered is restricted to a smaller set. Preference
bias is where some hypotheses are preferred over others.
Where do errors come from? - Answers - Measurement/sensor
- Malicious
- Transcription error
- Unmodeled influences
Perceptron - Answers A binary linear classifier that has:
- Input values (a single input layer)
- Weights and a bias
- Net (weighted) sum
- Activation function
Perceptron Rule - Answers wₖ=wₖ+∆wₖ
∆wₖ=η(y-ŷ)xₖ, where η is the learning rate, y the target and ŷ the current output
ŷ=1 if ∑wₖxₖ≥0, else 0
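A minimal sketch of the perceptron rule in Python follows. The function names, the learning rate, the epoch cap, and the AND data set are assumptions for illustration; the bias is handled by prepending a constant input of 1 to every example.

```python
def predict(weights, x):
    """Thresholded output: 1 if the weighted sum is >= 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else 0


def train_perceptron(data, eta=1.0, epochs=100):
    """Apply w_k <- w_k + eta * (y - y_hat) * x_k until an epoch has no mistakes."""
    weights = [0.0] * len(data[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            y_hat = predict(weights, x)
            if y_hat != y:
                mistakes += 1
                weights = [w + eta * (y - y_hat) * xi for w, xi in zip(weights, x)]
        if mistakes == 0:
            break  # every training example is classified correctly
    return weights


# Logical AND; the first component of each input is the constant bias input 1.
data = [((1, 0, 0), 0), ((1, 0, 1), 0), ((1, 1, 0), 0), ((1, 1, 1), 1)]
weights = train_perceptron(data)
print([predict(weights, x) for x, _ in data])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the loop stops after a finite number of updates; on data that are not linearly separable it would simply run until the epoch cap.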
Gradient Descent Update - Answers More robust when the data are not linearly separable
a=∑wₖxₖ (unthresholded activation)
Minimize error metric E(w)=½∑(y-a)²
∂E/∂wₖ=-∑(y-a)xₖ, so ∆wₖ=η(y-a)xₖ per example
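The sketch below runs batch gradient descent on E(w)=½∑(y-a)² in Python, using the gradient ∂E/∂wₖ=-∑(y-a)xₖ. The function names, step size, iteration count, and the XOR data are assumptions chosen for illustration. XOR is not linearly separable, yet the procedure still settles on a minimum of the squared error.

```python
def activation(w, x):
    """Unthresholded activation a = sum_k w_k * x_k."""
    return sum(wk * xk for wk, xk in zip(w, x))


def squared_error(w, data):
    """E(w) = 1/2 * sum over examples of (y - a)^2."""
    return 0.5 * sum((y - activation(w, x)) ** 2 for x, y in data)


def gradient_descent(data, eta=0.05, steps=2000):
    """Repeatedly step against the gradient dE/dw_k = -sum (y - a) * x_k."""
    n = len(data[0][0])
    w = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for x, y in data:
            a = activation(w, x)
            for k in range(n):
                grad[k] += -(y - a) * x[k]
        w = [wk - eta * gk for wk, gk in zip(w, grad)]
    return w


# XOR with a constant bias input of 1 as the first component.
data_xor = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 0)]
w = gradient_descent(data_xor)
print(squared_error([0.0, 0.0, 0.0], data_xor), squared_error(w, data_xor))
```

The error drops from its starting value and the weights approach the least-squares fit, which for this data is a constant output of 0.5. The perceptron rule on the same data would never reach an epoch without mistakes.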
Sigmoid - Answers σ(a)=1/(1+e^(-a))
As a→∞, σ(a)→1
As a→-∞, σ(a)→0
Derivative: Dσ(a)=σ(a)(1-σ(a))
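A short Python check of the sigmoid and its derivative identity is sketched below. The function names and the sample points are assumptions for illustration; the identity σ(a)(1-σ(a)) is compared against a central finite difference.

```python
import math


def sigmoid(a):
    """sigma(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def sigmoid_derivative(a):
    """Closed form: sigma(a) * (1 - sigma(a))."""
    s = sigmoid(a)
    return s * (1.0 - s)


def numerical_derivative(f, a, h=1e-6):
    """Central finite-difference estimate of f'(a)."""
    return (f(a + h) - f(a - h)) / (2 * h)


for a in (-2.0, 0.0, 3.0):
    print(a, sigmoid_derivative(a), numerical_derivative(sigmoid, a))

# Saturation: large positive inputs give values near 1, large negative near 0.
print(sigmoid(20.0), sigmoid(-20.0))
```

The two derivative columns should agree to several decimal places, and sigmoid(0.0) is exactly 0.5, the midpoint between the two saturation values.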
Restriction bias of perceptron - Answers Half spaces