[Read Ch. 5]
[Recommended exercises: 5.2, 5.3, 5.4]
Sample error, true error
Confidence intervals for observed hypothesis error
Estimators
Binomial distribution, Normal distribution, Central Limit Theorem
Paired t tests
Comparing learning methods
74 lecture slides for textbook Machine Learning, T. Mitchell, McGraw Hill, 1997
Two Definitions of Error
The true error of hypothesis h with respect to
target function f and distribution D is the
probability that h will misclassify an instance
drawn at random according to D.
$$\mathrm{error}_D(h) \equiv \Pr_{x \in D}\left[f(x) \neq h(x)\right]$$
The sample error of h with respect to target
function f and data sample S is the proportion of
examples h misclassifies
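$$\mathrm{error}_S(h) \equiv \frac{1}{n} \sum_{x \in S} \delta\left(f(x) \neq h(x)\right)$$
Where $\delta(f(x) \neq h(x))$ is 1 if $f(x) \neq h(x)$, and 0
otherwise; n is the number of examples in S.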
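
The sample error definition can be sketched in a few lines of Python. This is a minimal illustration, not code from the textbook: the function name `sample_error` and the toy target `f`, hypothesis `h`, and sample `S` below are hypothetical choices made only for this example.

```python
from typing import Callable, Iterable, TypeVar

X = TypeVar("X")


def sample_error(h: Callable[[X], bool], f: Callable[[X], bool], S: Iterable[X]) -> float:
    """Proportion of examples in S on which hypothesis h disagrees with target f."""
    examples = list(S)
    if not examples:
        raise ValueError("S must contain at least one example")
    # delta(f(x) != h(x)) is 1 on a misclassification and 0 otherwise.
    mistakes = sum(1 for x in examples if f(x) != h(x))
    return mistakes / len(examples)


# Hypothetical toy problem: integer instances with threshold concepts.
def f(x: int) -> bool:  # assumed target function
    return x >= 5


def h(x: int) -> bool:  # assumed hypothesis
    return x >= 7


S = range(10)  # assumed data sample: 0, 1, ..., 9
print(sample_error(h, f, S))  # h and f disagree on x = 5 and x = 6, so this prints 0.2
```

The true error cannot be computed this way in general, since it depends on the full distribution D rather than on a finite sample.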
How well does $\mathrm{error}_S(h)$ estimate $\mathrm{error}_D(h)$?
Problems Estimating Error
1. Bias: If S is the training set, $\mathrm{error}_S(h)$ is
optimistically biased
$$\mathrm{bias} \equiv E[\mathrm{error}_S(h)] - \mathrm{error}_D(h)$$
For unbiased estimate, h and S must be chosen
independently
2. Variance: Even with unbiased S, $\mathrm{error}_S(h)$ may
still vary from $\mathrm{error}_D(h)$
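
Both problems can be seen in a small simulation. The sketch below rests on assumptions that are not in the slides: D is uniform on [0, 1), the target is the threshold concept x >= 0.5, and the learner (`fit_threshold`, a hypothetical helper) picks the smallest positive training instance as its threshold. Under these assumptions the true error of a threshold hypothesis has the closed form |t - 0.5|, so sample and true error can be compared directly.

```python
import random
import statistics


def f(x: float) -> bool:
    """Assumed target function: a threshold concept at 0.5."""
    return x >= 0.5


def make_hypothesis(t: float):
    """Threshold hypothesis h(x) = (x >= t)."""
    return lambda x: x >= t


def sample_error(h, S) -> float:
    """Proportion of examples in S that h misclassifies."""
    return sum(1 for x in S if f(x) != h(x)) / len(S)


def true_error(t: float) -> float:
    """Under uniform D on [0, 1), h and f disagree exactly between 0.5 and t."""
    return abs(t - 0.5)


def fit_threshold(S) -> float:
    """Hypothetical learner: smallest positive training instance (1.0 if none)."""
    positives = [x for x in S if f(x)]
    return min(positives) if positives else 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
n, trials = 40, 2000

# (1) Bias: evaluate each h on the same sample S that was used to choose it.
train_errors, true_errors = [], []
for _ in range(trials):
    S = [rng.random() for _ in range(n)]
    t = fit_threshold(S)
    train_errors.append(sample_error(make_hypothesis(t), S))
    true_errors.append(true_error(t))
bias_estimate = statistics.mean(train_errors) - statistics.mean(true_errors)
print("estimated bias of training-set error:", bias_estimate)

# (2) Variance: fix h in advance (true error 0.1), then draw independent samples.
h_fixed = make_hypothesis(0.6)
test_errors = [
    sample_error(h_fixed, [rng.random() for _ in range(n)]) for _ in range(trials)
]
print("mean of independent sample errors:", statistics.mean(test_errors))
print("std. dev. of independent sample errors:", statistics.stdev(test_errors))
```

In this setup the training-set error is always 0 while the true error is positive, so the estimated bias is negative (optimistic). With h fixed independently of S, the sample errors average close to the true error of 0.1 but still scatter around it from one sample to the next.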