Machine Learning
Exercises Chapter 6
March 22, 2009
1. (Exercise 6.1 from the book) Consider again the example of application of Bayes rule in Section
6.2.1. Suppose the doctor decides to order a second laboratory test for the same patient, and
suppose the second test returns a positive result as well. What are the posterior probabilities of
cancer and ¬cancer following these two tests? Assume that the two tests are independent.
The prior probability of someone having cancer is P(cancer) = 0.008. If you have cancer, the test
detects it with probability P(+|cancer) = 0.98. If you do not have cancer, the test still returns
a (false) positive result with probability P(+|¬cancer) = 0.03.
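As a way to check a hand calculation, the update can be sketched in Python. This is only a minimal sketch: it reads "the two tests are independent" as conditional independence of the test results given the patient's true state, which is an assumption about the exercise's intent, and the function name `posterior_after_positives` is made up for illustration.

```python
# Numbers taken from the exercise statement.
P_CANCER = 0.008
P_POS_GIVEN_CANCER = 0.98
P_POS_GIVEN_NO_CANCER = 0.03


def posterior_after_positives(n_positive):
    """Return (P(cancer | n positives), P(not cancer | n positives)).

    Assumption: test results are conditionally independent given the
    true state, so the likelihood of n positives is a simple power.
    """
    unnorm_cancer = P_CANCER * P_POS_GIVEN_CANCER ** n_positive
    unnorm_no_cancer = (1.0 - P_CANCER) * P_POS_GIVEN_NO_CANCER ** n_positive
    total = unnorm_cancer + unnorm_no_cancer  # normalizing constant
    return unnorm_cancer / total, unnorm_no_cancer / total


p_cancer, p_no_cancer = posterior_after_positives(2)
print(f"P(cancer | +,+)  = {p_cancer:.4f}")
print(f"P(¬cancer | +,+) = {p_no_cancer:.4f}")
```

Under that assumption the two posteriors come out at roughly 0.896 and 0.104. Calling `posterior_after_positives(1)` reproduces the single-test case of Section 6.2.1.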
2. (Exercise 6.2 from the book) In the example of Section 6.2.1 we computed the posterior probability
of cancer by normalizing the quantities P(+|cancer) · P(cancer) and P(+|¬cancer) · P(¬cancer)
so that they summed to one. Use Bayes theorem and the theorem of total probability (see Table
6.1) to prove that this method is valid (i.e., that normalizing in this way yields the correct value
for P(cancer|+)).
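One possible way to set up the argument is sketched below in LaTeX (it assumes the `amsmath` package for the `align*` environment). It only chains the two results the exercise names and is not necessarily the presentation the book intends.

```latex
\begin{align*}
P(\mathit{cancer} \mid +)
  &= \frac{P(+ \mid \mathit{cancer})\,P(\mathit{cancer})}{P(+)}
     && \text{(Bayes theorem)} \\
P(+)
  &= P(+ \mid \mathit{cancer})\,P(\mathit{cancer})
   + P(+ \mid \neg\mathit{cancer})\,P(\neg\mathit{cancer})
     && \text{(total probability)} \\
P(\mathit{cancer} \mid +)
  &= \frac{P(+ \mid \mathit{cancer})\,P(\mathit{cancer})}
          {P(+ \mid \mathit{cancer})\,P(\mathit{cancer})
         + P(+ \mid \neg\mathit{cancer})\,P(\neg\mathit{cancer})}
\end{align*}
```

The last line is exactly the first quantity divided by the sum of the two quantities, which is what normalizing them to sum to one computes.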
3. (Exercise 6.3 from the book) Consider the concept learning algorithm FindG, which outputs a
maximally general consistent hypothesis (e.g., some maximally general member of the version
space).
(a) Give a distribution for P(h) and P(D|h) under which FindG is guaranteed to output a MAP
hypothesis.
(b) Give a distribution for P(h) and P(D|h) under which FindG is not guaranteed to output a
MAP hypothesis.
(c) Give a distribution for P(h) and P(D|h) under which FindG is guaranteed to output an ML
hypothesis but not a MAP hypothesis.
4. Consider the Bayesian network drawn in the figure below. The independence assumptions expressed
by this Bayesian net are: A and B are (unconditionally) independent; C is independent of B given A;
D is independent of C given A and B. (Remark on notation: P(A) means P(A = true) and P(C|¬A)
means P(C = true|A = false).)
[Figure: Bayesian network with nodes A and B in the top row and C and D in the bottom row.]
P(A)          0.3
P(B)          0.6
P(C|A)        0.8
P(C|¬A)       0.4
P(D|A, B)     0.7
P(D|A, ¬B)    0.8
P(D|¬A, B)    0.1
P(D|¬A, ¬B)   0.2
Calculate (using Bayes theorem, the product rule and Equation 6.23): P(D), P(A|C) and P(A, D|¬B).
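Hand calculations for this network can be cross-checked by brute-force enumeration of the joint distribution. The sketch below assumes the factorization P(A, B, C, D) = P(A) · P(B) · P(C|A) · P(D|A, B), which is what the stated independences and the probability table imply. The helper names `bernoulli`, `joint` and `probability` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Conditional probability tables from the exercise (probability of "true").
P_A = 0.3
P_B = 0.6
P_C_GIVEN_A = {True: 0.8, False: 0.4}
P_D_GIVEN_AB = {
    (True, True): 0.7,
    (True, False): 0.8,
    (False, True): 0.1,
    (False, False): 0.2,
}


def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true


def joint(a, b, c, d):
    """Joint probability under the assumed factorization of the network."""
    return (
        bernoulli(P_A, a)
        * bernoulli(P_B, b)
        * bernoulli(P_C_GIVEN_A[a], c)
        * bernoulli(P_D_GIVEN_AB[(a, b)], d)
    )


def probability(query, evidence=None):
    """P(query | evidence), summing the joint over all 16 assignments."""
    evidence = evidence or {}
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in product([True, False], repeat=4):
        world = {"A": a, "B": b, "C": c, "D": d}
        if any(world[name] != value for name, value in evidence.items()):
            continue  # assignment contradicts the evidence
        p = joint(a, b, c, d)
        denominator += p
        if all(world[name] == value for name, value in query.items()):
            numerator += p
    return numerator / denominator


p_d = probability({"D": True})
p_a_given_c = probability({"A": True}, {"C": True})
p_ad_given_not_b = probability({"A": True, "D": True}, {"B": False})
print(p_d, p_a_given_c, p_ad_given_not_b)
```

Enumeration is exponential in the number of variables, so it is only practical for small networks like this one. Here its purpose is to confirm the values obtained analytically with Bayes theorem and the product rule.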