Dr. Yang Ning Homework 1
Problem 1 (6 points)
1. Express Var(X1 - X2) through the variances and covariances of X1, X2 (assuming all variances exist).
Answer:
Var(X1 - X2) = E[(X1 - X2)^2] - (E[X1 - X2])^2
= E(X1^2) - 2E(X1 X2) + E(X2^2) - E(X1)^2 + 2E(X1)E(X2) - E(X2)^2
= Var(X1) + Var(X2) - 2Cov(X1, X2)
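This identity is easy to sanity-check by simulation. The sketch below (assuming NumPy; the correlated Gaussian pair is an arbitrary illustration, not part of the problem) compares the two sides:

```python
import numpy as np

# Monte Carlo check of Var(X1 - X2) = Var(X1) + Var(X2) - 2 Cov(X1, X2).
# The correlated Gaussian pair below is an arbitrary illustrative choice.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200_000)
x2 = 0.5 * x1 + rng.normal(size=200_000)  # X2 correlated with X1

lhs = np.var(x1 - x2)                                   # Var(X1 - X2)
rhs = np.var(x1) + np.var(x2) - 2 * np.cov(x1, x2)[0, 1]
print(lhs, rhs)  # the two sides agree up to Monte Carlo error
```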
2. Assume that X1, ..., Xn are i.i.d. real-valued random variables with finite variances. Show that
Var( (1/n) sum_{i=1}^n Xi ) = (1/n) Var(X1).
Answer: From part 1 (replacing the minus sign with a plus), Var(X1 + X2) = Var(X1) + Var(X2) + 2Cov(X1, X2), so if X1 and X2 are independent, the variance of a sum of random variables is the sum of their variances.
Var( (1/n) sum_{i=1}^n Xi ) = (1/n^2) Var( sum_{i=1}^n Xi )
= (1/n^2) sum_{i=1}^n Var(Xi)   (the Xi's are independent)
= (1/n^2) * n Var(X1)   (the Xi's are identically distributed)
= (1/n) Var(X1)
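The 1/n scaling can be checked by simulating many sample means; in this sketch (assuming NumPy) the sample size n = 25 and the Exponential(1) distribution are arbitrary illustrative choices:

```python
import numpy as np

# The variance of the mean of n i.i.d. draws should be Var(X1) / n.
# Exponential(1) has Var(X1) = 1, so the sample means should have variance
# close to 1/25 = 0.04. (n and the distribution are illustrative choices.)
rng = np.random.default_rng(1)
n, reps = 25, 100_000
means = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
v = np.var(means)
print(v)  # close to 0.04
```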
3. Assume that X, Y are independent random variables with E[X] = 0, E[Y] = 1, Var(X) = 1, Var(Y) = 2. Compute E[(3X + Y)(5Y + 2X - 1)].
Answer:
E[(3X + Y)(5Y + 2X - 1)] = E(17XY + 6X^2 + 5Y^2 - 3X - Y)
= 17E(XY) + 6E(X^2) + 5E(Y^2) - 3E(X) - E(Y)
= 17E(X)E(Y) + 6(Var(X) + E(X)^2) + 5(Var(Y) + E(Y)^2) - 3E(X) - E(Y)   (X and Y are independent)
= 0 + 6 + 15 - 0 - 1
= 20
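A Monte Carlo estimate confirms the value 20. In this sketch (assuming NumPy), Gaussian distributions are an illustrative choice; only the stated means and variances matter for the answer:

```python
import numpy as np

# Monte Carlo check that E[(3X + Y)(5Y + 2X - 1)] = 20 when X and Y are
# independent with E[X] = 0, Var(X) = 1, E[Y] = 1, Var(Y) = 2.
# Gaussians are an illustrative assumption; only the moments matter here.
rng = np.random.default_rng(2)
N = 1_000_000
x = rng.normal(0.0, 1.0, N)
y = rng.normal(1.0, np.sqrt(2.0), N)
est = np.mean((3 * x + y) * (5 * y + 2 * x - 1))
print(est)  # close to 20
```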
Problem 2 (8 points)
Assume that we have the regression model

Y = f(X) + ε,

where ε is independent of X and E(ε) = 0, E(ε^2) = σ^2. Assume that the training data (x1, y1), ..., (xn, yn) are used to construct an estimate of f(x), denoted by f̂(x). Given a new random vector (X, Y) (i.e., test data independent of the training data),
1. Show that E[(f(X) - f̂(X))^2 | X = x] = Var(f̂(x)) + [E[f̂(x)] - f(x)]^2.
Answer:
E[(f(X) - f̂(X))^2 | X = x] = E[(f(x) - f̂(x))^2]   (X and the estimate f̂ are independent)
= E[(f(x) - E[f̂(x)] + E[f̂(x)] - f̂(x))^2]
= E[(f(x) - E[f̂(x)])^2] + E[(f̂(x) - E[f̂(x)])^2] - 2E[(f(x) - E[f̂(x)])(f̂(x) - E[f̂(x)])]
= [f(x) - E[f̂(x)]]^2 + E[(f̂(x) - E[f̂(x)])^2] - 2(f(x) - E[f̂(x)]) E[f̂(x) - E[f̂(x)]]   (f(x) and E[f̂(x)] are constants)
= [E[f̂(x)] - f(x)]^2 + Var(f̂(x)),
since E[f̂(x) - E[f̂(x)]] = 0.
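The same variance-plus-squared-bias decomposition holds exactly for the empirical distribution of any simulated estimator, which gives a direct numerical check. In the sketch below (assuming NumPy), the estimator of f(x0) is purely illustrative: a noisy sample mean shifted by an artificial bias of 0.3:

```python
import numpy as np

# Mean squared error decomposes exactly as variance plus squared bias.
# Illustrative estimator of f(x0) = 2: the mean of 10 noisy observations,
# shifted by 0.3 so that it has a known bias as well as variance.
rng = np.random.default_rng(3)
fx0, n, reps = 2.0, 10, 50_000
fhat = rng.normal(fx0, 1.0, size=(reps, n)).mean(axis=1) + 0.3

lhs = np.mean((fx0 - fhat) ** 2)                   # E[(f(x0) - fhat(x0))^2]
rhs = np.var(fhat) + (np.mean(fhat) - fx0) ** 2    # variance + squared bias
print(lhs, rhs)  # identical up to floating-point rounding
```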
2. Show that E[(Y - f̂(x))^2 | X = x] = Var(f̂(x)) + [E[f̂(x)] - f(x)]^2 + σ^2.
Answer: Since Y = f(X) + ε,
E[(Y - f̂(x))^2 | X = x] = E[(f(x) + ε - f̂(x))^2]
= E[(f(x) - f̂(x))^2] + E(ε^2) + 2E[ε (f(x) - f̂(x))]
= Var(f̂(x)) + [E[f̂(x)] - f(x)]^2 + σ^2 + 2E[ε] E[f(x) - f̂(x)]   (from 2.1; ε is independent of f̂(x))
= Var(f̂(x)) + [E[f̂(x)] - f(x)]^2 + σ^2.
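The full three-term decomposition can also be verified empirically by averaging over many independent training sets. In this sketch (assuming NumPy), the quadratic truth f(x) = x^2, the deliberately underfitting linear model, and σ = 0.5 are all illustrative choices:

```python
import numpy as np

# Empirical check of E[(Y - fhat(x0))^2 | X = x0]
#   = Var(fhat(x0)) + [E fhat(x0) - f(x0)]^2 + sigma^2.
# Illustrative setup: f(x) = x^2, a deliberately biased linear fit, sigma = 0.5.
rng = np.random.default_rng(4)
f = lambda x: x ** 2
sigma, x0, n, reps = 0.5, 1.0, 30, 20_000

preds = np.empty(reps)    # fhat(x0) over independent training sets
sq_err = np.empty(reps)   # (Y_new - fhat(x0))^2 with a fresh test response
for r in range(reps):
    x = rng.uniform(-2, 2, n)
    y = f(x) + rng.normal(0, sigma, n)
    b1, b0 = np.polyfit(x, y, 1)          # underfitting linear model
    preds[r] = b1 * x0 + b0
    y_new = f(x0) + rng.normal(0, sigma)  # fresh test response at x0
    sq_err[r] = (y_new - preds[r]) ** 2

lhs = sq_err.mean()                                           # expected test MSE
rhs = preds.var() + (preds.mean() - f(x0)) ** 2 + sigma ** 2  # var + bias^2 + sigma^2
print(lhs, rhs)  # agree up to Monte Carlo error
```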
3. Explain the bias-variance trade-off based on the above equation.
Answer: The total error = squared bias + variance + irreducible error. Our goal is to minimize the total error to attain an accurate model. However, there is a trade-off between bias and variance: flexible models have low bias and high variance, while relatively rigid models have high bias and low variance. The model with the best predictive performance is the one that strikes the best balance between bias and variance.
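The trade-off can be made concrete by estimating bias^2 and variance for fits of increasing flexibility. In this sketch (assuming NumPy), the truth f(x) = sin(2x), the noise level, the evaluation point x0, and the polynomial degrees are all illustrative choices:

```python
import numpy as np

# Bias^2 and variance of polynomial fits of increasing flexibility, estimated
# at a single point over many training sets. f, sigma, n, x0 are illustrative.
rng = np.random.default_rng(5)
f = lambda x: np.sin(2 * x)
sigma, x0, n, reps = 0.3, 0.5, 40, 2000

results = {}
for deg in (1, 3, 9):
    preds = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(-1, 1, n)
        y = f(x) + rng.normal(0, sigma, n)
        preds[r] = np.polyval(np.polyfit(x, y, deg), x0)
    results[deg] = ((preds.mean() - f(x0)) ** 2, preds.var())
    print(deg, results[deg])  # (bias^2, variance)
```

The rigid degree-1 fit shows high bias and low variance; the flexible degree-9 fit shows the reverse, as the trade-off predicts.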
4. Explain the difference between training MSE and test MSE. Can the expected test MSE be smaller than σ^2?
Answer: The training MSE is computed on the training data set and can reach 0 if we fit the training data perfectly. The test MSE is computed on the test observations with the fitted model. Although a model may perform well in terms of training MSE, it need not have the same predictive ability on test data. Our goal is to find the model that minimizes the expected test MSE.
As 2.2 shows, the expected test MSE is the sum of the variance of the predictor, the squared bias, and σ^2; since the first two terms are nonnegative, it cannot be smaller than σ^2.
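The gap between training and test MSE can be seen directly by fitting polynomials of increasing degree. In this sketch (assuming NumPy), the truth f(x) = cos(3x), σ = 0.5, and the degrees compared are illustrative choices:

```python
import numpy as np

# Training MSE versus test MSE as flexibility grows. Training MSE can only
# decrease with the polynomial degree, since the models are nested; test MSE
# reflects true predictive ability and eventually rises as the model starts
# fitting noise. All settings here are illustrative.
rng = np.random.default_rng(6)
f = lambda x: np.cos(3 * x)
sigma, n = 0.5, 30
x = rng.uniform(-1, 1, n)
y = f(x) + rng.normal(0, sigma, n)
xt = rng.uniform(-1, 1, 1000)                  # held-out test observations
yt = f(xt) + rng.normal(0, sigma, 1000)

train_mse, test_mse = {}, {}
for deg in (1, 5, 12):
    coef = np.polyfit(x, y, deg)
    train_mse[deg] = np.mean((y - np.polyval(coef, x)) ** 2)
    test_mse[deg] = np.mean((yt - np.polyval(coef, xt)) ** 2)
    print(deg, train_mse[deg], test_mse[deg])
```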
This study source was downloaded by 100000850872992 from CourseHero.com on 02-16-2023 08:49:35 GMT -06:00
https://www.coursehero.com/file/47582188/STSCI-4740-HW1-solpdf/