What Formula
Variance: $\mathrm{Var}(X) = E[(X - \mu)^2]$
    $\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$
Short-cut variance: $\sigma^2 = \left( \frac{1}{N} \sum_{i=1}^{N} x_i^2 \right) - \mu^2$
Covariance: $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
    $\sigma_{X,Y} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y)$
Short-cut covariance: $\sigma_{X,Y} = \frac{1}{N} \sum_{i=1}^{N} x_i y_i - \mu_X \mu_Y$
Correlation: $\mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{sd(X) \, sd(Y)}$
    $\rho_{X,Y} = \frac{\sigma_{X,Y}}{\sigma_X \sigma_Y}$
Z-value (standard normally distributed random variable): $z_0 = \frac{x_0 - \mu}{\sigma}$
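Standardized coefficient: $\hat{\beta} \times \frac{\sigma_x}{\sigma_y} = \frac{\mathrm{Cov}(y, x)}{\sigma_x^2} \times \frac{\sigma_x}{\sigma_y} = \frac{\mathrm{Cov}(y, x)}{\sigma_y \sigma_x}$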
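OLS standard errors: $\hat{\sigma}^2 = \frac{\sum_{i=1}^{N} \hat{u}_i^2}{N - 2}$
    $se(\hat{\beta}) = \frac{\hat{\sigma}}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}}$
    $se(\hat{\alpha}) = \frac{\hat{\sigma} \sqrt{\sum_{i=1}^{N} x_i^2}}{\sqrt{N \sum_{i=1}^{N} (x_i - \bar{x})^2}}$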
Goodness of fit ($R^2$): $TSS = \sum_{i=1}^{N} (y_i - \bar{y})^2$
    $ESS = \sum_{i=1}^{N} (\hat{y}_i - \bar{y})^2$
    $RSS = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$
    $R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$
    $TSS = ESS + RSS$
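T-test: $\hat{t}_{\beta} = \frac{\hat{\beta} - \beta_0}{se(\hat{\beta})} = \frac{\hat{\beta}}{se(\hat{\beta})}$ (the last equality holds when the null hypothesis is $\beta_0 = 0$)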
95% confidence interval: Upper bound: $\beta = \hat{\beta} + c \, se(\hat{\beta})$
    Lower bound: $\beta = \hat{\beta} - c \, se(\hat{\beta})$
    (c is the critical value for the 95% level)
Multivariate linear regression model: Matrix notation: $\hat{\beta} = (X'X)^{-1} X'y$
Bias: $E(\tilde{\beta}_1) - \beta_1 = \beta_2 \hat{\delta}_1$
Multivariate variance: $\mathrm{Var}(\hat{\beta}_j) = \frac{\sigma^2}{\sum_{i=1}^{N} (x_{ji} - \bar{x}_j)^2 \, (1 - R_j^2)}$
Adjusted $R^2$: $\bar{R}^2 = 1 - \frac{RSS / (n - k - 1)}{TSS / (n - 1)}$
F-test: $F = \frac{(RSS_R - RSS_{UR}) / q}{RSS_{UR} / (n - k - 1)}$
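The formulas in this table can be checked numerically on a small sample. The sketch below is only an illustration: the five data points are made up, the variable names are hypothetical, and the critical value c = 3.182 is taken from a standard t-table for 3 degrees of freedom. It uses the bivariate model (k = 1), and se(α̂) is computed in the usual textbook form with N inside the denominator's square root.

```python
import math

# Illustrative data (made up for this sketch, not from the source).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)

# Means, variance and covariance (population form, dividing by N).
mu_x = sum(xs) / n
mu_y = sum(ys) / n
var_x = sum((x - mu_x) ** 2 for x in xs) / n
var_y = sum((y - mu_y) ** 2 for y in ys) / n
cov_xy = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n

# Short-cut forms; these should agree with the definitions above.
var_x_short = sum(x * x for x in xs) / n - mu_x ** 2
cov_xy_short = sum(x * y for x, y in zip(xs, ys)) / n - mu_x * mu_y

# Correlation, OLS coefficients, and the standardized slope.
corr = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))
beta_hat = cov_xy / var_x
alpha_hat = mu_y - beta_hat * mu_x
beta_std = beta_hat * math.sqrt(var_x) / math.sqrt(var_y)

# Fitted values and the sums of squares behind R^2.
fitted = [alpha_hat + beta_hat * x for x in xs]
rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
ess = sum((f - mu_y) ** 2 for f in fitted)
tss = sum((y - mu_y) ** 2 for y in ys)
r2 = 1 - rss / tss

# OLS standard errors (bivariate model, so N - 2 degrees of freedom).
sxx = sum((x - mu_x) ** 2 for x in xs)
sigma_hat = math.sqrt(rss / (n - 2))
se_beta = sigma_hat / math.sqrt(sxx)
se_alpha = sigma_hat * math.sqrt(sum(x * x for x in xs)) / math.sqrt(n * sxx)

# t-statistic for H0: beta = 0 and a 95% confidence interval.
# Assumption: c = 3.182 is the two-sided 5% critical value of the
# t distribution with 3 degrees of freedom, read from a t-table.
t_beta = beta_hat / se_beta
c = 3.182
ci_lower = beta_hat - c * se_beta
ci_upper = beta_hat + c * se_beta

# Adjusted R^2 and the F-test of H0: beta = 0 (k = 1 regressor, q = 1 restriction).
# The restricted model contains only an intercept, so its RSS equals TSS.
k, q = 1, 1
adj_r2 = 1 - (rss / (n - k - 1)) / (tss / (n - 1))
f_stat = ((tss - rss) / q) / (rss / (n - k - 1))

print("beta_hat:", beta_hat, "alpha_hat:", alpha_hat)
print("R^2:", r2, "adjusted R^2:", adj_r2)
print("t:", t_beta, "F:", f_stat, "CI:", (ci_lower, ci_upper))
```

In this bivariate setting the standardized coefficient coincides with the correlation, and the F-statistic for the single restriction equals the square of the t-statistic, which gives two quick consistency checks on the arithmetic.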
Other Important Stuff
Ordinary Least Squares (OLS)
1. Take the vertical distances, defined as $\hat{u}_i = y_i - \hat{y}_i$, between each point in the graph and each candidate fitted line.
2. Square each distance and sum them: $\sum_{i=1}^{N} \hat{u}_i^2$
3. Find the estimated coefficients $\hat{\alpha}$ and $\hat{\beta}$ that minimize the sum of the squared residuals $\sum_{i=1}^{N} \hat{u}_i^2$
a. We know that the fitted value of the dependent variable is $\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$
b. We know that the true value of the dependent variable is $y_i = \alpha + \beta x_i + u_i$
c. Minimize the following L function: $L = \sum_{i=1}^{N} \hat{u}_i^2 = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{N} (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$
i. This gives: $\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}$ where $\hat{\beta} = \frac{E[(y - \bar{y})(x - \bar{x})]}{E[(x - \bar{x})^2]} = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}$
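The step from the minimization problem to the closed-form estimators comes from setting the two partial derivatives of L to zero. A sketch of that derivation, using the same symbols as the steps above:

```latex
\begin{align*}
\frac{\partial L}{\partial \hat{\alpha}}
  &= -2 \sum_{i=1}^{N} (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0
  \;\Rightarrow\; \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} \\
\frac{\partial L}{\partial \hat{\beta}}
  &= -2 \sum_{i=1}^{N} x_i (y_i - \hat{\alpha} - \hat{\beta} x_i) = 0 \\
\text{Substituting } \hat{\alpha}: \quad
  & \sum_{i=1}^{N} x_i (y_i - \bar{y}) = \hat{\beta} \sum_{i=1}^{N} x_i (x_i - \bar{x}) \\
\text{Since } \sum_{i=1}^{N} (y_i - \bar{y}) = 0
  \text{ and } \sum_{i=1}^{N} (x_i - \bar{x}) = 0: \quad
  & \hat{\beta}
  = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{N} (x_i - \bar{x})^2}
  = \frac{\mathrm{Cov}(x, y)}{\mathrm{Var}(x)}
\end{align*}
```

Dividing the numerator and the denominator of the last ratio by N turns the sums into the sample covariance and variance, which is why the 1/N factors cancel.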
Interpretations of β Under Log
Model        DV       IV       Interpretation of β
Level-level  y        x        $\Delta y = \beta \, \Delta x$
Level-log    y        log(x)   $\Delta y = (\beta / 100) \, \%\Delta x$
Log-level    log(y)   x        $\%\Delta y = 100 \beta \, \Delta x$
Log-log      log(y)   log(x)   $\%\Delta y = \beta \, \%\Delta x$
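The log interpretations in the table are approximations that work well for small changes. The sketch below compares each rule of thumb with the exact change implied by the model. The coefficient values are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical coefficients, one per row of the table.

# Level-log: y = a + b*log(x). A 1% rise in x changes y by about b/100 units.
b_level_log = 50.0
approx_level_log = (b_level_log / 100) * 1.0      # rule of thumb for a 1% change
exact_level_log = b_level_log * math.log(1.01)    # exact change in y

# Log-level: log(y) = a + b*x. A one-unit rise in x changes y by about 100*b percent.
b_log_level = 0.08
approx_log_level = 100 * b_log_level * 1.0
exact_log_level = 100 * (math.exp(b_log_level) - 1)

# Log-log: log(y) = a + b*log(x). A 1% rise in x changes y by about b percent.
b_log_log = 0.5
approx_log_log = b_log_log * 1.0
exact_log_log = 100 * (1.01 ** b_log_log - 1)

print("level-log:", approx_level_log, exact_level_log)
print("log-level:", approx_log_level, exact_log_level)
print("log-log:  ", approx_log_log, exact_log_log)
```

The gap between the approximate and exact figures grows with the size of the change, so the table's interpretations are best read as statements about small changes in x.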
OLS Properties
1. Estimator: 𝛼̂ and 𝛽̂ are estimators of the true values α and β.
2. Linear: 𝛼̂ and 𝛽̂ are linear estimators, that is, linear combinations of the observed values of y.
3. Unbiased: OLS estimators 𝛼̂ and 𝛽̂ are unbiased if on average they are equal to the true values α and β.
a. This implies that if we take the distribution of 𝛼̂ and 𝛽̂, obtained by estimating our model across many
samples, the mean of each estimator will be equal to the true values of α and β.
b. 𝐸(𝛼̂ ) = 𝛼, 𝐸(𝛽̂) = 𝛽
4. Best: OLS estimators 𝛼̂ and 𝛽̂ have the minimum variance among the class of linear unbiased estimators.
a. This implies that if we take the distribution of 𝛼̂ and 𝛽̂, obtained by estimating our model across many
samples, the variance of each estimator will be the smallest among all linear unbiased estimators (also
known as efficiency).
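Unbiasedness is a statement about the average of the estimates over repeated samples, which a small simulation can illustrate. The sketch below is a minimal example under assumed settings: the true parameters, the uniform distribution of x, the standard normal errors, and the helper names `ols` and `draw_sample` are all hypothetical choices made for this illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_ALPHA, TRUE_BETA = 1.0, 2.0  # hypothetical population parameters


def ols(xs, ys):
    """Return (alpha_hat, beta_hat) for a bivariate regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sxy / sxx
    return y_bar - beta_hat * x_bar, beta_hat


def draw_sample(n):
    """Simulate n observations from y = alpha + beta * x + u with u ~ N(0, 1)."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [TRUE_ALPHA + TRUE_BETA * x + random.gauss(0, 1) for x in xs]
    return xs, ys


# Estimate the model on 2000 independent samples of 50 observations each.
estimates = [ols(*draw_sample(50)) for _ in range(2000)]
mean_alpha = sum(a for a, _ in estimates) / len(estimates)
mean_beta = sum(b for _, b in estimates) / len(estimates)

print("average alpha_hat:", mean_alpha)
print("average beta_hat: ", mean_beta)
```

Individual estimates differ from sample to sample, but their averages should sit close to the assumed true values of 1 and 2.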
Large sample properties of OLS
1. Consistency: the estimates 𝛼̂ and 𝛽̂ will converge to the true values α and β as the sample size N increases to
infinity.
2. Asymptotic normality: the estimates 𝛼̂ and 𝛽̂ are approximately normally distributed in large enough samples.
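Consistency can be illustrated the same way: as N grows, the estimates cluster more tightly around the true value. The sketch below is a hedged illustration with hypothetical settings (true slope of 2, uniform x, standard normal errors) and a hypothetical helper `beta_hat_for`. It measures the spread of the slope estimate across repeated samples for three sample sizes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

TRUE_ALPHA, TRUE_BETA = 1.0, 2.0  # hypothetical population parameters


def beta_hat_for(n):
    """Simulate one sample of size n and return the OLS slope estimate."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [TRUE_ALPHA + TRUE_BETA * x + random.gauss(0, 1) for x in xs]
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Spread of beta_hat across 200 repeated samples, for growing sample sizes.
spreads = {}
for n in (20, 200, 1000):
    draws = [beta_hat_for(n) for _ in range(200)]
    spreads[n] = statistics.pstdev(draws)
    print(n, spreads[n])
```

The standard deviation of the estimates should shrink as N increases, which is the finite-sample footprint of convergence to the true β.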
OLS Assumptions (Bivariate Model)
1. The population model is linear in parameters: 𝑦𝑖 = 𝛼 + 𝛽𝑥𝑖 + 𝑢𝑖
2. We have a random sample from the population