This walkthrough of the course applies to the academic year 25/26, so the
material can differ from year to year. The midterm has only MC questions. The final
exam consists of 1/3 MC questions and 2/3 open questions. Stata is not part of either exam.
Week 1: Introduction to Econometrics
Econometrics is the use of statistical methods to estimate and analyze economic
relationships using data. Econometrics is used to empirically estimate economic
relationships, test economic theories, make economic predictions, and evaluate policies.
With observational data (not from a lab), we often see correlation but want causality.
To interpret a regression coefficient causally, we need the ceteris paribus condition
(everything else held constant) to hold: we want the change in Y that is due only to X,
not to other things that moved at the same time. The important difference between
correlation and causation is that correlation shows that two variables move together,
while causation means that changing X produces a change in Y (X → Y), which is a
causal effect.
In a simple linear regression model, we have one dependent variable (Y) and one
independent variable (X), so the model looks as follows: Y = β0 + β1X + u, where
both β0 and β1 are population parameters. Population parameters are the true
(unknown) numbers that describe the relationship in the entire population, not just
the sample. In this simple regression model, u is the error term. The error term
includes everything that affects Y but is not in the model itself: Y often depends on
many more things than just X, and all of these are bundled in u.
It is important to remember that we can only interpret β1 causally if X changes while
nothing inside the error term changes systematically. This means that β1 is causal
only if E[u | X] = 0. An example makes this clear:
Consider the following economic model, where we want to estimate the effect of
education on wages: Wages = β0 + β1·Education + u. Suppose that u contains ability,
which you did not measure. Ability affects both education and wages: more able
people tend to get more education, and more able people tend to earn higher wages
independent of their educational achievements. Therefore, people with high
education tend to have higher ability, which means that E[u | Education] = 0 does
NOT hold. If education is correlated with the error term (as is the case here), we
have an endogeneity problem. It can be solved using exogenous variation, for
example a policy rule: people born after a certain date must stay in school longer.
This creates a jump in education that is independent of ability (u).
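A small simulation can make the endogeneity problem concrete. The sketch below is illustrative only: the data-generating process, the coefficient values (true β1 = 1), and names such as `ability`, `policy`, and `ols_slope` are assumptions and not part of the course material. It first regresses wages on education with ability left in the error term, and then uses the policy-induced jump in education to form a ratio of group differences (a Wald-type estimate, named here only for clarity).

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility
n = 100_000

# Hypothetical data-generating process; every number here is an assumption.
beta0, beta1 = 5.0, 1.0                    # true population parameters
ability = rng.normal(size=n)               # unobserved, so it sits inside u
policy = rng.integers(0, 2, size=n)        # 1 = must stay in school longer
education = 12 + 2 * ability + 1.0 * policy + rng.normal(size=n)
u = 3 * ability + rng.normal(size=n)       # error term contains ability
wages = beta0 + beta1 * education + u


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x."""
    x_dev = x - x.mean()
    return np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)


# Education is correlated with u, so the naive slope overstates beta1.
naive_slope = ols_slope(education, wages)
cov_educ_u = np.cov(education, u)[0, 1]

# The policy moves education but is unrelated to ability by construction.
treated = policy == 1
wage_gap = wages[treated].mean() - wages[~treated].mean()
educ_gap = education[treated].mean() - education[~treated].mean()
policy_estimate = wage_gap / educ_gap

print(f"true beta1      : {beta1:.2f}")
print(f"naive OLS slope : {naive_slope:.2f}")
print(f"Cov(educ, u)    : {cov_educ_u:.2f}")
print(f"policy estimate : {policy_estimate:.2f}")
```

In this setup the naive slope should land well above the true value of 1, while the policy-based ratio should land close to it, because the policy shifts education without shifting ability.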
There are three types of data:
1. Cross-sectional data: many units, one time point
2. Time series data: one unit, many time points
3. Panel data: many units, many time points (most useful)
It is important that the observations in a dataset are independent and identically
distributed (i.i.d.). Independent means that Yi does not tell you anything about Yj.
Identically distributed means that every observation is drawn from the same
population, with the same population parameters.
If we have the simple linear regression model Y = β0 + β1X + u, we can consider a
change in X: ΔY = β1ΔX + Δu. Dividing both sides by ΔX gives ΔY/ΔX = β1, but only
if Δu = 0. This means that we can only speak of causality if the ceteris paribus
condition holds (holding everything else, u, constant). β1 is the change in Y from a
one-unit change in X if the error term does not change. The error term can contain
omitted variables, randomness, non-linearities, and measurement errors.
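As a worked illustration with made-up numbers (β1 = 2 and ΔX = 1 are assumptions chosen only for the arithmetic), compare a case where the error term stays fixed with one where it moves together with X:

```latex
\begin{align*}
\Delta u = 0: \quad & \Delta Y = 2 \cdot 1 + 0 = 2, \quad \frac{\Delta Y}{\Delta X} = 2 = \beta_1 \\
\Delta u = 0.5: \quad & \Delta Y = 2 \cdot 1 + 0.5 = 2.5, \quad \frac{\Delta Y}{\Delta X} = 2.5 \neq \beta_1
\end{align*}
```

In the second case the observed ratio mixes the effect of X with the change in u, so it no longer equals β1.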
The zero conditional mean assumption (which includes the normalization E[u] = 0)
states that E[u | X] = 0: the average value of the error term is zero for every value
of X. So, for people with a low X and people with a high X, the average value of the
error term is the same. If this holds, X is exogenous (not contaminated by the
error). If it fails, X is endogenous and β1 is not causal. If E[u | X] = 0, then
Cov(X, u) = 0.
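The last implication can be checked with the law of iterated expectations. The short derivation below is a sketch that uses only the assumption E[u | X] = 0:

```latex
\begin{align*}
E[u] &= E\big[\,E[u \mid X]\,\big] = E[0] = 0 \\
E[Xu] &= E\big[\,E[Xu \mid X]\,\big] = E\big[\,X \, E[u \mid X]\,\big] = 0 \\
\operatorname{Cov}(X, u) &= E[Xu] - E[X]\,E[u] = 0 - E[X] \cdot 0 = 0
\end{align*}
```

The reverse direction does not hold in general: Cov(X, u) = 0 only rules out a linear relationship between X and u, which is weaker than E[u | X] = 0.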
In economic models, our goal is to estimate the population parameters. We can do
this using OLS (ordinary least squares). OLS chooses $\hat{\beta}_0$ and
$\hat{\beta}_1$ in such a way that the sum of squared residuals is minimized. The
hats denote the OLS estimates computed from the sample (they are our best guesses
of the population parameters). $\bar{x}$ and $\bar{y}$ are the sample means of the
corresponding variables.
We can estimate the population parameters as follows:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

This is the OLS slope estimator.

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

This is the OLS intercept estimator.
Here $\bar{x}$ is the mean of the x values and $\bar{y}$ is the mean of the y values.
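The two estimators can be computed directly from their formulas. The following is a minimal sketch; the function name `ols_simple` and the five data points are made up for illustration and do not come from the course material.

```python
import numpy as np


def ols_simple(x, y):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: forces the fitted line through the point (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


def ssr(x, y, intercept, slope):
    """Sum of squared residuals for a given candidate line."""
    residuals = np.asarray(y, dtype=float) - (intercept + slope * np.asarray(x, dtype=float))
    return np.sum(residuals ** 2)


# Hypothetical sample (illustrative numbers only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept = {b0_hat:.2f}, slope = {b1_hat:.2f}")
print(f"SSR at OLS estimates = {ssr(x, y, b0_hat, b1_hat):.2f}")
```

Calling `ssr` with any other intercept or slope on the same data gives a larger value, which is what "least squares" refers to.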
In the formula for $\hat{\beta}_1$, the numerator corresponds to the sample
covariance of x and y (how x and y move together) and the denominator to the sample
variance of x (how much spread there is in x); the common factor 1/(n − 1) cancels.
If $\hat{\beta}_1 > 0$, then x and y are positively correlated in the sample, but
correlation is not causality. The estimate is causal (again) only if E[u | X] = 0.
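The link between the slope, the sample covariance, and the sample correlation can be written out explicitly. In the sketch below, the symbols $s_{xy}$, $s_x$, $s_y$, and $r_{xy}$ (sample covariance, sample standard deviations, and sample correlation) are notation introduced here for illustration:

```latex
\begin{align*}
\hat{\beta}_1
  &= \frac{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}
   = \frac{s_{xy}}{s_x^2} \\
r_{xy} &= \frac{s_{xy}}{s_x \, s_y}
  \quad\Longrightarrow\quad
  \hat{\beta}_1 = r_{xy} \cdot \frac{s_y}{s_x}
\end{align*}
```

Because $s_y / s_x$ is positive, the slope estimate always has the same sign as the sample correlation.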
The key difference between covariance and correlation is that covariance is
measured in the units of the variables (units of x times units of y), while
correlation is unit-free and always lies between -1 and 1.
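The unit dependence can be seen by rescaling one variable. The sketch below uses simulated heights and weights; the variable names and all numbers are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility

# Hypothetical data: height in metres and weight in kilograms.
height_m = rng.normal(1.75, 0.10, size=500)
weight_kg = 70 + 50 * (height_m - 1.75) + rng.normal(0, 5, size=500)

# Same heights, expressed in centimetres instead of metres.
height_cm = 100 * height_m

cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(f"covariance  (m, kg) : {cov_m:.4f}")
print(f"covariance  (cm, kg): {cov_cm:.4f}")
print(f"correlation (m, kg) : {corr_m:.4f}")
print(f"correlation (cm, kg): {corr_cm:.4f}")
```

Switching from metres to centimetres multiplies the covariance by 100 but leaves the correlation unchanged, which is why correlation is easier to compare across variables.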