Homework assignment for April 2: 5512 - Empirical Research and Analysis I
Felix Moosbauer 01628856
1.)
Look at the data visualization slides and background information on learn@wu. Watch this YouTube
video about getting started in Stata: https://www.youtube.com/watch?v=YAVq99iUTTI
Summarize 3 key learnings
It is especially useful to be able to create labels for variables and to assign meanings to
zeros and ones (value labels).
I also like being able to create new variables quickly using functions in the expression builder.
The many different ways to create graphs were interesting as well.
2.)
Describe and interpret the graphs above. What can you learn from the graphs? Can you derive any
policy implication? Explain explicitly what you cannot learn (causation?) from the graph. Choose one
of the relations displayed above, roughly guess the correlation coefficient (especially whether it is
positive or negative) and explain how to interpret it.
Graph A shows obesity prevalence at different levels of logarithmic income. Changes over time are
indicated by differently coloured dots and lines. Graph B shows the same for diabetes instead of
obesity. Graph C shows the annual rate of change for both obesity and diabetes by absolute
income.
We can see a clear link between high obesity/diabetes prevalence and low income, and this link
becomes stronger every year. Furthermore, overall obesity and diabetes rates are rising in
general.
Policies:
Potential policies to counteract this development would be to tax unhealthy food to make it more
expensive and less attractive for low-income groups. Food stamps that specifically target healthy
food could also be introduced.
To fight the rise in obesity and diabetes in general, more awareness measures and education should
be implemented in places such as schools.
What you cannot learn:
We cannot infer any causation from the graph; we only know that there is a correlation. We do
not know whether obesity is caused by low income or vice versa.
Guessing the correlation coefficient:
For A, obesity prevalence in 2015: estimated correlation -0.35. The coefficient is negative, as obesity
prevalence decreases as income increases along the X axis. A value of -0.35 indicates a moderate
negative correlation. For 1990 the correlation would be considerably weaker (closer to zero), at
ca. -0.15.
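To illustrate how the sign and size of such a coefficient come about, here is a minimal sketch of the Pearson correlation in Python. The numbers are hypothetical and were not read off Graph A; the names log_income and obesity_pct are assumptions made only for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: negative when high x goes with low y.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (NOT taken from the graphs).
log_income = [9.0, 9.5, 10.0, 10.5, 11.0, 11.5]
obesity_pct = [34.0, 36.0, 31.0, 33.0, 29.0, 30.0]

r = pearson_r(log_income, obesity_pct)
print(round(r, 2))
```

A negative result means that higher income goes along with lower obesity prevalence in these made-up data. The closer the value is to -1, the more tightly the points follow a falling line; it still says nothing about which variable causes the other.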
3.)
Does e-sport performance improve by watching YouTube? To investigate, 100 Fortnite players with at
least two years’ experience were questioned. They indicated how often they watch YouTube and their
player ranking. It was found that players who watch a lot of YouTube videos are ranked worse than
players who do not often watch YouTube videos.
a) Which empirical method is used to answer the research question?
A survey (questioning). Most likely an online questionnaire was used to question the players.
b) What are possible explanations for the relationship between watching YouTube and player’s
performance?
Watching YouTube videos takes time away from playing the game and is a less time-efficient way of
learning.
Watching YouTube videos might indicate a more casual approach to gaming and therefore slower
improvement.
c) What is the measurement scale for the variable “number of YouTube videos watched”? What is
the difference to the other variable in this example?
The measurement scale for “number of YouTube videos watched” is metric, as it is a count with a
true zero and meaningful differences between values. “Player ranking”, however, is an ordinal
variable, as it only expresses the placement relative to the other players.
d) Formulate two different statistical hypotheses to the corresponding research question and name
possible corresponding statistical methods.
Two-sided
H0: Watching YouTube videos has no effect on player ranking (A – B = 0)
H1: Watching YouTube videos affects player ranking (A – B != 0)
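Because player ranking is ordinal, one possible method for the two-sided hypothesis is a rank correlation such as Spearman's rho. The following is only a sketch under assumptions: the data are hypothetical, the names videos_per_week and rank_position are invented for this example, and rank position 1 is assumed to be the best rank.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical survey answers for illustration only.
videos_per_week = [2, 5, 8, 12, 15, 20, 25, 30]
rank_position = [3, 1, 4, 2, 6, 5, 8, 7]  # assumed: 1 = best rank

rho = spearman_rho(videos_per_week, rank_position)
print(round(rho, 2))
```

With rank position 1 as the best rank, a positive rho means that players who watch more videos tend to be ranked worse. Whether the value differs significantly from zero would still have to be tested.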
One-sided
H0: Watching YouTube videos does not improve player ranking (A <= B)