
Summary DS Research Methods (JBM020) 2020/2021

Pages: 45
Uploaded on: 16-08-2021
Written in: 2021/2022

This document is an exhaustive summary of all the material provided in the 2020/2021 Data Science Research Methods course. It covers the theory from the books Experimental Design (Berger et al., 2018) and Mostly Harmless Econometrics (Angrist et al., 2009) in depth, as well as the theory given in the lectures. The 45-page document also contains examples and quiz questions with worked-out solutions to help you pass the exam!


Lieve Göbbels
DS Research Methods (JBM020)
Semester 2, 2020-2021



Data Science Research Methods
Scientific Method and Experimentation
The scientific method
Experimentation and experimental design
Important concepts
One-Factor Designs and the Analysis of Variance
One-Factor Designs
Analysis of Variance (ANOVA)
Sample Size Determination
Sample size determination
Normal distribution
Binomial distribution
ANOVA II - Power
One-way ANOVA and power
Effect size
Sample size determination
Multiple Comparisons
Multiple comparisons
Bonferroni correction
Fisher’s Least Significant Difference test (LSD)
Tukey’s Honest Significant Difference test (HSD)
Two-Factor Designs
Two-way ANOVA with replication
Two-factor with no replication and no interaction
Introduction to blocking
Full Factorial Designs
Full factorial designs
Estimating effects in 2 factor 2 level experiments
Three factors at two levels
Number and kinds of effects
Main effects with large interactions
Choosing levels of factors when measured along continuum
Errors of estimates in full factorial designs
Fractional Factorial Designs
Blocking in full factorial designs II
Fractional factorial designs
Analysis of fractional factorial designs
Response Surface Optimization
Response Surface Optimization
Optimization steps
Regression models
Step 2: Improvement
Step 3: Determination (Response Surface Designs)
Finding the optimum using CCD or BB estimates
Introduction to Econometrics for Data Scientists
Econometrics
Independence and correlation
Regressions
Causality and Selection
Causality formalized
Average Treatment Effect (ATE)
Average Treatment effect on Treated (ATT)
Selection (bias)
Random assignment
Potential problems with experiments
IV estimation
Selection on Observables and Matching
Matching estimators
Some recaps
Selection on observables
Matching
Different methods
Differences-in-Differences Estimation
Differences-in-differences estimation
Implementation
Testing the parallel trends assumption
Group-specific trends and dynamic effects
More pre-periods
Compositional changes
Generalization: synthetic control
Regression Discontinuity Design
Regression Discontinuity Design (RDD)
Sharp RDD
Fuzzy RDD
Specification testing
Quiz Questions and Solutions
Quiz questions and solutions

Scientific Method and Experimentation
In short:
• The scientific method
• Experimentation and experimental design
• Important concepts


The scientific method
There are three important goals of data science (and beyond):
1. description: provide insight into past events;
2. prediction: provide insight into a (possible) future;
3. explanation/prescription: advise on possible outcomes.

Basic elements of the scientific method
1. formulate (research) question;
2. perform background research;
3. formulate hypothesis;
4. determine logical consequence of hypothesis;
5. collect observations (conduct experiment);
6. test truth of hypothesis by analyzing observations (statistics);
7. report results;
8. if the hypothesis is not confirmed, go back to step 2.
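
Steps 5 and 6 above can be illustrated with a minimal sketch. It assumes a hypothesis of the form "the treatment changes the mean outcome" and checks it with an exact two-sided permutation test, which is just one possible choice of test and not necessarily the one used in the course. The function name and the observations are hypothetical and made up for illustration only.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test for a difference in means.

    Enumerates every way of splitting the pooled observations into two
    groups of the original sizes and counts how often the absolute
    difference in means is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical observations (step 5), invented for this sketch.
control = [10.1, 9.8, 10.3, 10.0]
treated = [11.2, 11.0, 11.5, 10.9]

# Step 6: analyze the observations.
p_value = exact_permutation_p_value(control, treated)
print(f"p-value: {p_value:.4f}")
```

A small p-value would be read as evidence against "no difference between the groups". With only four observations per group the test enumerates just 70 splits, so this approach is only practical for small samples.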

Some of these steps can be linked to Six Sigma’s DMAIC method (Define, Measure, Analyze,
Improve, Control):
• step 1 can be linked to the Define phase;
• step 4 can be linked to the Measure phase;
• step 5 can be linked to the Analyze phase.
The Improve and Control phases thus have no direct counterpart. The scientific method is characterized
by its iterative nature.




Experimentation and experimental design
An experiment is an investigation in which the researcher selects the values (levels) of one or more
input (independent) variables and observes the values of the output (dependent) variables. The
purpose is to gain insight into the relationship between the dependent and independent variables, which is
then often used to optimize the underlying process.
An experimental design is then the aggregation of independent variables, the set of amounts,
settings or magnitudes (levels) of each independent variable, and the combinations of these levels.
So, the core of experimental design is to answer the three-part question:
• which factors should we study?
• how should the levels of these factors vary?
• in what way should these levels be combined?
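
As a minimal sketch of this three-part question, the snippet below picks two factors, gives each a set of levels, and combines the levels in every possible way (a full factorial combination, which is only one of several ways to combine levels). The factor names and level values are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical factors (independent variables) and their levels.
factors = {
    "temperature_c": [150, 180],
    "baking_time_min": [20, 30, 40],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of runs."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


design = full_factorial(factors)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

With two levels for the first factor and three for the second, the design contains 2 × 3 = 6 runs, one for each combination of levels.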
Sometimes, for example when the analysis is ex post facto (after the data has already been collected),
the levels of the independent variables cannot be specified, because they are already given. Then,

Document information

Summarized whole book: No
Chapters summarized: Ch 2, 3, 4, 6, 9, 10, 11, 16
File last updated on: August 17, 2021
Type: Summary
