Lecture notes

IRO - Statistics I

Pages: 38
Uploaded on: 16-02-2025
Written in: 2022/2023

This course is intended to introduce students without prior training to the use of quantitative methods in the social sciences. The course introduces students to basic statistics: how to summarise large amounts of information efficiently and how to draw basic inferences. As political scientists, we are interested in answering questions such as “What is the association between country size and civil war?”. To answer such questions, we require 1) data and 2) methods and techniques to process such data. In this course, you will learn how to describe data, apply and interpret the results from simple statistical tests as well as familiarise yourself with the statistical software R.


Statistics I
STAT

Grade
Three assignments of the workgroups accounting for 20%
• Participation accounting for 10%

Exam – 70%
• Open and closed questions
• Minimum grade of five

Lectures – week 1

Variables and levels of measurement
Variable – any characteristic, number, or quantity that can be measured
and can differ across entities or across time

Variables have different scales or levels of measurement

Levels of measurement – the nature of the information carried by the
values assigned to variables

Types of levels
Nominal variable – type of categorical variable that includes two or more
mutually exclusive categories with no natural order

Ordinal variable – type of categorical variable with a clear ordering of the
values
• E.g. low <-> high, little <-> much, small <-> large
• Distance between values is not the same across the levels
• Allows relative comparison (higher or lower)

Numerical variable – a variable where the measurement is typically
represented by numbers

Continuous variable – a continuous numeric variable can be measured to
any level of precision
• Alternative levels of measurement – two forms of continuous variables
(Stanley Smith Stevens):
- Interval – numerical variable, but the zero point is arbitrary/meaningless
- Ratio – like interval, but with a meaningful zero

Discrete variables – cannot be measured to any level of precision, only
certain, countable values (usually whole numbers) are possible

Explanatory & response variables
Explanatory (independent) variable
• Cause
• Often written as X

Response (dependent) variable
• Outcome
• Often written as Y

Organizing variables
Common format of dataset
- Each column = particular variable
- Each row = given record of the data set in question
- Each cell = one observation on one element in our dataset
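
As a minimal sketch of this layout, the rows, columns, and cells can be written out in Python. The data set below is hypothetical: the country labels and values are invented purely for illustration.

```python
# Hypothetical toy data set: values are invented for illustration only.
# Each dict is one row (record); each key is one column (variable).
rows = [
    {"country": "A", "population_millions": 5.0, "civil_war": False},
    {"country": "B", "population_millions": 40.0, "civil_war": True},
    {"country": "C", "population_millions": 12.5, "civil_war": False},
]

# A column: the same variable collected across all records.
population = [row["population_millions"] for row in rows]

# A cell: one observation on one element (here, the population of country B).
cell = rows[1]["population_millions"]

print(population)
print(cell)
```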

Measures of central tendency

Distribution
When we collect data, we can show how the values are distributed in
relation to other values

Frequency distribution – display of the pattern of frequencies of a variable
of a data set
• Shows all the possible values (or intervals) of the data and how often
they occur
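
A frequency distribution can be sketched in a few lines of Python with the standard library's `collections.Counter`. The survey responses below are made up for the example.

```python
from collections import Counter

# Invented ordinal responses, for illustration only.
responses = ["low", "high", "medium", "low", "low", "high"]

# Count how often each value occurs.
freq = Counter(responses)

# Print every observed value with its frequency, most frequent first.
for value, count in freq.most_common():
    print(value, count)
```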

Skewness and symmetry
There is an infinite number of possible distributions, e.g. symmetrical,
bimodal, multimodal




Asymmetrical distributions
• Negative (left) skew – mass concentrated on the right; left tail is longer
• Positive (right) skew – mass concentrated on the left; right tail is longer

How can we summarise/describe distributions of variables?
Option 1 – visualize data

Option 2 – calculate measures to summarise data

• Measure of central tendency – a value that describes a set of data by
identifying the central position within that set of data
• Measure of dispersion – how stretched or squeezed the distribution is

Level of measurement | Measures of central tendency
Nominal | Mode
Ordinal | Median, mode
Numeric | Mean, median, mode


Mode – the most frequent score in a data set
• There can be several modes

Median – middle score for a set of data that has been arranged in order of
magnitude
• Even number of scores -> by convention, add the two middle numbers
and divide the sum by two

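(Arithmetic) Mean – the sum of all values divided by the number of values:

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$

• Mean is sensitive to extreme values (outliers)
- If extreme values are in the data set, the median may be more useful
• Median – robust statistic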
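
The three measures, and the effect of an outlier on the mean, can be illustrated with Python's standard `statistics` module. This is a small sketch with invented scores, not part of the lecture material.

```python
import statistics

# Invented scores, for illustration only.
scores = [2, 3, 3, 4, 5, 6]

modes = statistics.multimode(scores)   # all most-frequent values
median = statistics.median(scores)     # even n: average of the two middle values
mean = statistics.mean(scores)         # sum of values divided by n

# Add one extreme value and recompute.
with_outlier = scores + [100]
median_outlier = statistics.median(with_outlier)
mean_outlier = statistics.mean(with_outlier)

print(modes, median, mean)
print(median_outlier, mean_outlier)
```

In this example the median moves from 3.5 to 4 when the outlier is added, while the mean jumps from about 3.8 to about 17.6, which is why the median is described as robust.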

Measures of dispersion

How stretched or squeezed is the distribution?
Level of measurement | Measure of dispersion
Nominal | No measure of dispersion possible
Ordinal | Range, inter-quartile range
Numeric | Range, inter-quartile range, variance/standard deviation

The range – the difference between the lowest and highest value
• Range = maximum – minimum

Range & interquartile range
We can split data into chunks (quantiles)

Many quantiles exist, but some are common:
• Percentiles: distribution is divided into 100 parts
• Deciles: distribution is divided into 10 parts
• Quintiles: distribution is divided into 5 parts
• Quartiles: distribution is divided into 4 parts

A common form of range – interquartile range (IQR)
• The IQR is the range of the middle 50% of the data
• Calculated by subtracting the 1st quartile from the 3rd quartile
- First quartile Q1 – median of the 50% smallest entries
- Third quartile Q3 – median of the 50% largest entries
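
The range and the IQR as defined above can be sketched in Python. The helper `quartiles_by_halves` is a hypothetical name introduced for this example. It assumes that, for an odd number of values, the middle value is left out of both halves. Statistical software (including R) may use other quartile conventions and give slightly different results.

```python
import statistics

def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # the 50% smallest entries
    upper = ordered[-half:]    # the 50% largest entries
    return statistics.median(lower), statistics.median(upper)

# Invented values, for illustration only.
data = [1, 3, 4, 6, 7, 9, 12, 15]

data_range = max(data) - min(data)   # range = maximum - minimum
q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1                        # spread of the middle 50% of the data

print(data_range, q1, q3, iqr)
```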

Variance and standard deviation
Problem – interquartile range uses only a selection of data (which makes it
robust against outliers)

Measure of spread using all data – deviance = $X_i - \bar{X}$ (difference
between each value and the mean)

Once we have the deviance of every observation, we can calculate the sum of
all deviances (total deviance):

$\text{Total deviance} = \sum_{i=1}^{n} (X_i - \bar{X})$

Problem – total deviance is always zero (negative and positive deviations
cancel out)
• Not a useful measure of spread
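
A quick numerical check, using a small invented data set, shows how the positive and negative deviances cancel:

```python
# Invented values, for illustration only.
data = [2, 4, 6, 8]

mean = sum(data) / len(data)            # 5.0
deviances = [x - mean for x in data]    # difference between each value and the mean
total_deviance = sum(deviances)

print(deviances)
print(total_deviance)
```

With other data the floating-point sum may differ from zero by a tiny rounding error, but mathematically the total is always exactly zero.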

Instead we calculate the sum of squared errors (SS):

$SS = \sum_{i=1}^{n} (X_i - \bar{X})^2$

Two steps
1) Square the deviances (differences between the values and the mean)
2) Add the squared deviances
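
The two steps can be followed literally in Python. The data below are invented for the example.

```python
# Invented values, for illustration only.
data = [2, 4, 6, 8]
mean = sum(data) / len(data)

# Step 1: square each deviance (value minus mean).
squared_deviances = [(x - mean) ** 2 for x in data]

# Step 2: add the squared deviances.
ss = sum(squared_deviances)

print(squared_deviances)
print(ss)
```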

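Variance (s²)
Problem – the sum of squared errors increases as n (the number of
observations) increases
- Not a useful measure for comparing data sets of different sizes

Solution – divide the sum of squared errors by the number of observations (n)
minus 1:

$s^2 = \frac{SS}{n - 1} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$

* Dividing by n – 1 instead of n is Bessel's correction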
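
As a sketch, the variance can be computed by hand and compared with `statistics.variance` from Python's standard library, which also divides by n – 1. The data are invented.

```python
import math
import statistics

# Invented values, for illustration only.
data = [2, 4, 6, 8]
n = len(data)
mean = sum(data) / n

# Sum of squared errors, then divide by n - 1 (Bessel's correction).
ss = sum((x - mean) ** 2 for x in data)
variance_by_hand = ss / (n - 1)

# The standard library's sample variance uses the same n - 1 denominator.
variance_stdlib = statistics.variance(data)

print(variance_by_hand, variance_stdlib)
print(math.isclose(variance_by_hand, variance_stdlib))
```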

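Standard deviation (s) – the square root of the variance:

$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

Larger standard deviation – bigger spread/dispersion around the mean

The standard deviation depends on the scale (units) of the variable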
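
To illustrate the dependence on scale, the sketch below computes the standard deviation of the same invented heights, once in metres and once in centimetres. `statistics.stdev` is the sample standard deviation (n – 1 denominator) from Python's standard library.

```python
import math
import statistics

# Invented heights, for illustration only.
heights_m = [1.60, 1.70, 1.80]
heights_cm = [h * 100 for h in heights_m]   # same data, different unit

sd_m = statistics.stdev(heights_m)
sd_cm = statistics.stdev(heights_cm)

# Rescaling the variable by 100 rescales the standard deviation by 100.
print(sd_m, sd_cm)
print(math.isclose(sd_cm, 100 * sd_m, rel_tol=1e-9))
```

The spread of the heights has not changed, only the unit, so standard deviations are comparable only between variables measured on the same scale.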

Document information
Lecturer: Tim Mickler
Contains: all lectures

Seller: semwaltman, Leiden University College The Hague