STA1003: Fundamental Statistics
Chi-Square Goodness of Fit Test
Overview:
What is the Goodness-of-Fit test;
When is a Goodness-of-Fit hypothesis test used;
What steps are involved in conducting a Goodness-of-Fit test;
How to perform a Goodness-of-Fit test in SPSS.
Introduction to Statistical Inference:
When analyzing data, we cannot simply accept the sample mean or sample
proportion as the population (TRUE) mean or proportion.
A statistic (sample mean, sample proportion, or any other statistic)
computed from different samples gives different answers because of
sampling variability.
So we have to perform statistical inference:
o Confidence Interval: when we want to estimate a population
parameter
o Hypothesis Testing: when we want to assess the evidence provided
by the data in favor of some claim about the population
Chi-Square Test of Independence:
In the previous lecture we performed a hypothesis test to see whether two
categorical variables were associated.
Differences between the conditional distributions of one variable across
the categories of the other suggest an association between the two
categorical variables.
But are the differences due to chance (sampling error), or do they reflect
a real difference?
Use a chi-square test of independence to decide.
Goodness-of-fit Tests:
Suppose the student population at USQ is 70% online and 30% on-campus
(that is, the population proportions are 0.70 online and 0.30 on-campus).
A random sample of 100 students yields 63 online students and 37
on-campus students. Can we conclude that the sample is (random and)
representative of the population?
That is, how well do the data fit the probability model?
A hypothesis test that addresses this question is called a test of
Goodness-of-Fit.
It is also called the Chi-Square Goodness-of-Fit test.
The idea behind the Goodness-of-Fit test is to see whether the sample
comes from a population with the claimed distribution (does the sample
"fit" the population distribution?).
Another way of looking at this is to ask whether the observed frequency
distribution fits a specific pattern.
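As a rough illustration of the USQ example above, the sketch below compares
the observed counts (63 online, 37 on-campus) with the counts expected under
the claimed proportions (70% and 30%) using the usual chi-square statistic,
the sum of (observed - expected)^2 / expected. The variable names and the use
of Python are assumptions made for illustration only; the course itself
carries out the test in SPSS.

```python
import math

# Observed counts from the random sample of 100 students.
observed = {"online": 63, "on-campus": 37}
# Claimed population proportions (the probability model being tested).
claimed = {"online": 0.70, "on-campus": 0.30}

n = sum(observed.values())
# Expected count in each category if the claimed proportions are true.
expected = {group: n * p for group, p in claimed.items()}

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1  # number of categories minus 1

# With df = 1, the upper-tail chi-square probability equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
```

Under these assumptions the statistic is about 2.33 on 1 degree of freedom,
which is below the 5% critical value of 3.841 (p-value about 0.13), so the
sample gives no strong evidence against the claimed 70/30 split.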
Employers at a large financial firm wish to know on which days of the
five-day work week employees are absent. The firm believes that absences
are spread equally across the week.
Suppose a random sample of 60 managers was asked on which day of
the week they had the highest number of employee absences:
o Monday = 15
o Tuesday = 12
o Wednesday = 9
o Thursday = 9
o Friday = 15
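The absence counts above can be checked against the "equally likely on every
day" belief in the same way. The sketch below is a minimal illustration, not
the course's SPSS procedure; the variable names are assumptions. With 60
responses and five days, each day is expected to receive 60 / 5 = 12
responses if the firm's belief holds.

```python
import math

# Responses from the 60 managers, by day with the most absences.
observed = {
    "Monday": 15,
    "Tuesday": 12,
    "Wednesday": 9,
    "Thursday": 9,
    "Friday": 15,
}

n = sum(observed.values())
# Under the firm's belief, every day has the same expected count.
expected = n / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1  # five categories, so 4 degrees of freedom

# With df = 4, the upper-tail chi-square probability has the closed form
# exp(-x / 2) * (1 + x / 2).
half = chi_square / 2
p_value = math.exp(-half) * (1 + half)

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
```

Here the statistic works out to 3.0 on 4 degrees of freedom, well below the
5% critical value of 9.488 (p-value about 0.56), so these data do not
contradict the belief that absences are spread equally across the week.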