DATA (GEA 1000) COMPLETE
SUMMARY NOTES TO HELP YOU SCORE
AND A UNIVERSITY OF SINGAPORE.
Chapter 1
Definitions
Population: The entire group (of individuals or objects) that we want to know
something about
Population of interest: A group about which the researcher is interested in
drawing conclusions in the study
- Such as ‘the population of Asia’, ‘the population of
Singapore’ etc.
Population parameter: A numerical fact about a population
- These are constants
Sample: A portion of the population selected for the study
- Used when census data is not readily available
- Often preferred over a census, as a sample is less costly to
administer and its data can be collected and processed faster
Census: An attempt to reach out to the entire population of interest
- Note that one might not achieve a 100% response rate
Estimate: An inference about a population parameter, based on information
obtained from a sample
Sampling frame: The ‘source material’ from which the sample is drawn
- May not cover the whole population of interest, or may contain
units that are not in the population of interest
Note: To fulfill the generalizability criterion, the sampling frame
should be equal to or larger than the target population. If the
sampling frame fails to cover any member of the target population,
the results cannot be fully generalized to the target population, as
some of its members have been left out.
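The coverage condition in the note on the sampling frame can be checked with plain set operations. The following is a minimal sketch only: the function name `frame_coverage` and the example names are hypothetical and are not part of the course material.

```python
def frame_coverage(target_population, sampling_frame):
    """Compare a sampling frame against the target population (illustrative helper)."""
    target = set(target_population)
    frame = set(sampling_frame)
    return {
        # Members of the target population that the frame fails to cover
        "missing_from_frame": target - frame,
        # Units in the frame that are not in the population of interest
        "not_in_population": frame - target,
        # True only if every member of the target population is in the frame
        "covers_population": target <= frame,
    }


# Hypothetical example: the frame leaves out "Dev" and wrongly includes "Eve"
target = {"Ann", "Ben", "Chloe", "Dev"}
frame = {"Ann", "Ben", "Chloe", "Eve"}
report = frame_coverage(target, frame)
print(report)
```

Here `covers_population` is False because one member of the target population is left out of the frame, so results from this frame could not be generalized fully to the target population.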
Types of research questions:
1. Making an estimate about the population
- What is the average number of hours that students study each week?
- What proportion of all Singapore students is enrolled in a university?
2. Test a claim about the population
- Does the majority of students qualify for student loans?
- Is the average course load for a university student greater than 20 units?
3. Compare 2 sub-populations/Investigate a relationship between 2 variables in the population
- In university X, do female students have a higher GPA score than male students? (1)
- Does drinking coffee help students pass the math exam? (2)
Sampling methods - Probability sampling
● The selection process is via a known/randomized mechanism in which every unit in the
population has a non-zero and known probability of being selected
● Eliminates biases associated with human selection
1. Simple random sampling
- Units are selected randomly from the sampling frame by a random number generator
- Sample results do not change haphazardly from sample to sample; the variability
between samples is due to chance
Advantages
- Sample tends to be a good representation of the population
Disadvantages
- Subject to non-response (selected individuals may choose to opt out of the study)
- Possibly limited access to information, as the selected individuals may be
located in different geographical locations
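Simple random sampling can be sketched with Python's standard `random` module. This is an illustration under stated assumptions: the sampling frame of 1,000 student IDs and the weekly study hours are made-up data, and `simple_random_sample` is a hypothetical helper name.

```python
import random
import statistics


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely."""
    rng = random.Random(seed)  # seeded generator so the draw is repeatable
    return rng.sample(list(frame), n)  # sampling without replacement


# Hypothetical sampling frame: 1,000 student IDs with made-up weekly study hours
data_rng = random.Random(0)
student_ids = list(range(1000))
study_hours = {sid: data_rng.randint(5, 40) for sid in student_ids}

# Population parameter: a constant, normally unknown without a census
parameter = statistics.mean(list(study_hours.values()))

# Estimate: computed from one simple random sample of 50 students
sample_ids = simple_random_sample(student_ids, n=50, seed=1)
estimate = statistics.mean([study_hours[sid] for sid in sample_ids])

print(f"parameter = {parameter:.2f}, estimate = {estimate:.2f}")
```

Changing the seed gives a different sample and a slightly different estimate, while the parameter stays fixed. That difference between samples is the chance variability described above.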
2. Systematic sampling
- A method of selecting units from a list through the application of selection interval K, so
that every Kth unit on the list, following a random start, is included in the sample
Advantages
- More straightforward and simpler selection process than simple random sampling
- Do not need to know the exact population size at the planning stage
Disadvantages
- May not be representative of the population if the sampling list is non-random
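Systematic sampling with a selection interval K and a random start can be sketched as follows. The function name `systematic_sample` and the list of 20 units are hypothetical. The sketch uses `itertools.islice`, so it also works on a list whose length is not known in advance.

```python
import random
from itertools import islice


def systematic_sample(units, k, seed=None):
    """Select every k-th unit after a random start within the first k units."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start: an index from 0 to k - 1
    # Take units at positions start, start + k, start + 2k, ...
    return list(islice(units, start, None, k))


# Hypothetical list of 20 units with selection interval K = 5
units = [f"unit_{i}" for i in range(1, 21)]
sample = systematic_sample(units, k=5, seed=42)
print(sample)
```

With 20 units and K = 5, every possible random start yields a sample of 4 units spaced 5 positions apart. If the list itself follows a pattern with the same period as K, the sample may not represent the population well.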
3. Stratified sampling