Research in Biomedical Sciences
- Refresh your knowledge of Introduction to Biomedical Sciences
- Set you off to a good start in Research in Biomedical Sciences
METHODOLOGY
Variables
observable or hypothetical events that can change and whose changes can be measured in some way
- Independent variables - Explanatory variables
- Dependent variables - Response variables
- Extraneous variables
- Confounding variables
Independent vs dependent
Study the effect of X on Y
X: independent variable, manipulated variable controlled by the experimenter, predictor
Y: dependent variable, observed effect, outcome
Extraneous variable
- variables that are not of interest to the researcher but that might influence the variables of
interest if not controlled
- variables that provide an alternative explanation
If controlled (i.e. kept constant or manipulated): good
If not controlled: the extraneous variable becomes a confounding variable
Levels of measurement
Every variable is measured at a particular level (scale) of measurement
Categorical: no meaningful interpretation of differences
- Nominal: variable represents a category without logical order
Dichotomous: if two categories
- Ordinal: ranked variable: represents a category with a specific order or rank position
Quantitative: meaningful interpretation of differences
- Discrete: counts; only distinct, countable values (whole numbers)
- Continuous: scale variable that can take infinitely many values within a range
RESEARCH DESIGN
Research question
- Causal effect or association?
- Whether one causes the other or association or relationship
Dependent variable(s)
- Measurement
Type (nominal, ordinal, discrete, continuous)
How many?
Independent variable(s)
- Measurement
Type (nominal, ordinal, discrete, continuous)
How many?
Manipulation
- Compare groups or conditions? How many?
- Are measurements/manipulations
dependent/within-subjects/paired: comparison within the same subjects, or
independent/between-subjects/not paired: comparison between different subjects?
Observational
- only observation no manipulation
Cross-sectional: all measurements at the same time
Case control: outcome is measured at the current time point, possible predictors
looked for in the past
Cohort/prospective: participants are followed over time; the difference between measurements at two
(or more) time points is assessed
Experimental
- variables manipulated
Randomised controlled design: participants are randomly assigned to groups; the effect of the
manipulated variable is assessed by comparing the groups
Cross-over design: all participants undergo all conditions, so each participant is measured at every
level of the independent variable
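The random assignment step of a randomised design can be sketched in a few lines of Python. This is only an illustration: the function name `assign_groups`, the group labels and the participant IDs are hypothetical, and real trials often use more elaborate schemes (e.g. block or stratified randomisation).

```python
import random


def assign_groups(participants, groups=("treatment", "control"), seed=None):
    """Randomly assign participants to groups of (nearly) equal size."""
    rng = random.Random(seed)       # a seed makes the allocation reproducible
    shuffled = list(participants)   # copy, so the input list is not modified
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    # Deal the shuffled participants out to the groups in turn.
    for i, person in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(person)
    return assignment


participants = [f"P{i:02d}" for i in range(1, 11)]  # hypothetical IDs P01..P10
allocation = assign_groups(participants, seed=1)
for group, members in allocation.items():
    print(group, sorted(members))
```

Because chance alone decides who ends up in which group, extraneous variables are expected to be spread evenly over the groups on average.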
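Population and representative sample
Population:
- All members of a defined group
- The group to which we aim to generalise
Sample: subset of the population, the limited group in which we observe data
- The sample has to be representative: it should reflect the population in relative terms (similar
proportions of the relevant characteristics)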
DESCRIPTIVE STATISTICS
Goal: to present, organize and summarize data observed in the sample
- Measures of frequency
frequency and proportion
- Measures of central tendency: most central or typical value of a data set
Mean
Median
Mode
- Measures of dispersion/variability: the extent to which all the values in a data set vary around
the central or typical value
range, interquartile range
variance, standard deviation
Frequency
- Frequency: how often each value in the data set occurs
- Proportion: how often each value in the data set occurs relative to the total number of data points
proportion = frequency / total number of data points
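As a small illustration of frequency and proportion, the sketch below counts a nominal variable with Python's standard library. The blood type data are made up for the example.

```python
from collections import Counter

blood_types = ["A", "O", "B", "O", "A", "O", "AB", "O"]  # hypothetical data

frequency = Counter(blood_types)   # how often each value occurs
n = len(blood_types)               # total number of data points
proportion = {value: count / n for value, count in frequency.items()}

for value in sorted(frequency):
    print(f"{value}: frequency={frequency[value]}, proportion={proportion[value]:.3f}")
```

The proportions always sum to 1, which is a quick check that every data point was counted once.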
Central tendency
Mean: the average value
Advantages/disadvantages
- Use with quantitative data
- Takes account of the exact distances between values in the data set
- Powerful statistic used in estimating population parameters and in inferential statistics
- Sensitive to outliers (extreme values in the data set)
Median: the value at the median location
- the value at the middle of the sample if scores ranked from lowest to highest
- If odd number of data: median = value at median location
- If even number of data: median = mean of the two values adjacent to the median location
Advantages/disadvantages
- Use with ordinal data
- Takes account only of the position of ranked values in the data set
- Unaffected by outliers (extreme values in the data set)
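The median rules above, and the different sensitivity of mean and median to outliers, can be checked with a short sketch. The helper `median_by_location` and the scores are hypothetical; the median location is taken as (n + 1) / 2 in the ranked data, which is the usual convention.

```python
import statistics


def median_by_location(values):
    """Median via the ranked data: middle value, or mean of the two middle values."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the median location (n + 1) / 2 is a whole number.
        return ranked[mid]
    # Even n: the median location falls between two values; average them.
    return (ranked[mid - 1] + ranked[mid]) / 2


scores = [4, 5, 5, 6, 7]        # hypothetical scores
with_outlier = scores + [40]    # same data plus one extreme value

for label, data in (("without outlier", scores), ("with outlier", with_outlier)):
    print(label, "mean =", round(statistics.mean(data), 2),
          "median =", median_by_location(data))
```

Adding the single extreme value roughly doubles the mean, while the median only moves from 5 to 5.5.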
Mode: the most frequently occurring value in a data set
- If there is no most frequent value (all values appear once): no mode
- 2 modes: data are bimodal
Advantages/disadvantages
- Typically used with nominal data
- Does not take account of the exact distances between values in the data set, nor the rank
order
- Unaffected by outliers
- Uninformative in small data sets
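A minimal sketch of the mode rules above, using a hypothetical helper `modes` that returns every most frequent value. It follows the convention of these notes that a data set in which all values appear once has no mode; other conventions exist.

```python
from collections import Counter


def modes(values):
    """Return all modes (sorted), or an empty list if every value appears once."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(values) > 1:
        return []  # all values appear once: no mode
    return sorted(v for v, c in counts.items() if c == top)


print(modes(["A", "O", "B", "O", "A", "O"]))  # one mode
print(modes([2, 3, 3, 5, 5, 7]))              # two modes: bimodal
print(modes([1, 2, 3]))                       # no mode
```

The first call works on nominal data, where the mean and the median are not defined.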
Dispersion/Variability
Range: the difference between the highest and lowest scores of the sample
Advantages/disadvantages
- simplest, crudest measure
- sensitive to outliers
- unrepresentative of any features of the distribution of values between the extremes
Interquartile range (IQR): the distance between the two values that cut off the bottom 25% of values
(Q1) and the top 25% of values (Q3)
- Q1: 25th percentile: median of the values below the median
- Q3: 75th percentile: median of the values above the median
- IQR = Q3 - Q1
Advantages/disadvantages
- unaffected by outliers
- most useful measure for ordinal-level data
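The range and the IQR can be computed with the "median of the halves" definition of Q1 and Q3 given above. The sketch below is one possible reading: the helper name and the data are hypothetical, and leaving the median itself out of both halves when n is odd is an assumption. Statistical software often uses interpolation-based quartiles, which can give slightly different values.

```python
import statistics


def quartiles_median_of_halves(values):
    """Q1 and Q3 as the medians of the values below and above the median."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]
    # For odd n the median itself is left out of both halves (assumption).
    upper = ranked[half:] if n % 2 == 0 else ranked[half + 1:]
    return statistics.median(lower), statistics.median(upper)


data = [2, 4, 4, 5, 7, 8, 9, 11, 30]  # hypothetical data with one outlier (30)

data_range = max(data) - min(data)
q1, q3 = quartiles_median_of_halves(data)
iqr = q3 - q1
print("range =", data_range, "Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```

The outlier determines the range entirely, whereas the IQR only depends on the middle half of the ranked values.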
Variance: an estimate of the average squared amount by which the scores in the sample deviate from
the mean score
Standard deviation (SD): the square root of the variance; an estimate of the average amount by which
the scores in the sample deviate from the mean score
Advantages/disadvantages
- take account of all values in the data set
- most sensitive measures of dispersion, but also sensitive to outliers
- measures of dispersion around the mean (for SD: on the same scale as the variable)
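The relation between variance and SD can be made concrete with a short calculation. The sketch assumes the usual sample formulas, which divide the sum of squared deviations by n - 1 (the notes do not state the denominator), and uses made-up scores.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

n = len(scores)
mean = sum(scores) / n
squared_deviations = [(x - mean) ** 2 for x in scores]

# Sample variance: sum of squared deviations divided by n - 1 (assumption).
variance = sum(squared_deviations) / (n - 1)
# SD: square root of the variance, back on the scale of the variable.
sd = math.sqrt(variance)

print(f"mean = {mean}, variance = {variance:.3f}, SD = {sd:.3f}")
# The standard library gives the same results:
print(statistics.variance(scores), statistics.stdev(scores))
```

The variance is expressed in squared units of the variable, which is why the SD is usually the value reported next to the mean.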