Science & Society
probability - answers a calculation tool for the likelihood of future outcomes
statistics - answers the observation and analysis of past outcomes
sample space - answers all possible outcomes
event - answers a specific set of possible outcomes
discrete - answers takes countable (often finite) values
continuous - answers can take any value within an interval
frequentist - answers measure of occurrences
subjective - answers measure one's belief
random sample - answers a sample in which every individual in the population has an equal chance of being chosen
population - answers collection to be analyzed
experiment - answers planned activity that yields data
parameter - answers number that describes a population
statistic - answers number computed from sample
statistical variable - answers measurable characteristic
statistical model - answers an equation that shows relationship of variables
statistical inference - answers using sample information to learn about population
Descriptive Statistics - answers statistics dealing with organizing and summarizing data
Inferential Statistics - answers making predictions about unknown population
parameters using sample statistics
Constant - answers does not change in repeated trials over time
Variable - answers measurements of some characteristic vary from trial to trial
Qualitative - answers measurements vary in kind or name but not in degree, meaning
they cannot be ranked
Quantitative - answers measurable data that is discrete or continuous
nominal - answers named
ordinal - answers ordered
bias - answers some individuals are systematically favored over others
Simple random sampling - answers list of all possible individuals in the population is
made and n subjects are chosen in such a way that every set of n subjects has an equal
chance of being selected
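As a sketch of the idea (the population here is made up for illustration), drawing a simple random sample is exactly what Python's `random.sample` does: every set of n subjects is equally likely to be chosen.

```python
import random

# Hypothetical population: a list of 100 labelled individuals.
population = [f"subject_{i}" for i in range(100)]

# Simple random sample of n = 10, drawn without replacement:
# every set of 10 subjects has the same chance of selection.
random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=10)

print(sample)
```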
Stratified random sampling - answers population is naturally divided into two or more
groups of similar subjects called strata
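A minimal sketch of stratified random sampling, with invented strata: a simple random sample is taken separately inside each stratum (here proportionally, about 10% of each group).

```python
import random

# Hypothetical strata: the population divides naturally into groups.
strata = {
    "freshmen": [f"fr_{i}" for i in range(50)],
    "seniors": [f"sr_{i}" for i in range(30)],
}

random.seed(1)
sample = []
for name, members in strata.items():
    # Simple random sample within each stratum, ~10% of its size.
    n = max(1, len(members) // 10)
    sample.extend(random.sample(members, k=n))
```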
Multistage random sampling - answers The population is divided into several clusters
and these are further divided into smaller sub-clusters
Haphazard sampling - answers This method involves selecting a sample by some
convenient mechanism that does not involve randomization
Volunteer Response Sampling - answers people volunteer to be a part of the study (e.g.,
telephone call-in polls, internet surveys)
Factor - answers explanatory variables that cause a change in the response
Levels - answers specific value of factor
Double-blind experiment - answers in addition to the experimental units, the people
who are conducting the experiment do not know to which group the experimental
units have been assigned
Confounding - answers the existence of some factor other than the
treatment that makes the treatment and control groups different
Observational study - answers No treatment is imposed (shows association not
causation)
Block - answers group of individuals that are known before the experiment to be similar
in some way that is expected to affect the response to the treatments
−2LL - answers the log-likelihood multiplied by minus 2. This version of the likelihood is
used in logistic regression.
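To make the −2LL statistic concrete, here is a sketch with made-up fitted probabilities and outcomes (not from the original source): the Bernoulli log-likelihood is summed and multiplied by −2.

```python
import math

# Hypothetical fitted probabilities from a logistic regression,
# paired with the observed 0/1 outcomes.
p = [0.9, 0.8, 0.3, 0.2]
y = [1, 1, 0, 0]

# Bernoulli log-likelihood: sum of y*log(p) + (1 - y)*log(1 - p).
ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
         for yi, pi in zip(y, p))

minus_2ll = -2 * ll  # the deviance-style statistic used in logistic regression
```

Because each log term is negative for probabilities below 1, −2LL is positive, and smaller values indicate a better-fitting model.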
α-level - answers the probability of making a Type I error (usually this value is 0.05).
Adjusted mean - answers in the context of analysis of covariance this is the value of the
group mean adjusted for the effect of the covariate.
Adjusted predicted value - answers a measure of the influence of a particular case of
data. It is the predicted value of a case from a model estimated without that case
included in the data. The value is calculated by re-estimating the model without the case
in question, then using this new model to predict the value of the excluded case. If a
case does not exert a large influence over the model then its predicted value should be
similar regardless of whether the model was estimated including or excluding that case.
The difference between the predicted value of a case from the model when that case
was included and the predicted value from the model when it was excluded is the DFFit.
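The leave-one-out procedure described above can be sketched for a simple linear regression with invented data (the `fit` helper and the data are assumptions for illustration, not part of the original source):

```python
# Adjusted predicted value and DFFit for simple linear regression.

def fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 30.0]  # the last case is an outlier

i = 5  # examine the influence of the last case
a_full, b_full = fit(x, y)
pred_full = a_full + b_full * x[i]

# Adjusted predicted value: re-estimate the model without case i,
# then use that model to predict case i.
a_red, b_red = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
pred_adjusted = a_red + b_red * x[i]

dffit = pred_full - pred_adjusted  # large => case i is influential
```

For a non-influential case, `dffit` would be near zero; here the outlier pulls its own prediction far from the leave-one-out value.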
Adjusted R2 - answers a measure of the loss of predictive power or shrinkage in
regression. The adjusted R2 tells us how much variance in the outcome would be
accounted for if the model had been derived from the population from which the sample
was taken.
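One common way to compute it (Wherry's formula, shown here with made-up values) adjusts R² for the number of predictors k and the sample size n:

```python
# Adjusted R^2 via Wherry's formula, with hypothetical values:
# R^2 from a model with k predictors fitted to n cases.
r2, n, k = 0.60, 50, 3

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

The adjustment always shrinks R² (more so with many predictors or few cases), which is why it estimates how much variance the model would explain in the population rather than in this particular sample.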
AIC (Akaike's information criterion) - answers a goodness-of-fit measure that is
corrected for model complexity. That just means that it takes account of how many
parameters have been estimated. It is not intrinsically interpretable, but can be
compared in different models to see how changing the model affects the fit. A small
value represents a better fit to the data.
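A sketch of the usual formula, AIC = 2k − 2LL, with invented log-likelihoods for two candidate models fitted to the same data:

```python
# AIC = 2k - 2*log-likelihood, where k is the number of estimated
# parameters. Hypothetical numbers for two models of the same data.
ll_simple, k_simple = -120.0, 3
ll_complex, k_complex = -118.5, 6

aic_simple = 2 * k_simple - 2 * ll_simple     # 246.0
aic_complex = 2 * k_complex - 2 * ll_complex  # 249.0
```

Even though the complex model has the higher log-likelihood, its extra parameters cost it the comparison: the simpler model has the smaller AIC and so the better penalized fit.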
AICC (Hurvich and Tsai's criterion) - answers a goodness-of-fit measure that is similar
to AIC but is designed for small samples. It is not intrinsically interpretable, but can be
compared in different models to see how changing the model affects the fit. A small
value represents a better fit to the data.
Alpha factoring - answers a method of factor analysis.
Alternative hypothesis - answers the prediction that there will be an effect (i.e., that your
experimental manipulation will have some effect or that certain variables will relate to
each other).
Analysis of covariance - answers a statistical procedure that uses the F-statistic to test
the overall fit of a linear model, adjusting for the effect that one or more covariates have
on the outcome variable. In experimental research this linear model tends to be defined
in terms of group means and the resulting ANOVA is therefore an overall test of whether
group means differ after the variance in the outcome variable explained by any
covariates has been removed.
Analysis of variance - answers a statistical procedure that uses the F-statistic to test
the overall fit of a linear model. In experimental research this linear model tends to be
defined in terms of group means, and the resulting ANOVA is therefore an overall test of
whether group means differ.
ANCOVA - answers acronym for analysis of covariance.
Anderson-Rubin method - answers a way of calculating factor scores which produces
scores that are uncorrelated and standardized with a mean of 0 and a standard
deviation of 1.
ANOVA - answers acronym for analysis of variance.
AR(1) - answers this stands for first-order autoregressive structure. It is a covariance
structure used in multilevel linear models in which the relationship between scores
changes in a systematic way. It is assumed that the correlation between scores gets
smaller over time and that variances are homogeneous. This structure is
often used for repeated-measures data (especially when measurements are taken over
time such as in growth models).
Autocorrelation - answers when the residuals of two observations in a regression model
are correlated.
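A common way to screen for first-order autocorrelation is the Durbin-Watson statistic (values near 2 suggest uncorrelated residuals; values near 0 suggest strong positive autocorrelation). A sketch with deliberately trend-like, made-up residuals:

```python
# Durbin-Watson statistic on a hypothetical set of residuals that
# drift smoothly from positive to negative (positive autocorrelation).
resid = [1.0, 0.8, 0.6, 0.4, 0.2, -0.2, -0.4, -0.6, -0.8, -1.0]

num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
den = sum(e ** 2 for e in resid)
dw = num / den  # well below 2 here, flagging positive autocorrelation
```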
bi - answers unstandardized regression coefficient. Indicates the strength of relationship
between a given predictor, i, of many and an outcome in the units of measurement of
the predictor. It is the change in the outcome associated with a unit change in the
predictor.
βi - answers standardized regression coefficient. Indicates the strength of relationship
between a given predictor, i, of many and an outcome in a standardized form. It is the
change in the outcome (in standard deviations) associated with a one standard
deviation change in the predictor.
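For simple regression the two coefficients are linked by the standard deviations of predictor and outcome: β = b · sd(x)/sd(y). A sketch with made-up data:

```python
import statistics

# Unstandardized slope b and standardized coefficient beta for a
# simple regression on hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = statistics.mean(x), statistics.mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)

# Rescale b into standard-deviation units of x and y.
beta = b * statistics.stdev(x) / statistics.stdev(y)
```

With a single predictor, β equals the Pearson correlation between x and y, which is one way to check the rescaling.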
β-level - answers the probability of making a Type II error (Cohen, 1992, suggests a
maximum value of 0.2).
Bar chart - answers a graph in which a summary statistic (usually the mean) is plotted
on the y-axis against a categorical variable on the x-axis (this categorical variable could
represent, for example, groups of people, different times or different experimental
conditions). The value of the mean for each category is shown by a bar. Different-
coloured bars may be used to represent levels of a second categorical variable.
Bartlett's test of sphericity - answers unsurprisingly, this is a test of the assumption of
sphericity. This test examines whether a variance-covariance matrix is proportional to
an identity matrix. Therefore, it effectively tests whether the diagonal elements of the
variance-covariance matrix are equal (i.e., group variances are the same), and whether
the off-diagonal elements are approximately zero (i.e., the dependent variables are not
correlated). Jeremy Miles, who does a lot of multivariate stuff, claims he's never ever