Science & Society
probability - answers a calculation tool for the likelihood of future outcomes
statistics - answers the observation and analysis of past outcomes
sample space - answers all possible outcomes
event - answers a specific set of possible outcomes
discrete - answers takes countable (often finite) values
continuous - answers can take any value within an interval
frequentist - answers measure of occurrences
subjective - answers measure one's belief
random sample - answers a sample in which every individual in the population has an equal chance of being chosen
population - answers collection to be analyzed
experiment - answers planned activity that yields data
parameter - answers number that describes a population
statistic - answers number computed from sample
statistical variable - answers measurable characteristic
statistical model - answers an equation that shows relationship of variables
statistical inference - answers using sample information to learn about population
Descriptive Statistics - answers statistics dealing with organizing and summarizing data
Inferential Statistics - answers making predictions about unknown population
parameters using sample statistics
Constant - answers does not change in repeated trials over time
Variable - answers measurements of some characteristic vary from trial to trial
Qualitative - answers measurements vary in kind or name but not in degree, meaning
they cannot be ranked
Quantitative - answers measurable data that is discrete or continuous
nominal - answers named
ordinal - answers ordered
bias - answers some individuals are systematically favored over others
Simple random sampling - answers list of all possible individuals in the population is
made and n subjects are chosen in such a way that every set of n subjects has an equal
chance of being selected
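As a sketch of the idea (the population here is made up for illustration), drawing a simple random sample is exactly what Python's `random.sample` does: every set of n subjects is equally likely to be chosen.

```python
import random

# Hypothetical population: a list of 100 labelled individuals.
population = [f"subject_{i}" for i in range(100)]

# Simple random sample of n = 10, drawn without replacement:
# every set of 10 subjects has the same chance of selection.
random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=10)

print(sample)
```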
Stratified random sampling - answers population is naturally divided into two or more
groups of similar subjects called strata
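A minimal sketch of stratified random sampling, with invented strata: a simple random sample is taken separately inside each stratum (here proportionally, about 10% of each group).

```python
import random

# Hypothetical strata: the population divides naturally into groups.
strata = {
    "freshmen": [f"fr_{i}" for i in range(50)],
    "seniors": [f"sr_{i}" for i in range(30)],
}

random.seed(1)
sample = []
for name, members in strata.items():
    # Simple random sample within each stratum, ~10% of its size.
    n = max(1, len(members) // 10)
    sample.extend(random.sample(members, k=n))
```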
Multistage random sampling - answers The population is divided into several clusters
and these are further divided into smaller sub-clusters
Haphazard sampling - answers This method involves selecting a sample by some
convenient mechanism that does not involve randomization
Volunteer Response Sampling - answers people volunteer to be a part of the study (e.g.,
telephone call-in polls, internet surveys)
Factor - answers explanatory variables that cause a change in the response
Levels - answers specific value of factor
Double-blind experiment - answers in addition to the experimental units, the people
who are conducting the experiment do not know to which group the experimental
units have been assigned
Confounding - answers the existence of some factor other than the
treatment that makes the treatment and control groups different
Observational study - answers No treatment is imposed (shows association not
causation)
Block - answers group of individuals that are known before the experiment to be similar
in some way that is expected to affect the response to the treatments
−2LL - answers the log-likelihood multiplied by minus 2. This version of the likelihood is
used in logistic regression.
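To make the −2LL statistic concrete, here is a sketch with made-up fitted probabilities and outcomes (not from the original source): the Bernoulli log-likelihood is summed and multiplied by −2.

```python
import math

# Hypothetical fitted probabilities from a logistic regression,
# paired with the observed 0/1 outcomes.
p = [0.9, 0.8, 0.3, 0.2]
y = [1, 1, 0, 0]

# Bernoulli log-likelihood: sum of y*log(p) + (1 - y)*log(1 - p).
ll = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
         for yi, pi in zip(y, p))

minus_2ll = -2 * ll  # the deviance-style statistic used in logistic regression
```

Because each log term is negative for probabilities below 1, −2LL is positive, and smaller values indicate a better-fitting model.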
α-level - answers the probability of making a Type I error (usually this value is 0.05).
Adjusted mean - answers in the context of analysis of covariance this is the value of the
group mean adjusted for the effect of the covariate.
Adjusted predicted value - answers a measure of the influence of a particular case of
data. It is the predicted value of a case from a model estimated without that case
included in the data. The value is calculated by re-estimating the model without the case
in question, then using this new model to predict the value of the excluded case. If a
case does not exert a large influence over the model then its predicted value should be
similar regardless of whether the model was estimated including or excluding that case.
The difference between the predicted value of a case from the model when that case
was included and the predicted value from the model when it was excluded is the DFFit.
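The leave-one-out procedure described above can be sketched for a simple linear regression with invented data (the `fit` helper and the data are assumptions for illustration, not part of the original source):

```python
# Adjusted predicted value and DFFit for simple linear regression.

def fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 30.0]  # the last case is an outlier

i = 5  # examine the influence of the last case
a_full, b_full = fit(x, y)
pred_full = a_full + b_full * x[i]

# Adjusted predicted value: re-estimate the model without case i,
# then use that model to predict case i.
a_red, b_red = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
pred_adjusted = a_red + b_red * x[i]

dffit = pred_full - pred_adjusted  # large => case i is influential
```

For a non-influential case, `dffit` would be near zero; here the outlier pulls its own prediction far from the leave-one-out value.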
Adjusted R2 - answers a measure of the loss of predictive power or shrinkage in
regression. The adjusted R2 tells us how much variance in the outcome would be
accounted for if the model had been derived from the population from which the sample
was taken.
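One common way to compute it (Wherry's formula, shown here with made-up values) adjusts R² for the number of predictors k and the sample size n:

```python
# Adjusted R^2 via Wherry's formula, with hypothetical values:
# R^2 from a model with k predictors fitted to n cases.
r2, n, k = 0.60, 50, 3

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

The adjustment always shrinks R² (more so with many predictors or few cases), which is why it estimates how much variance the model would explain in the population rather than in this particular sample.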
AIC (Akaike's information criterion) - answers a goodness-of-fit measure that is
corrected for model complexity. That just means that it takes account of how many
parameters have been estimated. It is not intrinsically interpretable, but can be
compared in different models to see how changing the model affects the fit. A small
value represents a better fit to the data.
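A sketch of the usual formula, AIC = 2k − 2LL, with invented log-likelihoods for two candidate models fitted to the same data:

```python
# AIC = 2k - 2*log-likelihood, where k is the number of estimated
# parameters. Hypothetical numbers for two models of the same data.
ll_simple, k_simple = -120.0, 3
ll_complex, k_complex = -118.5, 6

aic_simple = 2 * k_simple - 2 * ll_simple     # 246.0
aic_complex = 2 * k_complex - 2 * ll_complex  # 249.0
```

Even though the complex model has the higher log-likelihood, its extra parameters cost it the comparison: the simpler model has the smaller AIC and so the better penalized fit.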
AICC (Hurvich and Tsai's criterion) - answers a goodness-of-fit measure that is similar
to AIC but is designed for small samples. It is not intrinsically interpretable, but can be
compared in different models to see how changing the model affects the fit. A small
value represents a better fit to the data.
Alpha factoring - answers a method of factor analysis.
Alternative hypothesis - answers the prediction that there will be an effect (i.e., that your
experimental manipulation will have some effect or that certain variables will relate to
each other).
Analysis of covariance - answers a statistical procedure that uses the F-statistic to test
the overall fit of a linear model, adjusting for the effect that one or more covariates have
on the outcome variable. In experimental research this linear model tends to be defined
in terms of group means and the resulting ANOVA is therefore an overall test of whether
group means differ after the variance in the outcome variable explained by any
covariates has been removed.
Analysis of variance - answers a statistical procedure that uses the F-statistic to test
the overall fit of a linear model. In experimental research this linear model tends to be
defined in terms of group means, and the resulting ANOVA is therefore an overall test of
whether group means differ.
ANCOVA - answers acronym for analysis of covariance.
Anderson-Rubin method - answers a way of calculating factor scores which produces
scores that are uncorrelated and standardized with a mean of 0 and a standard
deviation of 1.
ANOVA - answers acronym for analysis of variance.
AR(1) - answers this stands for first-order autoregressive structure. It is a covariance
structure used in multilevel linear models in which the relationship between scores
changes in a systematic way. It is assumed that the correlation between scores gets
smaller over time and that variances are homogeneous. This structure is
often used for repeated-measures data (especially when measurements are taken over
time such as in growth models).
Autocorrelation - answers when the residuals of two observations in a regression model
are correlated.
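A common way to screen for first-order autocorrelation is the Durbin-Watson statistic (values near 2 suggest uncorrelated residuals; values near 0 suggest strong positive autocorrelation). A sketch with deliberately trend-like, made-up residuals:

```python
# Durbin-Watson statistic on a hypothetical set of residuals that
# drift smoothly from positive to negative (positive autocorrelation).
resid = [1.0, 0.8, 0.6, 0.4, 0.2, -0.2, -0.4, -0.6, -0.8, -1.0]

num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
den = sum(e ** 2 for e in resid)
dw = num / den  # well below 2 here, flagging positive autocorrelation
```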
bi - answers unstandardized regression coefficient. Indicates the strength of relationship
between a given predictor, i, of many and an outcome in the units of measurement of
the predictor. It is the change in the outcome associated with a unit change in the
predictor.
βi - answers standardized regression coefficient. Indicates the strength of relationship
between a given predictor, i, of many and an outcome in a standardized form. It is the
change in the outcome (in standard deviations) associated with a one standard
deviation change in the predictor.
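For simple regression the two coefficients are linked by the standard deviations of predictor and outcome: β = b · sd(x)/sd(y). A sketch with made-up data:

```python
import statistics

# Unstandardized slope b and standardized coefficient beta for a
# simple regression on hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mx, my = statistics.mean(x), statistics.mean(y)
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
    sum((xi - mx) ** 2 for xi in x)

# Rescale b into standard-deviation units of x and y.
beta = b * statistics.stdev(x) / statistics.stdev(y)
```

With a single predictor, β equals the Pearson correlation between x and y, which is one way to check the rescaling.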
β-level - answers the probability of making a Type II error (Cohen, 1992, suggests a
maximum value of 0.2).
Bar chart - answers a graph in which a summary statistic (usually the mean) is plotted
on the y-axis against a categorical variable on the x-axis (this categorical variable could
represent, for example, groups of people, different times or different experimental
conditions). The value of the mean for each category is shown by a bar. Different-
coloured bars may be used to represent levels of a second categorical variable.
Bartlett's test of sphericity - answers unsurprisingly, this is a test of the assumption of
sphericity. This test examines whether a variance-covariance matrix is proportional to
an identity matrix. Therefore, it effectively tests whether the diagonal elements of the
variance-covariance matrix are equal (i.e., group variances are the same), and whether
the off-diagonal elements are approximately zero (i.e., the dependent variables are not
correlated). Jeremy Miles, who does a lot of multivariate stuff, claims he's never ever