Summary Advanced Statistics

Uploaded on: 21-03-2025
Written in: 2024/2025

Summary of the advanced statistics course.

Summary Advanced Statistics
Some definitions
P-value => the probability, assuming H0 is true, of observing a difference at least as extreme as the one in your sample.
- P-hacking: performing a large number of statistical tests and reporting only the ones that are statistically significant, which increases the risk of false positive results.
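Why running many tests produces false positives can be shown with a small simulation. This is a sketch under assumptions that are not in the notes: a two-sided z-test with known sigma = 1, 1000 tests of 20 observations each, and alpha = 0.05. H0 is true in every test, yet roughly 5% of them still come out "significant".

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: population mean = 0, with known sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(42)  # fixed seed so the run is repeatable
n_tests, n_per_test, alpha = 1000, 20, 0.05  # illustrative settings

# Every data set is drawn with a true mean of 0, so H0 is true in all tests.
p_values = [
    z_test_p_value([rng.gauss(0, 1) for _ in range(n_per_test)])
    for _ in range(n_tests)
]

false_positives = sum(p < alpha for p in p_values)
print(f"{false_positives} of {n_tests} tests are 'significant' although H0 is always true")
```

Reporting only those "significant" tests would present pure noise as findings.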

Standard Error (SE) => a measure of the uncertainty of an estimate: how much the estimate is expected to vary around the true population value from sample to sample.
- It helps us understand how reliable our sample estimate is as an estimate of the population value.
- A smaller standard error suggests a more reliable estimate, while a larger one indicates more uncertainty.
Standard deviation (SD) => tells us how spread out or varied a set of data points is around the average (mean).
- It helps us understand the degree of variability or dispersion in a dataset.
- A larger standard deviation means the data points are more spread out, while a smaller one indicates they are closer to the mean.
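The difference between SD and SE can be made concrete for the simplest case, the sample mean, where SE = SD / sqrt(n). The measurements below are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import stdev

measurements = [4.0, 6.0, 5.0, 7.0, 3.0]  # hypothetical sample

n = len(measurements)
sd = stdev(measurements)  # sample SD (divides by n - 1): spread of the data
se = sd / sqrt(n)         # SE of the mean: uncertainty of the estimated mean

print(f"SD = {sd:.3f}, SE of the mean = {se:.3f}")
```

The SD describes the data and stays roughly the same when more data are collected, whereas the SE of the mean shrinks as n grows.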
Degrees of freedom (DF) => the number of values in the final calculation of a statistic that are free to vary.
- It measures how much independent information is left once constraints have been placed on the data.
- It is the number of data points minus the number of parameters estimated or restrictions imposed in a statistical analysis.
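A small illustration of why estimating one parameter (here the mean) costs one degree of freedom: the residuals around the mean must sum to zero, so the last residual is fixed by the others. The values are hypothetical.

```python
from statistics import mean

values = [2.0, 4.0, 9.0]  # hypothetical data
m = mean(values)
residuals = [v - m for v in values]

# The residuals sum to zero, so knowing all but one determines the last one.
last_from_others = -sum(residuals[:-1])

n_parameters = 1                 # only the mean was estimated
df = len(values) - n_parameters  # data points minus estimated parameters

print(residuals, last_from_others, df)
```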

Tidy data => every row is one measurement in space and time; columns are variables with meaning in the context of a hypothesis or model. Long format!
o Minimal number of columns = degrees of freedom of the model
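Converting a wide table to long (tidy) format could look like the sketch below. The column names (`plant`, `day1`, `day2`, `height`) and the helper `to_long` are hypothetical and only serve the example.

```python
# Wide format: one row per plant, one column per measurement day.
wide = [
    {"plant": "A", "day1": 2.1, "day2": 2.9},
    {"plant": "B", "day1": 1.8, "day2": 2.4},
]


def to_long(rows, id_column, value_columns, var_name, value_name):
    """Return one row per measurement instead of one row per subject."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append({
                id_column: row[id_column],
                var_name: column,
                value_name: row[column],
            })
    return long_rows


long = to_long(wide, "plant", ["day1", "day2"], "day", "height")
for row in long:
    print(row)
```

In the long table every row is a single measurement, and `day` has become a variable that can enter a model.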
Power => the probability that a statistical test or analysis will correctly detect a true effect or difference when it exists. It measures the ability of a test to avoid a "false negative" (Type II error), so power = 1 - P(Type II error); it indicates the test's sensitivity to real effects.
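For a simple case, power can be computed directly. The sketch below assumes a two-sided one-sample z-test with known sigma, which is a simplification chosen for illustration and not necessarily the test used in the course. The function name `z_test_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test for a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * sqrt(n) / sigma  # true effect expressed in SE units
    # Probability that the test statistic lands in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


for n in (8, 32, 128):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

With no true effect the function returns alpha (the Type I error rate), and power rises with sample size and effect size.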

Type I Error => incorrectly rejecting a true null hypothesis. In other words, it is a false positive: concluding that there is an effect or difference when there isn't one. Underestimating the SE increases this risk.
Type II Error => failing to reject a false null hypothesis. In other words, it is a false negative: concluding that there is no effect or difference when there actually is one. Overestimating the SE increases this risk.

Null deviance => the deviance of the null model, i.e. the model with only an intercept (the maximal deviance that a model can explain).
Residual deviance => the deviance that remains after fitting the model (variation not explained by the model).
- A residual deviance that is the same as or close to the residual degrees of freedom indicates that the model fits well.
Deviance explained = (null deviance - residual deviance) / null deviance
- Overdispersion => having more variation or "spread" in the data than the model predicts, which can lead to inaccurate model results and conclusions.
  o You can deal with this in different ways:
    - The dispersion parameter can be used to correct for the underestimate/overestimate of the SE.
    - Quasipoisson (Poisson, but the variance is the mean multiplied by a dispersion parameter)
    - Negative binomial (Poisson but with more variance; more complex, with separate parameters for mean and variance)
    - Mixed models, but only if there is a random effect factor.
- Underdispersion => having less variation or "spread" than the model predicts, which can also affect the accuracy of model results and conclusions.
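The deviance quantities above can be computed by hand for a small Poisson example. This is a minimal sketch with hypothetical counts for two groups. For a Poisson model with a single two-level factor, the fitted values are simply the group means, so no model-fitting library is needed here. The dispersion estimate uses the Pearson chi-square divided by the residual degrees of freedom, one common choice.

```python
from math import log
from statistics import mean


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu))."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        term = y * log(y / mu) if y > 0 else 0.0  # y * log(y) -> 0 as y -> 0
        total += term - (y - mu)
    return 2 * total


# Hypothetical counts for two treatment groups.
control = [2, 3, 1, 4, 2]
treated = [6, 8, 5, 9, 7]
counts = control + treated

null_fit = [mean(counts)] * len(counts)  # intercept-only model
group_fit = [mean(control)] * len(control) + [mean(treated)] * len(treated)

null_deviance = poisson_deviance(counts, null_fit)
residual_deviance = poisson_deviance(counts, group_fit)
deviance_explained = (null_deviance - residual_deviance) / null_deviance

residual_df = len(counts) - 2  # 10 observations, 2 estimated group means
pearson_chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(counts, group_fit))
dispersion = pearson_chi2 / residual_df  # close to 1 if the Poisson variance holds

print(f"deviance explained: {deviance_explained:.2f}")
print(f"residual deviance {residual_deviance:.2f} on {residual_df} df")
print(f"dispersion estimate: {dispersion:.2f}")
```

A dispersion estimate well above 1 points to overdispersion, and one well below 1 to underdispersion.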

Fisher scoring iterations => how many iterations the fitting algorithm needed to find the best fit (4-8 is good, above 15 is bad).

Studies where data is not independent:
- Longitudinal studies: the same subject is measured repeatedly over time.
- Repeated measurements: each subject receives multiple treatments.
- Nested designs: each subject is nested within one treatment (not a factorial design).
- Split-plot design: combination of a factorial and a nested design.

Statistical Considerations of Study Design
- Balance => equal sample size per category
  o Not always possible, but it increases the power and the simplicity of the analysis
- Replication => true replication is absolutely essential
  o The required sample size depends on:
    - Variance (the stochastic part of the process)
    - Effect size (how large the true differences are)
    - Model complexity (more parameters require more samples)
  o Variance and effect size can be estimated from a pilot study, previous research, or expert knowledge.
  o Model complexity depends on what kind of comparison you want to make, what distribution you think the outcome has conditional on the explanatory variables, whether you believe there to be potential confounders that have to be included, etc.
  o No pseudoreplication (repeated measurements on the same experimental unit, like several leaves on one tree instead of leaves from multiple trees).
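How variance and effect size drive the required sample size can be sketched with the standard normal-approximation formula for comparing two group means, n per group = 2 * (z_(1-alpha/2) + z_(power))^2 * sd^2 / effect^2. This formula is an assumption brought in for illustration; it ignores model complexity and slightly underestimates the n needed for a t-test. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_power = std_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / effect ** 2)


# sd and effect would come from a pilot study, previous research, or expert knowledge.
print(n_per_group(effect=0.5, sd=1.0))   # medium effect
print(n_per_group(effect=0.25, sd=1.0))  # half the effect: about four times the n
print(n_per_group(effect=0.5, sd=2.0))   # double the SD: about four times the n
```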
- Randomization => random allocation of treatments, locations, or even the order in which you process samples.
  o Avoids confounding effects
  o Without a randomized run order, the drift in measurement error between samples run first and samples run last can coincide with the treatment groups.
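Random allocation of treatments and a random processing order can be sketched as follows. The unit names, treatment labels, and the helper `randomize_design` are hypothetical; the seed is fixed only so the example is repeatable.

```python
import random


def randomize_design(units, treatments, seed=None):
    """Randomly allocate treatments (balanced) and randomize the run order."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)

    # Balanced allocation: each treatment appears equally often, in random order.
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    allocation = dict(zip(units, labels))

    # Independent random order in which the samples are processed.
    run_order = list(units)
    rng.shuffle(run_order)
    return allocation, run_order


units = [f"plant_{i}" for i in range(1, 9)]
allocation, run_order = randomize_design(units, ["control", "treated"], seed=1)
print(allocation)
print(run_order)
```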
- Blocking => a way to group similar experimental units together.
  o Used to estimate (and account for) confounding effects
  o A block is a subset of the experimental material within which experimental units are expected to be homogeneous (e.g., a microarray is a block).
  o Blocking can make it easier to detect the true effects of the factors you're studying by reducing the influence of other variables that could muddy the result.
  o Nested mixed models can use blocking (blocks nested in blocks).
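Blocking and randomization are often combined: every treatment appears once in each block, and the assignment is randomized within the block. The sketch below shows such a randomized complete block design; the block names (greenhouse benches), unit names, and the helper function are hypothetical.

```python
import random


def randomized_block_design(blocks, treatments, seed=None):
    """Assign each treatment once per block, in random order within the block."""
    rng = random.Random(seed)
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs exactly one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)  # randomization happens separately inside every block
        design[block] = dict(zip(units, order))
    return design


blocks = {
    "bench_1": ["pot_1", "pot_2", "pot_3"],
    "bench_2": ["pot_4", "pot_5", "pot_6"],
}
design = randomized_block_design(blocks, ["low", "medium", "high"], seed=7)
for block, assignment in design.items():
    print(block, assignment)
```

Because every treatment occurs in every block, differences between blocks can be separated from the treatment effect in the analysis.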

Required sample size (n) depends on the complexity of the study design:
- Groups
- Natural variability
- Experimental techniques

- Small n => significance testing
- Medium n => regularized linear models (also usable for small n)
- Large n => predictive models


Written for

Institution: Universiteit Leiden (UL)
Study: Biology
Course: Advanced Statistics (4313AST17)


Document information

Uploaded on: March 21, 2025
Number of pages: 10
Written in: 2024/2025
Type: SUMMARY

Subjects

statistics
programming
models
anova
t test

Seller: mayastelzer (Universiteit Leiden)
