Class notes

Extensive summary of computational statistics lectures

Pages

108

Uploaded on

14-12-2025

Written in

2025/2026

Extensive summary of the computational statistics lectures, in 5 parts:
- Inferential statistics / Sampling
- Bias, variance, MSE and Monte Carlo simulations
- Resampling: Bootstrap & Permutation / Model accuracy
- Resampling: Cross-validation
- Multivariate & high-dimensional data
Includes many examples used during the lectures, tables that distinguish between subjects, and extra notes explaining points that were unclear.

Lecture 1
Statistics: the branch of mathematics concerned with the use of data in the context of uncertainty; it builds on probability theory

Computer science/ computing science (CS): fundamental concepts to understand and explore the natural and artificial world in computational terms

Computational statistics: implementing statistical methods on computers, including methods that were unthinkable before the computer age, and coping with analytically intractable problems

Basics in probability & statistics:
Data = fundamental to research/ learning:
- Science/ learning relies on data
- Making sense of data & predicting future values is fundamental to statistics & data science

Experiment 1: Bernoulli trial:
- Bernoulli trial = experiment or observation that has exactly two possible outcomes, usually called success and failure (binary)
- Each trial has only two outcomes
- Has a fixed probability of success, say p
- Is independent of other trials
- What we want to learn: when tossing a coin, what is the probability of observing heads?
- Approach 1 = empirical approach
- Gather data
- Estimate the probability: what proportion of the trials resulted in a hit?
- Approach 2 = mathematical approach: can we set up a mathematical model for the experiment?
- That is, can we express the data generating mechanism as a mathematical (more precisely, statistical) model?
- To do this, carefully consider the data (“observations”) that arise and how the experiment produces them
- Observations: “hit” (=heads/ 1) or “fail” (=tails/0) ⇒ (the data)
- Experiment: toss a coin ⇒ (the generating mechanism)
- Note: the outcome of the experiment is uncertain (random experiment)

- A binary random variable follows a Bernoulli distribution
- X is a Bernoulli random variable with success probability π (the fixed probability p above): P(X = 1) = π and P(X = 0) = 1 − π

Experiment 1: comparison empirical to mathematical result:
- Probability of observing heads when tossing a coin
- Based on the actual experiment: estimated P(X = 1) = 0.4 (in-class result)
- Based on the mathematical model, assuming a fair coin: P(X = 1) = 0.5
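The gap between the empirical result and the model value can be explored by simulation. The following is a minimal sketch, assuming a fair coin (p = 0.5); the function names `simulate_bernoulli` and `estimate_success_probability` are illustrative and not from the lecture.

```python
import random


def simulate_bernoulli(p, n_trials, seed=None):
    """Simulate n_trials independent Bernoulli(p) trials as a list of 0/1 outcomes."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so it falls below p with probability p
    return [1 if rng.random() < p else 0 for _ in range(n_trials)]


def estimate_success_probability(outcomes):
    """Empirical approach: proportion of trials that were a hit (1)."""
    return sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    # Assumption: a fair coin; small samples can land far from 0.5
    for n in (10, 100, 10_000):
        tosses = simulate_bernoulli(0.5, n, seed=1)
        print(n, estimate_success_probability(tosses))
```

With only a handful of tosses, an estimate such as 0.4 is entirely compatible with a fair coin; the estimate tends to settle near 0.5 as the number of tosses grows.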

Experiment 2: Binomial trial:
- Binomial trial = a sequence of n independent Bernoulli trials, each with the same probability of
success p
- E.g.: flipping a coin 10 times and counting how many heads you get
- What we want to learn:
- When tossing a coin five times, what is the probability of observing exactly four heads?
- Approach 1 = empirical approach: let’s gather data
- Toss a coin 5 times and record the number of heads
- Estimate the probability: what proportion of the repetitions showed four heads?
- Based on the experiment, what is the estimated probability of observing 4 heads when tossing a coin five times? P(X = 4) = …
- How confident are you in this result? Why?
- Approach 2 = mathematical approach: can we set up a mathematical model for the experiment?
- That is, can we express the data generating mechanism as a mathematical (more precisely, statistical) model?
- To do this, carefully consider the data (“observations”) that arise and how they arise by
the experiment
- Observations: number of heads in 5 coin tosses (the data)
- Experiment: toss a coin 5 times in a row (the generating mechanism)
- Let’s look at the data generating mechanism in more detail
- Observed data = number of heads when tossing a coin 5 times:

- The outcome of this experiment is based on counting the number of hits when repeating a Bernoulli trial n (here n = 5) times
- These Bernoulli trials are independent (meaning…)
- The outcome of such an experiment, based on counting the successes (ones) of n independent Bernoulli trials, follows a Binomial distribution
Binomial distribution:
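For reference, the standard binomial probability mass function, written in the notation used here (n trials, k successes, success probability π per trial), reads:

```latex
P(X = k) = \binom{n}{k}\,\pi^{k}\,(1-\pi)^{n-k},
\qquad
\binom{n}{k} = \frac{n!}{k!\,(n-k)!},
\qquad k = 0, 1, \dots, n
```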

- Not one coin toss, but repeating the same experiment n times
- Each trial:
- Has two outcomes
- Has the same success probability pi
- Is independent of the others (trials are uncorrelated: the outcome of one trial does not affect the next)
- X = number of successes in n trials
- Binomial coefficient n! / (k! (n − k)!) = the number of different ways k successes can occur among n trials
- For example, n = 3 and k = 2
- Success patterns: SSF, SFS, FSS
- There are 3 ways: 3! / (2! (3 − 2)!) = 3
- X ~ Bin(n, π) = X follows a binomial distribution with n trials and success probability π per trial
- If you toss a fair coin n = 5 times ⇒ X ~ Bin(5, 0.5)
- P(X = 2) = probability of getting exactly 2 heads in 5 tosses
- [5! / (2! 3!)] × (0.5)^2 × (0.5)^3 = 10 × 0.03125 = 0.3125
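The two worked numbers above can be checked in a few lines. This is a small sketch using Python's standard `math.comb`; the helper name `binomial_pmf` is an illustrative choice, not something defined in the lecture.

```python
from math import comb


def binomial_pmf(k, n, pi):
    """P(X = k) for X ~ Bin(n, pi): C(n, k) * pi^k * (1 - pi)^(n - k)."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)


# Number of ways to place k = 2 successes among n = 3 trials (SSF, SFS, FSS)
print(comb(3, 2))  # 3

# Probability of exactly 2 heads in 5 tosses of a fair coin
print(binomial_pmf(2, 5, 0.5))  # 0.3125
```

The same function can be evaluated for any other k between 0 and n, and the probabilities over all k sum to 1.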

Experiment 2:
- When tossing a coin five times, what is the probability of observing four heads?
- Based on the experiment: P(4H | 5 tosses) = …
- Now calculate the analytic (mathematical) solution
- Probability theory: P(4H | 5 tosses) = …
- How many repetitions of the experiment do you need for a “good” guess of the solution? How practical is the (real) experiment procedure?

Experiment 3: Hypergeometric trial
- What we want to learn: is it likely that one is able to taste whether the sample of chocolate is
Belgian or not?
- Mathematical approach: let’s test the hypothesis of random guessing
- P(X correct | guessing) = … (in class activity)
- Not binomial, since each answer depends on the previous ones (sampling without replacement)
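The notes do not give the numbers used in the class activity, so the following is only a sketch with hypothetical values: 8 chocolate samples of which 4 are Belgian, and a taster who must pick the 4 they believe are Belgian. Under pure guessing, the number of correct picks then follows a hypergeometric distribution. The helper name `hypergeometric_pmf` and all numbers are assumptions made for illustration.

```python
from math import comb


def hypergeometric_pmf(k, population, successes, draws):
    """P(X = k): k successes when drawing `draws` items without replacement
    from `population` items of which `successes` are successes."""
    return (
        comb(successes, k)
        * comb(population - successes, draws - k)
        / comb(population, draws)
    )


# Hypothetical setup (not from the lecture): 8 samples, 4 Belgian, taster picks 4
population, belgian, picks = 8, 4, 4

for k in range(picks + 1):
    print(k, hypergeometric_pmf(k, population, belgian, picks))

# Probability of identifying all 4 Belgian samples by guessing alone: 1/70
print(hypergeometric_pmf(4, population, belgian, picks))
```

Under these assumed numbers, a perfect score by pure guessing is rare, which is the kind of evidence used to reject the hypothesis of random guessing.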

Experiment 4: 3-door problem
- Approach 1 = empirical approach: gather data
- From the data we learn that there were almost twice as many winners among those who switched
- Drawback: can be expensive to collect data
- Approach 2 = mathematical statistics:
- Using laws of probability (product rule):
- P(win | stay) = P(winning door at first pick) = ⅓
- P(win | switch) = P(non-winning door at first pick) × P(winning door at second pick) = ⅔ × 1 = ⅔
- Drawback: some problems are (too) difficult to solve analytically

- If it’s difficult to get real data ⇒ rely on simulation
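The 3-door problem is a convenient case for trying simulation, because the analytic answer (⅓ for staying, ⅔ for switching) is known. The sketch below assumes the standard rules, which the notes do not spell out: the prize is placed uniformly at random, and the host always opens a door that is neither the player's pick nor the prize. Function names are illustrative.

```python
import random


def play_round(switch, rng):
    """Play one round of the 3-door game; return True if the player wins."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    first_pick = rng.choice(doors)
    # Assumption: the host opens a door that is neither the pick nor the prize
    openable = [d for d in doors if d != first_pick and d != prize]
    opened = rng.choice(openable)
    if switch:
        # Switch to the one remaining closed door
        final_pick = next(d for d in doors if d != first_pick and d != opened)
    else:
        final_pick = first_pick
    return final_pick == prize


def win_rate(switch, n_rounds, seed=None):
    """Proportion of rounds won under the given strategy."""
    rng = random.Random(seed)
    return sum(play_round(switch, rng) for _ in range(n_rounds)) / n_rounds


if __name__ == "__main__":
    n_rounds = 100_000
    print("stay:  ", win_rate(False, n_rounds, seed=1))
    print("switch:", win_rate(True, n_rounds, seed=1))
```

With enough rounds, the simulated win rates should come out close to ⅓ and ⅔, matching both the analytic result and the roughly two-to-one ratio seen in the collected data.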

Written for

Institution: Tilburg University (UVT)
Study: Data Science & Society
Course: Computational Statistics (880260M6)

Document information

Uploaded on: December 14, 2025
Number of pages: 108
Written in: 2025/2026
Type: Class notes
Professor(s): Katrijn van deun
Contains: All classes
