Probability and statistics II
Chapter 2: Discrete random variables
Andrew Black
Semester 1, 2021
Bounding probabilities for discrete random variables
In some cases, we may not know the probability mass function of the process we are modelling; we may only
know (or in some way can get an estimate of) the mean, or both the mean and variance of the probability
distribution of interest.
In these cases, we can use some important inequality results due to Markov and Chebyschev to calculate
bounds on probabilities.
Tail sum formula
Let Y be a discrete random variable that takes the values 0, 1, 2, . . . , n, then
n
ÿ
E[Y ] = P (Y Ø i).
i=1
Proof
n
ÿ
E[X] = kP (X = k)
k=0
= 0P (X = 0) + 1P (X = 1) + 2P (X = 2) + 3P (X = 3) + . . .
= P (X = 1)+
P (X = 2) + P (X = 2)+
P (X = 3) + P (X = 3) + P (X = 3)+
..
.
P (X = n) + P (X = n) + . . . + P (X = n)
ÿn
= P (X Ø k).
k=1
Markov’s inequality
If Y is a random variable that takes only nonnegative values, then for any value a > 0,
E[Y ]
P (Y Ø a) Æ .
a
Roughly, if E[Y ] is small then it is not likely that Y is large.
1
, Example
Suppose the average cost to maintain a car for a year is $1500, what is the upper bound on the probability
that the cost in one year is greater than $7500?
1500
P (Y Ø 7500) Æ = 0.2
7500
Markov’s inequality proof
n
ÿ
E[Y ] = yP (Y = y)
y=0
a≠1
ÿ n
ÿ
= yP (Y = y) + yP (Y = y)
y=0 y=a
ÿn
Ø yP (Y = y)
y=a
ÿn
Ø aP (Y = y)
y=a
ÿn
=a P (Y = y)
y=a
= aP (Y Ø a),
so E[Y ] Ø aP (Y Ø a)
E[Y ]
∆P (Y Ø a) Æ .
a
_
Chebyshev’s inequality
Let Y be a random variable with expected value µ and finite variance ‡ 2 . Then, for any constant k > 0,
1
P (|Y ≠ µ| < k‡) Ø 1 ≠
k2
1
or, equivalently, P (|Y ≠ µ| Ø k‡) Æ .
k2
Roughly if Var(Y ) is small then Y will not vary far from its mean.
Example
The number of customers at a shop in a day has a mean of 20 with a variance of 16. What is a lower bound
on the probability that the number of customers will lie between 12 and 28?
First, notice that if we let Y = 12 and then 28, we get that
Ô
|Y ≠ µ| = |12 ≠ 20| = 8 = 2 16 = 2‡,
and similarly
|Y ≠ µ| = |28 ≠ 20| = 8.
2
Chapter 2: Discrete random variables
Andrew Black
Semester 1, 2021
Bounding probabilities for discrete random variables
In some cases, we may not know the probability mass function of the process we are modelling; we may only
know (or in some way can get an estimate of) the mean, or both the mean and variance of the probability
distribution of interest.
In these cases, we can use some important inequality results due to Markov and Chebyschev to calculate
bounds on probabilities.
Tail sum formula
Let Y be a discrete random variable that takes the values 0, 1, 2, . . . , n, then
n
ÿ
E[Y ] = P (Y Ø i).
i=1
Proof
n
ÿ
E[X] = kP (X = k)
k=0
= 0P (X = 0) + 1P (X = 1) + 2P (X = 2) + 3P (X = 3) + . . .
= P (X = 1)+
P (X = 2) + P (X = 2)+
P (X = 3) + P (X = 3) + P (X = 3)+
..
.
P (X = n) + P (X = n) + . . . + P (X = n)
ÿn
= P (X Ø k).
k=1
Markov’s inequality
If Y is a random variable that takes only nonnegative values, then for any value a > 0,
E[Y ]
P (Y Ø a) Æ .
a
Roughly, if E[Y ] is small then it is not likely that Y is large.
1
, Example
Suppose the average cost to maintain a car for a year is $1500, what is the upper bound on the probability
that the cost in one year is greater than $7500?
1500
P (Y Ø 7500) Æ = 0.2
7500
Markov’s inequality proof
n
ÿ
E[Y ] = yP (Y = y)
y=0
a≠1
ÿ n
ÿ
= yP (Y = y) + yP (Y = y)
y=0 y=a
ÿn
Ø yP (Y = y)
y=a
ÿn
Ø aP (Y = y)
y=a
ÿn
=a P (Y = y)
y=a
= aP (Y Ø a),
so E[Y ] Ø aP (Y Ø a)
E[Y ]
∆P (Y Ø a) Æ .
a
_
Chebyshev’s inequality
Let Y be a random variable with expected value µ and finite variance ‡ 2 . Then, for any constant k > 0,
1
P (|Y ≠ µ| < k‡) Ø 1 ≠
k2
1
or, equivalently, P (|Y ≠ µ| Ø k‡) Æ .
k2
Roughly if Var(Y ) is small then Y will not vary far from its mean.
Example
The number of customers at a shop in a day has a mean of 20 with a variance of 16. What is a lower bound
on the probability that the number of customers will lie between 12 and 28?
First, notice that if we let Y = 12 and then 28, we get that
Ô
|Y ≠ µ| = |12 ≠ 20| = 8 = 2 16 = 2‡,
and similarly
|Y ≠ µ| = |28 ≠ 20| = 8.
2