Probability and statistics II
Chapter 5: Markov chains
Semester 1, 2021
Lecture 30
Discrete-time Markov chains
We often would like to model processes in which the probability associated with the next event depends only
on what has just occurred.
Markov chain models have this property, called memorylessness:
Memorylessness: the next state that is visited by a Markov chain depends only on the present state, and
not on any previous states.
Random processes (stochastic processes)
A random process is a family (or sequence) of random variables
{Xn }nœT
where n is a parameter running over an index set T .
Discrete-time Markov chains
Consider the case where the index n corresponds to discrete units of time, so that
T = {0, 1, 2, . . .}.
This type of random process is said to occur in discrete time.
Discrete-time Markov chains
We also restrict our attention to discrete random variables, which always have a countable state space.
Let {Xn }nœN , be a sequence of random variables with a countable state space
S = {x0 , x1 , x2 , . . . , xi , . . .}.
We now consider some notation that will enable us to model a particular form of stochastic process known as
a discrete-time Markov chain.
1
,Discrete-time Markov chain (DTMC)
A discrete-time random process
{Xn }nœN ,
is a discrete-time Markov chain if it satisfies
P (Xn = s | X0 = x0 , X1 = x1 , . . . , Xn≠1=xn≠1 ) = P (Xn = s | Xn≠1=xn≠1 )
for all n Ø 1 and all s, x0 , x1 , . . . , xn≠1 œ S.
That is, the probability of moving from state xn≠1 to state s depends
• only on the state xn≠1 , and
• not on the history of the process X0 , X1 , . . . , Xn≠2 (i.e. how the process got to state xn≠1 ).
This is the “memoryless" nature of Markov processes.
Discrete-time Markov chains
Since the state space S can be put into a one-to-one correspondence with some subset of the natural numbers
N = {0, 1, 2, . . . }, without loss of generality, we can assume that the state space is a subset of N.
This simplifies our notation considerably, as
• we essentially can give up the physical interpretation of the states for a numerical one, and
• we can simply write more general expressions for the transition probabilities like
P (Xn+1 = j|Xn = i) for i, j œ S,
which govern the evolution of the Markov chain.
That is, the evolution of the sequence X1 , X2 , . . . from some starting state X0 .
Note that the starting state X0 can either
• be specified a priori, or
• be chosen randomly from some distribution across S.
Time-homogeneous Markov chain
A Markov chain {Xn }nœN is called time-homogeneous if we have for all n and i, j œ S that
P (Xn+1 = j | Xn = i) = P (X1 = j | X0 = i).
in which case we write
pij = P (Xn+1 = j | Xn = i) for all n and i, j œ S.
Otherwise, the Markov chain is said to be time-inhomogeneous.
Example: Repeated coin toss
A coin had a probability of p of getting a head.
S = {0, 1, 2, . . .}
2
, State transition diagram
p p p
1−p 0 1 2 ...
between
2 more
how
1−p 1−p
states The set
of a
process
Example: A vending machine
A vending machine can be in one of two states:
• working
• not working
If the machine is working on a particular day, then the probability that it will not be working on the next
day is ”, such that 0 < ” < 1.
When the machine is not working on a particular day, it will be in working condition on the next day with
probability “, such that 0 < “ < 1.
Example: A vending machine
If we say the machine is in state 0 if not working and in state 1 if it is working, then the state space S is. . .
S = {0, 1}
State diagram
1 0 1 1
Discrete-time Markov chains
The initial state could be set a priori or selected by some distribution such as (p, 1 ≠ p).
If we assume the machine starts in state 0, we can consider the evolution of the machine’s condition for, say,
values of ” = 0.9 and “ = 0.2.
X0 = 0
;
0 with probability 1 ≠ “ = 0.8
X1 =
1 with probability “ = 0.2
;
0 with probability 0.2(0.9) + 0.8(0.8) = 0.82
X2 =
1 with probability 0.8(0.2) + 0.2(0.1) = 0.18.
3
Chapter 5: Markov chains
Semester 1, 2021
Lecture 30
Discrete-time Markov chains
We often would like to model processes in which the probability associated with the next event depends only
on what has just occurred.
Markov chain models have this property, called memorylessness:
Memorylessness: the next state that is visited by a Markov chain depends only on the present state, and
not on any previous states.
Random processes (stochastic processes)
A random process is a family (or sequence) of random variables
{Xn }nœT
where n is a parameter running over an index set T .
Discrete-time Markov chains
Consider the case where the index n corresponds to discrete units of time, so that
T = {0, 1, 2, . . .}.
This type of random process is said to occur in discrete time.
Discrete-time Markov chains
We also restrict our attention to discrete random variables, which always have a countable state space.
Let {Xn }nœN , be a sequence of random variables with a countable state space
S = {x0 , x1 , x2 , . . . , xi , . . .}.
We now consider some notation that will enable us to model a particular form of stochastic process known as
a discrete-time Markov chain.
1
,Discrete-time Markov chain (DTMC)
A discrete-time random process
{Xn }nœN ,
is a discrete-time Markov chain if it satisfies
P (Xn = s | X0 = x0 , X1 = x1 , . . . , Xn≠1=xn≠1 ) = P (Xn = s | Xn≠1=xn≠1 )
for all n Ø 1 and all s, x0 , x1 , . . . , xn≠1 œ S.
That is, the probability of moving from state xn≠1 to state s depends
• only on the state xn≠1 , and
• not on the history of the process X0 , X1 , . . . , Xn≠2 (i.e. how the process got to state xn≠1 ).
This is the “memoryless" nature of Markov processes.
Discrete-time Markov chains
Since the state space S can be put into a one-to-one correspondence with some subset of the natural numbers
N = {0, 1, 2, . . . }, without loss of generality, we can assume that the state space is a subset of N.
This simplifies our notation considerably, as
• we essentially can give up the physical interpretation of the states for a numerical one, and
• we can simply write more general expressions for the transition probabilities like
P (Xn+1 = j|Xn = i) for i, j œ S,
which govern the evolution of the Markov chain.
That is, the evolution of the sequence X1 , X2 , . . . from some starting state X0 .
Note that the starting state X0 can either
• be specified a priori, or
• be chosen randomly from some distribution across S.
Time-homogeneous Markov chain
A Markov chain {Xn }nœN is called time-homogeneous if we have for all n and i, j œ S that
P (Xn+1 = j | Xn = i) = P (X1 = j | X0 = i).
in which case we write
pij = P (Xn+1 = j | Xn = i) for all n and i, j œ S.
Otherwise, the Markov chain is said to be time-inhomogeneous.
Example: Repeated coin toss
A coin had a probability of p of getting a head.
S = {0, 1, 2, . . .}
2
, State transition diagram
p p p
1−p 0 1 2 ...
between
2 more
how
1−p 1−p
states The set
of a
process
Example: A vending machine
A vending machine can be in one of two states:
• working
• not working
If the machine is working on a particular day, then the probability that it will not be working on the next
day is ”, such that 0 < ” < 1.
When the machine is not working on a particular day, it will be in working condition on the next day with
probability “, such that 0 < “ < 1.
Example: A vending machine
If we say the machine is in state 0 if not working and in state 1 if it is working, then the state space S is. . .
S = {0, 1}
State diagram
1 0 1 1
Discrete-time Markov chains
The initial state could be set a priori or selected by some distribution such as (p, 1 ≠ p).
If we assume the machine starts in state 0, we can consider the evolution of the machine’s condition for, say,
values of ” = 0.9 and “ = 0.2.
X0 = 0
;
0 with probability 1 ≠ “ = 0.8
X1 =
1 with probability “ = 0.2
;
0 with probability 0.2(0.9) + 0.8(0.8) = 0.82
X2 =
1 with probability 0.8(0.2) + 0.2(0.1) = 0.18.
3