PROBABILITY AND STOCHASTIC PROCESS 33
4. Random vectors and their distribution
Sometimes a single random variable is not enough to describe the outcome of a random
experiment. For example, to record the height and weight of every person in a certain community,
we need a pair $(x, y)$ whose components represent, respectively, the height and weight of a
particular individual. In many cases it is necessary to consider the joint behavior of two or
more random variables.
Definition 4.1 (n-dimensional random vector). Let $X_1, X_2, \ldots, X_n$ be $n$ real random
variables defined on a given probability space $(\Omega, \mathcal{F}, P)$. The function
$X : \Omega \to \mathbb{R}^n$ defined by
\[ X(\omega) := (X_1(\omega), X_2(\omega), \ldots, X_n(\omega)) \]
is called an n-dimensional random vector.
Let $X$ be an n-dimensional random vector defined on a probability space $(\Omega, \mathcal{F}, P)$.
Then the function $P_X$ on $\mathcal{B}(\mathbb{R}^n)$ defined by
\[ P_X(B) = P(X \in B), \quad B \in \mathcal{B}(\mathbb{R}^n), \]
is a probability measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$. It is called the distribution of $X$.
Definition 4.2 (Joint cumulative distribution function (joint cdf)). Let $X = (X_1, X_2, \ldots, X_n)$
be an n-dimensional random vector. The function $F_{(X_1, X_2, \ldots, X_n)} : \mathbb{R}^n \to [0, 1]$ defined by
\[ F_{(X_1, X_2, \ldots, X_n)}(x_1, x_2, \ldots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \ldots, X_n \le x_n) \]
is called the joint cumulative distribution function (joint cdf) of the random variables $X_1, X_2, \ldots, X_n$.
Marginal cumulative distribution function (marginal cdf): In the following we take
$n = 2$; the same results hold for $n > 2$. Let $X$ and $Y$ be two random variables with
joint cdf $F_{(X,Y)}$. The cdfs of $X$ and $Y$ can be recovered from the joint cdf. Indeed, the
events $\{X \le x, Y \le y\}$ increase to $\{X \le x\}$ as $y \to \infty$, so continuity of $P$ from below gives
\[ F_X(x) = P(X \le x) = P\Big( \bigcup_{y} \{X \le x, Y \le y\} \Big) = \lim_{y \to \infty} P(X \le x, Y \le y) = \lim_{y \to \infty} F_{(X,Y)}(x, y). \]
Similarly, we also have
\[ F_Y(y) = \lim_{x \to \infty} F_{(X,Y)}(x, y). \]
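The distribution functions $F_X$ and $F_Y$ are sometimes referred to as the marginal cdfs of $X$ and
$Y$. One can easily show that a joint cdf is nondecreasing and right continuous in each of its
arguments. Moreover, for any $(x_1, y_1), (x_2, y_2) \in \mathbb{R}^2$ with $x_1 \le x_2$ and $y_1 \le y_2$, set
\[ A := \{(x, y) : x \le x_2,\ y \le y_2\}, \quad B := \{(x, y) : x \le x_1,\ y \le y_2\}, \]
\[ C := \{(x, y) : x \le x_2,\ y \le y_1\}, \quad D := \{(x, y) : x \le x_1,\ y \le y_1\}, \]
and let $P_{(X,Y)}$ denote the distribution of the random vector $(X, Y)$. Since $D \subseteq C$ and $B \subseteq A$, observe that
\[ K_1 := \{(x, y) : x_1 < x \le x_2,\ y \le y_1\} = C \setminus D \implies P_{(X,Y)}(K_1) = P_{(X,Y)}(C) - P_{(X,Y)}(D), \]
\[ K_2 := \{(x, y) : x_1 < x \le x_2,\ y \le y_2\} = A \setminus B \implies P_{(X,Y)}(K_2) = P_{(X,Y)}(A) - P_{(X,Y)}(B). \]
Since $K_1 \subseteq K_2$ and $K_2 \setminus K_1 = \{(x, y) : x_1 < x \le x_2,\ y_1 < y \le y_2\}$, we have
\begin{align*}
0 \le P_{(X,Y)}(\{x_1 < x \le x_2,\ y_1 < y \le y_2\}) &= P_{(X,Y)}(K_2) - P_{(X,Y)}(K_1) \\
&= P_{(X,Y)}(A) - P_{(X,Y)}(B) - P_{(X,Y)}(C) + P_{(X,Y)}(D) \\
&= F_{(X,Y)}(x_2, y_2) + F_{(X,Y)}(x_1, y_1) - F_{(X,Y)}(x_1, y_2) - F_{(X,Y)}(x_2, y_1).
\end{align*}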
Theorem 4.1. A function $F : \mathbb{R}^2 \to [0, 1]$ is the joint cdf of some two-dimensional random vector
if and only if it satisfies the following conditions:
a) $F$ is nondecreasing and right continuous with respect to each argument.
b) $\lim_{y \to -\infty} F(x, y) = 0 = \lim_{x \to -\infty} F(x, y)$ and $\lim_{(x,y) \to (\infty,\infty)} F(x, y) = 1$.
,34 A. K. MAJEE
c) For any $(x_1, y_1), (x_2, y_2) \in \mathbb{R}^2$ with $x_1 \le x_2$ and $y_1 \le y_2$,
\[ F(x_2, y_2) + F(x_1, y_1) - F(x_1, y_2) - F(x_2, y_1) \ge 0. \]
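Example 4.1. The function $F : \mathbb{R}^2 \to [0, 1]$ given by
\[ F(x, y) = \begin{cases} 0, & x < 0 \text{ or } y < 0 \text{ or } x + y < 1, \\ 1, & \text{otherwise} \end{cases} \]
is NOT the joint cdf of any two-dimensional random vector. If it were the joint cdf of some $(X, Y)$, then
\[ 0 \le P\big(\tfrac{1}{3} < X \le 1,\ \tfrac{1}{3} < Y \le 1\big) = F(1, 1) + F\big(\tfrac{1}{3}, \tfrac{1}{3}\big) - F\big(1, \tfrac{1}{3}\big) - F\big(\tfrac{1}{3}, 1\big) = 1 + 0 - 1 - 1 = -1 < 0, \]
which is a contradiction.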
Definition 4.3 (Discrete random vector). A random vector $X = (X_1, X_2, \ldots, X_n)$ is said to
be discrete if the random variables $X_1, X_2, \ldots, X_n$ are all discrete, i.e., there exists a countable
set $E \subseteq \mathbb{R}^n$ such that $P(X \in E) = 1$.
Definition 4.4 (Joint probability mass function). Let $X$ be a discrete random vector. The
function $p_X : \mathbb{R}^n \to [0, 1]$ defined by
\[ p_X(x) = \begin{cases} P(X = x), & \text{if } x \text{ belongs to the image of } X, \\ 0, & \text{otherwise} \end{cases} \]
is called the joint probability mass function (joint pmf) of $X$.
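Marginal pmf: Let $X$ and $Y$ be two discrete random variables with joint pmf $p_{(X,Y)}$. Then we
can compute the pmfs of $X$ and $Y$ in terms of $p_{(X,Y)}$ as follows, where the unions are disjoint and
run over the countably many values of the other variable:
\[ p_X(x) = P(X = x) = P\Big( \bigcup_{y} \{X = x, Y = y\} \Big) = \sum_{y} P(X = x, Y = y) = \sum_{y} p_{(X,Y)}(x, y), \]
\[ p_Y(y) = P(Y = y) = P\Big( \bigcup_{x} \{X = x, Y = y\} \Big) = \sum_{x} P(X = x, Y = y) = \sum_{x} p_{(X,Y)}(x, y). \]
The functions $p_X$ and $p_Y$ are sometimes referred to as the marginal pmfs of $X$ and $Y$.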
Example 4.2. A fair coin is tossed three times. Let $X$ be the number of heads in the three tosses,
and let $Y$ denote the absolute value of the difference between the number of heads and the number of tails.
Then $X \in \{0, 1, 2, 3\}$ and $Y \in \{1, 3\}$. In this case, $\Omega = \{H, T\}^3$, and we define $P(A) = |A|/8$. Thus,
for example,
\[ P(X = 1, Y = 1) = P(\{HTT, THT, TTH\}) = \tfrac{3}{8}, \qquad P(X = 2, Y = 1) = P(\{HHT, HTH, THH\}) = \tfrac{3}{8}. \]
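The joint pmf and the marginal pmfs are given in the following table:

              Y = 1     Y = 3     p_X(x)
  X = 0         0        1/8       1/8
  X = 1        3/8        0        3/8
  X = 2        3/8        0        3/8
  X = 3         0        1/8       1/8
  p_Y(y)       3/4       1/4        1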
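The joint and marginal pmfs of Example 4.2 can also be obtained by brute-force enumeration of $\Omega = \{H, T\}^3$. The sketch below is one way to do this in Python; the variable names `joint`, `p_x` and `p_y` are illustrative choices, not notation from the notes. The marginals are computed by summing the joint pmf over the other variable, exactly as in the marginal pmf formulas.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of (X, Y): each of the 8 outcomes has probability 1/8.
joint = defaultdict(Fraction)
for outcome in product("HT", repeat=3):
    heads = outcome.count("H")
    tails = 3 - heads
    joint[(heads, abs(heads - tails))] += Fraction(1, 8)

# Marginal pmfs: sum the joint pmf over the other coordinate.
p_x = defaultdict(Fraction)
p_y = defaultdict(Fraction)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

for (x, y), p in sorted(joint.items()):
    print(f"P(X={x}, Y={y}) = {p}")
print("p_X:", dict(sorted(p_x.items())))
print("p_Y:", dict(sorted(p_y.items())))
```

Using `Fraction` keeps every probability exact, so the results can be compared directly with the hand computation.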
As in the one-variable case, the joint cdf can be determined from the joint pmf. Indeed, since
the image of $(X, Y)$ is the countable set $E = \{(x_i, y_j) : i = 0, 1, 2, \ldots,\ j = 0, 1, 2, \ldots\}$, we see that, for
any $(x, y) \in \mathbb{R}^2$,
\[ F_{(X,Y)}(x, y) = P(X \le x, Y \le y) = \sum_{x_i \le x,\ y_j \le y} P(X = x_i, Y = y_j) = \sum_{x_i \le x,\ y_j \le y} p_{(X,Y)}(x_i, y_j). \]
Figure 1. Joint pmf and marginal pmf.
Example 4.3. A fair die is rolled and a fair coin is tossed independently. Let $X$ be the face
value of the die and let
\[ Y = \begin{cases} 0, & \text{if tail turns up}, \\ 1, & \text{if head turns up}. \end{cases} \]
The joint pmf of $X$ and $Y$ is given by
\[ p_{(X,Y)}(x, y) = \begin{cases} \tfrac{1}{12}, & \text{if } (x, y) \text{ is in the image of } (X, Y), \\ 0, & \text{otherwise}. \end{cases} \]
Find the joint cdf of $X$ and $Y$.
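Solution: Observe that $X \in \{1, 2, 3, 4, 5, 6\}$ and $Y \in \{0, 1\}$. By using the relation
$F_{(X,Y)}(x, y) = \sum_{x_i \le x,\ y_j \le y} p_{(X,Y)}(x_i, y_j)$, we have
\[
F_{(X,Y)}(x, y) =
\begin{cases}
0, & x < 1,\ -\infty < y < \infty; \quad -\infty < x < \infty,\ y < 0 \\
\tfrac{1}{12}, & 1 \le x < 2,\ 0 \le y < 1 \\
\tfrac{1}{6}, & 2 \le x < 3,\ 0 \le y < 1; \quad 1 \le x < 2,\ y \ge 1 \\
\tfrac{1}{4}, & 3 \le x < 4,\ 0 \le y < 1 \\
\tfrac{1}{3}, & 4 \le x < 5,\ 0 \le y < 1; \quad 2 \le x < 3,\ y \ge 1 \\
\tfrac{5}{12}, & 5 \le x < 6,\ 0 \le y < 1 \\
\tfrac{1}{2}, & x \ge 6,\ 0 \le y < 1; \quad 3 \le x < 4,\ y \ge 1 \\
\tfrac{2}{3}, & 4 \le x < 5,\ y \ge 1 \\
\tfrac{5}{6}, & 5 \le x < 6,\ y \ge 1 \\
1, & x \ge 6,\ y \ge 1.
\end{cases}
\]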
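The joint cdf of Example 4.3 can be evaluated numerically straight from the joint pmf. The following is a small sketch of the summation formula for $F_{(X,Y)}$ in terms of $p_{(X,Y)}$; the function name `joint_cdf` and the dictionary representation of the pmf are illustrative assumptions, not part of the notes.

```python
from fractions import Fraction

# Joint pmf of Example 4.3: mass 1/12 at each (die face, coin value).
pmf = {(x, y): Fraction(1, 12) for x in range(1, 7) for y in (0, 1)}


def joint_cdf(pmf, x, y):
    """Sum the pmf over all support points (xi, yj) with xi <= x and yj <= y."""
    return sum(
        (p for (xi, yj), p in pmf.items() if xi <= x and yj <= y),
        Fraction(0),
    )


print(joint_cdf(pmf, 2.5, 0.5))  # 1/6
print(joint_cdf(pmf, 4, 1))      # 2/3
print(joint_cdf(pmf, 5.5, 1))    # 5/6
print(joint_cdf(pmf, 6, 1))      # 1
```

Evaluating one point in each region of the plane gives a quick way to check a piecewise cdf computed by hand.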
Definition 4.5. We say that $X$ and $Y$ are jointly continuous if there exists a nonnegative
function $f_{(X,Y)}(\cdot, \cdot)$, defined for all real $x$ and $y$, such that for every Borel set
$C \in \mathcal{B}(\mathbb{R}^2)$,
\[ P((X, Y) \in C) = \iint_{(x,y) \in C} f_{(X,Y)}(x, y) \, dx \, dy. \]
The function $f_{(X,Y)}(\cdot, \cdot)$ is called the joint probability density function (joint pdf) of $X$ and $Y$.
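Take $C = \{(x, y) : x \in A,\ y \in B\}$ where $A, B \in \mathcal{B}(\mathbb{R})$. Then we have
\[ P(X \in A, Y \in B) = \int_{B} \int_{A} f_{(X,Y)}(x, y) \, dx \, dy. \]
In particular, taking $A = (-\infty, a]$ and $B = (-\infty, b]$, we have
\[ F_{(X,Y)}(a, b) = P(X \in (-\infty, a],\ Y \in (-\infty, b]) = \int_{-\infty}^{b} \int_{-\infty}^{a} f_{(X,Y)}(x, y) \, dx \, dy. \]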
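To illustrate the relation between a joint pdf and its joint cdf, here is a minimal numerical sketch. The density used, $f(x, y) = e^{-x-y}$ for $x, y \ge 0$ and $0$ elsewhere, is a hypothetical example chosen for illustration and does not appear in the notes; for it the double integral has the closed form $(1 - e^{-a})(1 - e^{-b})$ when $a, b \ge 0$. The midpoint rule below is just one simple way to approximate the integral.

```python
import math


def density(x, y):
    """Hypothetical joint pdf (an assumption): exp(-x - y) on the positive quadrant."""
    return math.exp(-x - y) if x >= 0 and y >= 0 else 0.0


def joint_cdf(a, b, n=400):
    """Midpoint-rule approximation of the double integral of the density
    over (-inf, a] x (-inf, b].

    This density vanishes outside the positive quadrant, so it suffices
    to integrate over [0, a] x [0, b].
    """
    if a <= 0 or b <= 0:
        return 0.0
    hx, hy = a / n, b / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            total += density(x, y)
    return total * hx * hy


approx = joint_cdf(1.0, 2.0)
exact = (1 - math.exp(-1.0)) * (1 - math.exp(-2.0))
print(approx, exact)
```

The two printed values should agree to several decimal places; increasing `n` tightens the approximation at the cost of more function evaluations.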