Exam (elaborations)

Sampling Solved Exercises 4 (Multi-Stage Sampling)

Uploaded on

26-01-2022

Written in

2021/2022

5 Multi-stage Sampling

5.1 Definitions

We consider a partition of the population $U$ into $M$ parts, called primary
units (PU). Each PU $i$ is itself partitioned into $N_i$ parts, called secondary
units (SU), identified by the pair $(i, k)$, where $k$ varies from 1 to $N_i$.
The population of secondary units in PU $i$ is denoted $U_i$. Each SU can in
turn be partitioned, and the process iterated. We sample $m$ PUs (sample $S$);
then, in general independently from one PU to another, we sample $n_i$ SUs in
each sampled PU $i$ (sample $S_i$): this is called two-stage sampling. If the
final stage is sampled exhaustively, the design is called 'cluster sampling'.
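The two-stage selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both stages use simple random sampling without replacement, and the function name `two_stage_sample`, the argument `n_per_pu`, and the toy population are hypothetical, not taken from the text.

```python
import random

def two_stage_sample(population, m, n_per_pu, seed=None):
    """Draw a two-stage sample without replacement.

    population: dict mapping each PU identifier i to the list of its SU values.
    m: number of PUs drawn at the first stage.
    n_per_pu: function i -> n_i, number of SUs drawn in PU i.
    Returns {i: sorted list of sampled SU positions k} for the sampled PUs.
    """
    rng = random.Random(seed)
    sampled_pus = rng.sample(sorted(population), m)  # first stage: sample S
    return {
        # second stage: sample S_i, drawn independently in each sampled PU
        i: sorted(rng.sample(range(len(population[i])), n_per_pu(i)))
        for i in sampled_pus
    }

# Hypothetical population: M = 4 PUs of unequal sizes N_i.
population = {
    "A": [3, 5, 4],
    "B": [10, 12, 11, 9],
    "C": [7, 7],
    "D": [1, 2, 3, 4, 5],
}

# Two SUs per sampled PU: ordinary two-stage sampling.
sample = two_stage_sample(population, m=2, n_per_pu=lambda i: 2, seed=0)

# n_i = N_i: the final stage is exhaustive, which gives cluster sampling.
cluster = two_stage_sample(
    population, m=2, n_per_pu=lambda i: len(population[i]), seed=0
)
```

Setting `n_per_pu` to return the full PU size reproduces the cluster-sampling special case with no other change.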

5.2 Estimator, variance decomposition, and variance

In a two-stage sampling design without replacement, if PU $i$ is selected with
inclusion probability $\pi_i$, and if SU $(i, k)$ that it contains is selected
with probability $\pi_{k|i}$, then we estimate the total

$$Y = \sum_{i=1}^{M} \sum_{k \in U_i} y_{i,k}$$

without bias by

$$\hat{Y} = \sum_{i \in S} \sum_{k \in S_i} \frac{y_{i,k}}{\pi_i \, \pi_{k|i}}.$$
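A minimal sketch of this estimator in Python follows. The function name `estimate_total`, the dictionary layout, and the numerical values are assumptions made for illustration; the inclusion probabilities are those of simple random sampling at both stages ($\pi_i = m/M$, $\pi_{k|i} = n_i/N_i$).

```python
def estimate_total(sample, pi_pu, pi_su_given_pu):
    """Two-stage estimator of the total Y.

    sample: {i: {k: y_ik}} observed values for sampled PUs and SUs.
    pi_pu: {i: pi_i} first-stage inclusion probabilities.
    pi_su_given_pu: {(i, k): pi_k|i} conditional second-stage probabilities.
    """
    return sum(
        y / (pi_pu[i] * pi_su_given_pu[(i, k)])
        for i, sus in sample.items()
        for k, y in sus.items()
    )

# Hypothetical data: M = 4 PUs, m = 2 sampled, n_i = 2 SUs in each.
sample = {"A": {0: 3, 2: 4}, "B": {1: 12, 3: 9}}
pi_pu = {"A": 2 / 4, "B": 2 / 4}                  # m / M
pi_su_given_pu = {
    ("A", 0): 2 / 3, ("A", 2): 2 / 3,             # n_A / N_A with N_A = 3
    ("B", 1): 2 / 4, ("B", 3): 2 / 4,             # n_B / N_B with N_B = 4
}

total_hat = estimate_total(sample, pi_pu, pi_su_given_pu)
```

Each observation is simply weighted by the inverse of its overall inclusion probability $\pi_i \pi_{k|i}$.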

The variance $\mathrm{var}(\hat{Y})$ is the sum of two terms, namely the
'inter-class' variance $\mathrm{var}_1(E_{2|1}(\hat{Y}))$ and the 'intra-class'
variance $E_1(\mathrm{var}_{2|1}(\hat{Y}))$, where 1 and 2 are the indices
representing the two successive sampling stages. In the case of simple random
sampling at each stage, when $n_i$ only depends on $i$, we show that:

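$$\mathrm{var}(\hat{Y}) = M^2 \left(1 - \frac{m}{M}\right) \frac{S_T^2}{m} + \frac{M}{m} \sum_{i=1}^{M} N_i^2 \left(1 - \frac{n_i}{N_i}\right) \frac{S_{2,i}^2}{n_i},$$

where

$$S_T^2 = \frac{1}{M-1} \sum_{i=1}^{M} (Y_i - \bar{Y})^2,$$

$$\bar{Y} = \frac{1}{M} \sum_{i=1}^{M} Y_i,$$

and

$$S_{2,i}^2 = \frac{1}{N_i - 1} \sum_{k \in U_i} (y_{i,k} - \bar{Y}_i)^2,$$

with

$$\bar{Y}_i = \frac{Y_i}{N_i},$$

and

$$Y_i = \sum_{k \in U_i} y_{i,k}.$$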

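This variance can be estimated without bias by:

$$\widehat{\mathrm{var}}(\hat{Y}) = M^2 \left(1 - \frac{m}{M}\right) \frac{s_T^2}{m} + \frac{M}{m} \sum_{i \in S} N_i^2 \left(1 - \frac{n_i}{N_i}\right) \frac{s_{2,i}^2}{n_i},$$

where

$$s_T^2 = \frac{1}{m-1} \sum_{i \in S} \left( \hat{Y}_i - \frac{1}{m} \sum_{j \in S} \hat{Y}_j \right)^2,$$

and

$$s_{2,i}^2 = \frac{1}{n_i - 1} \sum_{k \in S_i} (y_{i,k} - \hat{\bar{Y}}_i)^2,$$

with

$$\hat{Y}_i = N_i \hat{\bar{Y}}_i,$$

and

$$\hat{\bar{Y}}_i = \frac{1}{n_i} \sum_{k \in S_i} y_{i,k}.$$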
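The estimator of the total and its variance estimator can be computed together, as in the sketch below. It assumes simple random sampling without replacement at both stages, at least two sampled PUs, and at least two sampled SUs per PU; the function name `two_stage_srs_estimates` and the data are hypothetical.

```python
def two_stage_srs_estimates(M, N, sample):
    """Estimated total and estimated variance under two-stage SRS.

    M: number of PUs in the population.
    N: {i: N_i} sizes of the sampled PUs.
    sample: {i: list of observed y_ik}, with n_i >= 2 and m >= 2.
    """
    m = len(sample)
    y_hat = {}   # estimated PU totals N_i * (sample mean in PU i)
    s2_2 = {}    # within-PU sample variances s_{2,i}^2
    for i, ys in sample.items():
        n_i = len(ys)
        mean_i = sum(ys) / n_i
        y_hat[i] = N[i] * mean_i
        s2_2[i] = sum((y - mean_i) ** 2 for y in ys) / (n_i - 1)

    total_hat = (M / m) * sum(y_hat.values())

    # s_T^2: sample variance of the estimated PU totals.
    mean_total = sum(y_hat.values()) / m
    s2_T = sum((t - mean_total) ** 2 for t in y_hat.values()) / (m - 1)

    between = M ** 2 * (1 - m / M) * s2_T / m
    within = (M / m) * sum(
        N[i] ** 2 * (1 - len(ys) / N[i]) * s2_2[i] / len(ys)
        for i, ys in sample.items()
    )
    return total_hat, between + within

# Hypothetical data: M = 4 PUs, two of them sampled, two SUs observed in each.
total_hat, var_hat = two_stage_srs_estimates(
    M=4, N={"A": 3, "B": 4}, sample={"A": [3, 4], "B": [12, 9]}
)
```

With so few units the variance estimate is very unstable; the example is only meant to show how each term of the formula maps to code.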

5.3 Specific case of sampling of PU with replacement
When the primary units are selected with replacement, we have a remarkable
result. Denote by $m$ the number of PU drawings, by $j$ the order number of a
drawing, and by $i_j$ the identifier of the PU selected at the $j$th drawing.
Denote also:

• $p_i$ as the sampling probability of PU $i$ at any given drawing, with

$$\sum_{i=1}^{M} p_i = 1;$$

• $\hat{Y}_i$ as the unbiased estimator of the true total $Y_i$ (its expression
depends on the sampling design within PU $i$).

We then estimate the true total without bias with the Hansen-Hurwitz
estimator:

$$\hat{Y}_{HH} = \frac{1}{m} \sum_{j=1}^{m} \frac{\hat{Y}_{i_j}}{p_{i_j}},$$

and we estimate its variance without bias by:

$$\widehat{\mathrm{var}}(\hat{Y}_{HH}) = \frac{1}{m(m-1)} \sum_{j=1}^{m} \left( \frac{\hat{Y}_{i_j}}{p_{i_j}} - \hat{Y}_{HH} \right)^2.$$

This very simple expression is valid whatever sampling design is used within
the PUs (we only require that $\hat{Y}_i$ be unbiased for $Y_i$).
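Both Hansen-Hurwitz formulas depend only on the ratios $\hat{Y}_{i_j}/p_{i_j}$, which the following sketch makes explicit. The function name `hansen_hurwitz` and the numbers are hypothetical; a PU drawn several times simply appears several times in the inputs.

```python
def hansen_hurwitz(y_hat, p):
    """Hansen-Hurwitz estimate of the total and its variance estimate.

    y_hat: estimated PU totals, one per drawing j = 1..m.
    p: drawing probabilities p_{i_j} of the PU selected at each drawing.
    Requires m >= 2 drawings.
    """
    m = len(y_hat)
    z = [y / pj for y, pj in zip(y_hat, p)]   # expanded values, one per drawing
    estimate = sum(z) / m
    variance = sum((zj - estimate) ** 2 for zj in z) / (m * (m - 1))
    return estimate, variance

# Hypothetical example: three drawings; the first and third hit the same PU.
hh_estimate, hh_variance = hansen_hurwitz(
    y_hat=[10, 30, 10], p=[0.1, 0.25, 0.1]
)
```

The variance estimate is just the sample variance of the expanded values divided by $m$, which is why no within-PU variance term appears.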

5.4 Cluster effect
The term 'cluster effect' refers to a certain 'similarity' among the
individuals of the same PU with respect to the variable of interest $y$. We
can formalise this by:

$$\rho = \frac{\sum_{i=1}^{M} \sum_{k=1}^{N_i} \sum_{\ell=1,\, \ell \neq k}^{N_i} (y_{i,k} - \bar{Y})(y_{i,\ell} - \bar{Y})}{\sum_{i=1}^{M} \sum_{k \in U_i} (y_{i,k} - \bar{Y})^2} \cdot \frac{1}{\bar{N} - 1},$$

where

$$\bar{N} = \frac{N}{M}$$

(in this formula $\bar{Y}$ is the mean of $y$ over all $N$ secondary units).
With simple random sampling without replacement at each of the two stages
and with PUs of the same size, we show that

$$\mathrm{var}(\hat{Y}) = N^2 \frac{S_y^2}{m \bar{n}} \left(1 + \rho(\bar{n} - 1)\right)$$

as soon as $n_i = \bar{n}$ for all PU $i$ (and provided we neglect the sampling
rate of the PUs). The cluster effect increases the variance, all the more so
as $\bar{n}$ is large.
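The coefficient $\rho$ and the inflation factor $1 + \rho(\bar{n} - 1)$ can be computed directly for a small population, as sketched below. The sketch assumes PUs of equal size $\bar{N}$ and centres the values on the mean over all secondary units; the function names and data are hypothetical.

```python
def intra_cluster_correlation(clusters):
    """Cluster-effect coefficient rho for PUs of equal size.

    clusters: list of lists, one list of y values per PU, all of length N_bar.
    """
    n_bar_pop = len(clusters[0])                     # common PU size N_bar
    all_y = [y for c in clusters for y in c]
    mean = sum(all_y) / len(all_y)                   # mean over all SUs

    numerator = 0.0
    for c in clusters:
        d = [y - mean for y in c]
        # Sum over pairs k != l of d_k * d_l equals (sum d)^2 - sum d^2.
        numerator += sum(d) ** 2 - sum(x * x for x in d)
    denominator = sum((y - mean) ** 2 for y in all_y)
    return numerator / (denominator * (n_bar_pop - 1))

def variance_inflation(rho, n_bar):
    """Factor 1 + rho * (n_bar - 1) appearing in the variance formula."""
    return 1 + rho * (n_bar - 1)

# Identical values inside each PU: strongest possible cluster effect.
rho_homogeneous = intra_cluster_correlation([[1, 1], [3, 3]])
# Every PU mirrors the whole population: negative cluster effect.
rho_heterogeneous = intra_cluster_correlation([[1, 3], [1, 3]])
```

With $\rho = 1$ and $\bar{n} = 5$ the factor equals 5: sampling five identical SUs in a PU carries no more information than sampling one.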

EXERCISES
Exercise 5.1 Hard disk
On a micro-computer hard disk, we count 400 files, each one consisting of
exactly 50 records. To estimate the average number of characters per record,
we select 80 files by simple random sampling, then 5 records in each file.
We denote $m = 80$ and $\bar{n} = 5$. After sampling we find:
• the sample variance of the estimators for the total number of characters
per file, $s_T^2 = 905\,000$;
• the mean of the $m$ sample variances $s_{2,i}^2$, equal to 805, where
$s_{2,i}^2$ is the sample variance of the number of characters per record
in file $i$.
1. How do we estimate without bias the mean number $\bar{Y}$ of characters per
record?
2. How do we estimate without bias the accuracy of the previous estimator?
3. Give a 95% confidence interval for $\bar{Y}$.

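Solution

1. We denote $y_{i,k}$ as the number of characters in record $k$ of file $i$.
We have

$$\bar{Y} = \frac{1}{N} \sum_{i=1}^{M} \sum_{k \in U_i} y_{i,k} = \frac{1}{N} \sum_{i=1}^{M} \bar{N}\, \bar{Y}_i = \frac{1}{M} \sum_{i=1}^{M} \bar{Y}_i,$$

where
• $M = 400$ is the number of files (primary units),
• $\bar{N} = 50$ is the number of records per file,
• $N = M \times \bar{N} = 400 \times 50 = 20\,000$ is the total number of records,
• $\bar{Y}_i$ is the mean number of characters per record in file $i$,
• $U_i$ is the set of identifiers for the records of file $i$.
We estimate $\bar{Y}$ without bias by

$$\hat{\bar{Y}} = \frac{\hat{Y}}{N} = \frac{1}{N} \sum_{i \in S_1} \frac{\hat{Y}_i}{m/M},$$

where
• $S_1$ is the sample of files,
• $\hat{Y}_i$ is the unbiased estimator of the total number of characters in file $i$:

$$\hat{Y}_i = \sum_{k \in S_i} \frac{y_{i,k}}{\bar{n}/\bar{N}} = \frac{\bar{N}}{\bar{n}} \sum_{k \in S_i} y_{i,k}$$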
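As a rough numerical check for question 2, the variance estimator of Section 5.2 can be evaluated with the data given in the statement. This is only a sketch under stated assumptions, not the text's own worked solution: the variable names are hypothetical, the sum of the $s_{2,i}^2$ over the sampled files is taken as $m$ times their stated mean, and the half-width relies on a normal approximation.

```python
import math

M, m = 400, 80            # files in the population, files sampled
N_bar, n_bar = 50, 5      # records per file, records sampled per file
s2_T = 905_000            # sample variance of the estimated file totals
mean_s2_2 = 805           # mean of the within-file sample variances

N = M * N_bar             # total number of records

# Estimated variance of the estimated total, term by term.
between = M ** 2 * (1 - m / M) * s2_T / m
within = (M / m) * N_bar ** 2 * (1 - n_bar / N_bar) * (m * mean_s2_2) / n_bar
var_total = between + within

# The estimated mean per record is the estimated total divided by N.
var_mean = var_total / N ** 2
std_error = math.sqrt(var_mean)

# Half-width of an approximate 95% interval (normal approximation).
half_width = 1.96 * std_error

print(f"estimated variance of the mean: {var_mean:.4f}")
print(f"standard error: {std_error:.3f}, 95% half-width: {half_width:.2f}")
```

Under these assumptions the between-file term accounts for roughly nine tenths of the estimated variance, so sampling more files would reduce it far more than sampling more records per file.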

Written for

Institution: Universidad De Santiago De Chile
Course: Muestreo

Document information

Uploaded on: January 26, 2022
File latest updated on: January 26, 2022
Number of pages: 49
Written in: 2021/2022
Type: Exam (elaborations)
Contains: Questions & answers

Subjects

sampling
solved
statistics
multi stage sampling

juanitomakambe Universidad de Santiago de Chile