Exam (elaborations)

Sampling Solved Exercises 5 (Calibration with Auxiliary Variable)

Grade

A+

Uploaded on

26-01-2022

Written in

2021/2022

6 Calibration with an Auxiliary Variable

6.1 Calibration with a qualitative variable

We assume that the sizes $N_h$, $h = 1, \dots, H$, of the $H$ categories of a qualitative
variable are known in the population. The qualitative variable defines $H$
parts $U_h$, $h = 1, \dots, H$, of the population, called post-strata. If the sample
$S$ is selected with a simple design without replacement (simple random sampling), then
the size of the sample falling in post-stratum $h$, $n_h = \#(U_h \cap S)$, has a
hypergeometric distribution. If we denote $Y_h$ as the true total of a variable $y$
over $U_h$, we can construct the post-stratified estimator of the total

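$$\hat{Y}_{post} = \sum_{\substack{h=1 \\ n_h > 0}}^{H} N_h \hat{\bar{Y}}_h,$$

where $\hat{\bar{Y}}_h = \hat{Y}_h / \hat{N}_h$ is the ratio of the estimated total to the estimated size of post-stratum $h$. With a simple design without replacement,

$$\hat{\bar{Y}}_h = \frac{1}{n_h} \sum_{k \in U_h \cap S} y_k.$$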

With a simple design without replacement, the post-stratified estimator is
unbiased provided that $n_h$ is non-zero for every $h$, and it
is all the more precise as the auxiliary variable is more strongly ‘linked’ to the variable of
interest. If $n$ is ‘large enough’, the variance of $\hat{Y}_{post}$ is approximately, for the
simple design without replacement:

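$$\operatorname{var}(\hat{Y}_{post}) \approx N^2 \left(1 - \frac{n}{N}\right) \frac{1}{n} \sum_{h=1}^{H} \frac{N_h}{N} S_{yh}^2 + N^2 \left(1 - \frac{n}{N}\right) \frac{1}{n^2} \sum_{h=1}^{H} \left(1 - \frac{N_h}{N}\right) S_{yh}^2,$$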


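and is estimated by

$$\widehat{\operatorname{var}}(\hat{Y}_{post}) = N^2 \left(1 - \frac{n}{N}\right) \frac{1}{n} \sum_{h=1}^{H} \frac{N_h}{N} s_{yh}^2 + N^2 \left(1 - \frac{n}{N}\right) \frac{1}{n^2} \sum_{h=1}^{H} \left(1 - \frac{N_h}{N}\right) s_{yh}^2,$$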

where

$$S_{yh}^2 = \frac{1}{N_h - 1} \sum_{k \in U_h} (y_k - \bar{Y}_h)^2,$$

and

$$s_{yh}^2 = \frac{1}{n_h - 1} \sum_{k \in U_h \cap S} (y_k - \hat{\bar{Y}}_h)^2.$$
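
The post-stratified estimator and its estimated variance can be sketched in a few lines of Python. This is only an illustration under simple random sampling without replacement: the function name `post_stratified_total` and its arguments are hypothetical, the variance estimate applies the factor $N^2$ to both sums, and it assumes every post-stratum contains at least two sampled units so that $s_{yh}^2$ is defined.

```python
from collections import defaultdict
from statistics import mean, variance


def post_stratified_total(y, strata, N_h, N):
    """Return (estimate, variance estimate) of the total of y.

    Assumes simple random sampling without replacement.
    y      : sampled values y_k
    strata : post-stratum label of each sampled unit
    N_h    : dict mapping each label to its known population size
    N      : population size
    """
    n = len(y)
    groups = defaultdict(list)
    for value, label in zip(y, strata):
        groups[label].append(value)

    # Point estimate: sum of N_h times the sample mean of post-stratum h,
    # taken over the post-strata that appear in the sample (n_h > 0).
    estimate = sum(N_h[h] * mean(vals) for h, vals in groups.items())

    # Corrected sample variances s2_yh (statistics.variance needs n_h >= 2).
    s2 = {h: variance(vals) for h, vals in groups.items()}

    fpc = 1 - n / N  # finite population correction
    term1 = fpc / n * sum(N_h[h] / N * s2[h] for h in s2)
    term2 = fpc / n**2 * sum((1 - N_h[h] / N) * s2[h] for h in s2)
    return estimate, N**2 * (term1 + term2)


# Toy data (made up for illustration): two post-strata of sizes 60 and 40.
est, var_est = post_stratified_total(
    y=[1, 2, 3, 10, 12],
    strata=["a", "a", "a", "b", "b"],
    N_h={"a": 60, "b": 40},
    N=100,
)
print(est, var_est)
```

The second term of the variance is of order $1/n^2$, so for large samples the first term dominates and the result is close to the variance of a stratified design with proportional allocation.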

6.2 Calibration with a quantitative variable
If the total $X$ of a quantitative variable $x$ is known, we can use this information
to construct a more precise estimator. If $\hat{X}_\pi$ and $\hat{Y}_\pi$ denote respectively the
Horvitz-Thompson estimators of the totals of variables $x$ and $y$, then we can
construct
• the difference estimator:
$$\hat{Y}_D = \hat{Y}_\pi + X - \hat{X}_\pi,$$

• the ratio estimator:
$$\hat{Y}_R = \hat{Y}_\pi \frac{X}{\hat{X}_\pi},$$

• the regression estimator:
$$\hat{Y}_{reg} = \hat{Y}_\pi + (X - \hat{X}_\pi)\,\hat{b},$$
where $\hat{b}$ is an estimator of the affine regression coefficient of $y$ on $x$:
$$b = \frac{S_{xy}}{S_x^2},$$
and
$$S_{xy} = \frac{1}{N - 1} \sum_{k \in U} (x_k - \bar{X})(y_k - \bar{Y}).$$

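We can choose, to estimate $b$:

$$\hat{b} = \frac{\displaystyle\sum_{k \in S} \frac{1}{\pi_k} \left(x_k - \frac{\hat{X}_\pi}{\hat{N}_\pi}\right) \left(y_k - \frac{\hat{Y}_\pi}{\hat{N}_\pi}\right)}{\displaystyle\sum_{k \in S} \frac{1}{\pi_k} \left(x_k - \frac{\hat{X}_\pi}{\hat{N}_\pi}\right)^2}.$$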


All of these estimators satisfy a fundamental property of calibration: they
estimate the total $X$ with zero variance (we speak of estimators
calibrated on $x$):
$$\hat{X}_D = \hat{X}_R = \hat{X}_{reg} = X.$$
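
As an illustration, the three estimators can be sketched in Python for the special case of simple random sampling, where $\pi_k = n/N$ and the weights $1/\pi_k$ cancel in $\hat{b}$. The function name `calibrated_totals` and the toy data are hypothetical. Passing `y = x` is a quick way to check the calibration property: each estimator then returns exactly the known total $X$.

```python
def calibrated_totals(x, y, N, X):
    """Difference, ratio and regression estimators of the total of y.

    Assumes simple random sampling without replacement (pi_k = n / N).
    x, y : sampled values
    N    : population size
    X    : known population total of x
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    X_pi = N * x_bar  # Horvitz-Thompson estimator of the total of x
    Y_pi = N * y_bar  # Horvitz-Thompson estimator of the total of y

    # Estimated regression coefficient (equal weights cancel out).
    b_hat = (sum((xk - x_bar) * (yk - y_bar) for xk, yk in zip(x, y))
             / sum((xk - x_bar) ** 2 for xk in x))

    return {
        "difference": Y_pi + X - X_pi,
        "ratio": Y_pi * X / X_pi,
        "regression": Y_pi + (X - X_pi) * b_hat,
    }


# Calibration check on made-up data: estimating the total of x itself.
x_sample = [1, 2, 4, 5]
print(calibrated_totals(x_sample, x_sample, N=20, X=70))
```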

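We can show that:

• $\operatorname{var}(\hat{Y}_D) = \operatorname{var}\left(\displaystyle\sum_{k \in S} \frac{y_k - x_k}{\pi_k}\right)$,

• $\operatorname{var}(\hat{Y}_R) \approx \operatorname{var}\left(\displaystyle\sum_{k \in S} \frac{y_k - \frac{Y}{X} x_k}{\pi_k}\right)$ ($n$ ‘large enough’),

• $\operatorname{var}(\hat{Y}_{reg}) \approx \operatorname{var}\left(\displaystyle\sum_{k \in S} \frac{(y_k - \bar{Y}) - b(x_k - \bar{X})}{\pi_k}\right)$ ($n$ ‘large enough’),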

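which amounts to using the general expressions of Chapter 3 with new
individual variables. Thus, with simple random sampling, we estimate these
variances with:
$$N^2 \left(1 - \frac{n}{N}\right) \frac{1}{n} \frac{1}{n - 1} \sum_{k \in S} (y_k - \alpha - \beta x_k)^2,$$

by taking, with $\hat{\bar{X}}$ and $\hat{\bar{Y}}$ the sample means of $x$ and $y$:
• $\alpha = 0$, $\beta = 1$ with $\hat{Y}_D$;
• $\alpha = 0$, $\beta = \hat{\bar{Y}} / \hat{\bar{X}}$ with $\hat{Y}_R$;
• $\alpha = \hat{\bar{Y}} - \hat{b}\,\hat{\bar{X}}$, $\beta = \hat{b} = \dfrac{\sum_{k \in S} (x_k - \hat{\bar{X}})(y_k - \hat{\bar{Y}})}{\sum_{k \in S} (x_k - \hat{\bar{X}})^2}$ with $\hat{Y}_{reg}$.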
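
The variance estimator above can be sketched in Python as a single function of $(\alpha, \beta)$, with a helper that returns the pair of coefficients for each estimator. The names `coefficients` and `residual_variance_estimate` are hypothetical, the data are made up, and the coefficients are computed from sample means, which is an assumption of this sketch.

```python
def coefficients(x, y, kind):
    """Return (alpha, beta) for the 'difference', 'ratio' or 'regression' estimator."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    if kind == "difference":
        return 0.0, 1.0
    if kind == "ratio":
        return 0.0, y_bar / x_bar
    if kind == "regression":
        b_hat = (sum((xk - x_bar) * (yk - y_bar) for xk, yk in zip(x, y))
                 / sum((xk - x_bar) ** 2 for xk in x))
        return y_bar - b_hat * x_bar, b_hat
    raise ValueError(f"unknown estimator: {kind}")


def residual_variance_estimate(x, y, N, alpha, beta):
    """N^2 (1 - n/N) / (n (n - 1)) times the sum of squared residuals."""
    n = len(y)
    rss = sum((yk - alpha - beta * xk) ** 2 for xk, yk in zip(x, y))
    return N**2 * (1 - n / N) * rss / (n * (n - 1))


# Made-up sample of size 4 from a population of size 40.
x_sample = [1, 2, 3, 4]
y_sample = [2, 4, 6, 9]
for kind in ("difference", "ratio", "regression"):
    alpha, beta = coefficients(x_sample, y_sample, kind)
    print(kind, residual_variance_estimate(x_sample, y_sample, 40, alpha, beta))
```

When $y$ is an exact affine function of $x$ in the sample, the regression residuals vanish and the estimated variance of the regression estimator is zero.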

EXERCISES
Exercise 6.1 Ratio
In a population of 10 000 businesses, we want to estimate the average sales
$\bar{Y}$. For that, we sample $n = 100$ businesses using simple random sampling.
Furthermore, we have at our disposal the auxiliary information ‘number of
employees’, denoted by $x$, for each business. The data at our disposal
are:
• $\bar{X} = 50$ employees (true mean of the $x_k$),
• $\hat{\bar{Y}} = 5.2 \times 10^6$ Euros (average sales in the sample),
• $\hat{\bar{X}} = 45$ employees (sample mean),
• $s_y^2 = 25 \times 10^{10}$ (corrected sample variance of the $y_k$),
• $s_x^2 = 15$ (corrected sample variance of the $x_k$),
• $\hat{\rho} = 0.80$ (linear correlation coefficient between $x$ and $y$ calculated in the
sample).

1. What is the ratio estimator? (We denote it $\hat{\bar{Y}}_R$.) Is this estimator
biased?
2. Recall the ‘true’ variance formula for this estimator.
3. Calculate an estimate of the true variance. Is the variance estimator used
biased?
4. Give a 95% confidence interval for $\bar{Y}$.

Solution

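1. By definition:

$$\hat{\bar{Y}}_R = \bar{X}\,\frac{\hat{\bar{Y}}}{\hat{\bar{X}}} = 50 \times \frac{5.2 \times 10^6}{45} \approx 5.8 \times 10^6 \text{ Euros}.$$

We have $\hat{\bar{Y}}_R > \hat{\bar{Y}}$ because the sample contains businesses that are on
average too small (in terms of employees), and thus with sales that are
a little too small. A priori, the estimator is biased: the term of order $1/n$
appearing in the bias is null when
$$\frac{S_x}{\bar{X}} = \rho\,\frac{S_y}{\bar{Y}}.$$
None of the terms of this equality can be estimated without bias, but an
order-of-magnitude calculation (itself subject to a bias of order $1/n$) compares:
$$\frac{s_x}{\hat{\bar{X}}} \approx 0.086 \quad \text{and} \quad \hat{\rho}\,\frac{s_y}{\hat{\bar{Y}}} \approx 0.077.$$
Numerically, the two values are close, which suggests that the bias must
be very small.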
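
The arithmetic of item 1 can be checked with a short Python script that uses only the figures given in the exercise; the variable names are chosen here for illustration.

```python
from math import sqrt

X_bar = 50.0     # true mean number of employees
y_bar = 5.2e6    # sample mean of sales (Euros)
x_bar = 45.0     # sample mean number of employees
s2_y = 25e10     # corrected sample variance of y
s2_x = 15.0      # corrected sample variance of x
rho_hat = 0.80   # sample correlation between x and y

# Ratio estimator of the mean: known mean of x times the sample ratio.
ratio_estimate = X_bar * y_bar / x_bar

# The two sides of the bias condition, evaluated on the sample.
cv_x = sqrt(s2_x) / x_bar
rho_cv_y = rho_hat * sqrt(s2_y) / y_bar

print(f"ratio estimate: {ratio_estimate:.3e}")
print(f"s_x / x_bar: {cv_x:.3f}")
print(f"rho * s_y / y_bar: {rho_cv_y:.3f}")
```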
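2. For $n$ ‘large’, we have:
$$\operatorname{var}(\hat{\bar{Y}}_R) \approx \frac{1 - f}{n} S_u^2 = \frac{1 - f}{n} \left[S_y^2 + R^2 S_x^2 - 2 R S_{xy}\right],$$
where $f = n/N$ is the sampling rate and $S_u^2$ is the population variance of the $u_i = y_i - R x_i$, with $R = \bar{Y} / \bar{X}$.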
3. We have
$$\widehat{\operatorname{var}}(\hat{\bar{Y}}_R) = \frac{1 - f}{n} \left[s_y^2 + \hat{R}^2 s_x^2 - 2 \hat{R} s_{xy}\right].$$
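
A numerical evaluation of the estimate in item 3 can be sketched in Python from the data of the exercise. Two quantities are not given directly and are assumptions of this sketch: the ratio is replaced by the sample ratio of the two sample means, and the sample covariance $s_{xy}$ is recovered from the sample correlation as $\hat{\rho}\, s_x s_y$.

```python
from math import sqrt

N, n = 10_000, 100
y_bar, x_bar = 5.2e6, 45.0   # sample means of y and x
s2_y, s2_x = 25e10, 15.0     # corrected sample variances
rho_hat = 0.80               # sample correlation

f = n / N                                  # sampling rate
R_hat = y_bar / x_bar                      # assumption: sample ratio plugged in for R
s_xy = rho_hat * sqrt(s2_x) * sqrt(s2_y)   # assumption: covariance from correlation

var_hat = (1 - f) / n * (s2_y + R_hat**2 * s2_x - 2 * R_hat * s_xy)
std_error = sqrt(var_hat)

print(f"estimated variance: {var_hat:.3e}")
print(f"standard error: {std_error:.3e}")
```

Under these assumptions the estimated variance is about $9.1 \times 10^8$, that is, a standard error of roughly $3.0 \times 10^4$ Euros.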

Written for

Institution: Universidad De Santiago De Chile
Course: Muestreo

Document information

Uploaded on: January 26, 2022
File latest updated on: January 26, 2022
Number of pages: 53
Written in: 2021/2022
Type: Exam (elaborations)
Contains: Questions & answers
