Exam (worked solutions)

Sampling Solved Exercises 8 (Treatment of Non-response)

Uploaded on

26-01-2022

Written in

2021/2022

9 Treatment of Non-response

Non-response is an inevitable phenomenon in surveys. We distinguish total
non-response, which affects individuals for whom no usable information has
been collected, and partial non-response, which corresponds to ‘holes’ in the
information collected for a given individual (certain variables $y_k$ are
known, but others are not). In all cases, non-response generates a bias and
increases the variance, which varies more or less explicitly as a function of
the inverse of the number of respondents. There are two broad classes of
methods for correcting non-response: reweighting and imputation.

9.1 Reweighting methods
We denote by $\phi_k$ the probability of response of individual $k$. The
entire approach rests on the idea that the decision whether or not to respond
is random and can be formalised by a probability. To simplify, we assume here
that this probability depends only on individual $k$ (in reality, it could
well depend on the whole set of units sampled). If $\phi_k$ is known, then,
before any calibration, we estimate the total $Y$ without bias by:
$$\hat{Y}_\phi = \sum_{k \in r} \frac{y_k}{\pi_k \phi_k},$$

where $\pi_k$ is the usual inclusion probability and $r$ denotes the sample of
respondents ($r \subset S$). In practice, $\phi_k$ is unknown, so we model it
in order to estimate it. Many approaches are possible, but we often partition
the population $U$ into sub-populations $U_c$ within which the $\phi_k$ are
assumed to be constant:

$$\phi_k = \phi_c \quad \text{when } k \in U_c.$$

This is called a homogeneous response model. We can also model $\phi_k$ by,
for example, a logistic function if sufficiently reliable quantitative or
qualitative auxiliary information is available. Reweighting is mainly used to
treat total non-response.
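The reweighting estimator under a homogeneous response model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reweighted_total`, the dictionary keys and the toy data are hypothetical, and each $\phi_c$ is estimated by the unweighted response rate of its cell, which is one common choice rather than something prescribed by this section.

```python
from collections import defaultdict


def reweighted_total(sample):
    """Estimate the total Y by summing y_k / (pi_k * phi_c) over respondents.

    `sample` is a list of dicts with the (hypothetical) keys
      'cell': label c of the sub-population U_c containing the unit,
      'pi'  : inclusion probability pi_k,
      'y'   : observed value, or None for a non-respondent.
    """
    selected = defaultdict(int)
    responded = defaultdict(int)
    for unit in sample:
        selected[unit["cell"]] += 1
        if unit["y"] is not None:
            responded[unit["cell"]] += 1

    # Assumption: phi_c is estimated by the response rate observed in cell c.
    phi_hat = {c: responded[c] / selected[c] for c in selected}
    empty = [c for c, phi in phi_hat.items() if phi == 0]
    if empty:
        raise ValueError(f"no respondent in cell(s) {empty}; merge cells first")

    total = sum(
        unit["y"] / (unit["pi"] * phi_hat[unit["cell"]])
        for unit in sample
        if unit["y"] is not None
    )
    return total, phi_hat


# Toy data (invented for illustration): 4 units drawn with pi_k = 0.2.
toy_sample = [
    {"cell": "A", "pi": 0.2, "y": 10},
    {"cell": "A", "pi": 0.2, "y": None},  # total non-response
    {"cell": "B", "pi": 0.2, "y": 4},
    {"cell": "B", "pi": 0.2, "y": 6},
]
estimate, rates = reweighted_total(toy_sample)
print(estimate, rates)
```

In the toy data, the single respondent of cell A stands in for the non-respondent of the same cell, so its weight is doubled, while the weights in cell B are unchanged.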

9.2 Imputation methods
In contrast to reweighting, imputation directly models the behaviour of $y_k$
using a vector of auxiliary information $\mathbf{x}_k$. For example, we write
(a so-called ‘superpopulation’ model):
$$y_k = \psi(\mathbf{x}_k^\top \mathbf{b}) + z_k,$$
where $\psi$ is a known function and $z_k$ is a random variable with zero
expected value and variance $\sigma^2$. We use the information from the
respondents to estimate $\mathbf{b}$ and $\sigma^2$, and we predict $y_k$ for
each non-respondent $k$ by $y_k^*$. Finally, we calculate:
$$\hat{Y}_I = \sum_{k \in r} \frac{y_k}{\pi_k} + \sum_{\substack{k \in S \\ k \notin r}} \frac{y_k^*}{\pi_k},$$
which preserves the initial weights. If, within each sub-population, we
believe in the model $y_k = b + z_k$, we can impute $y_k^* = y_\ell$, where
$\ell$ is an identifier selected at random in the respondent sub-population:
this technique is called ‘hot deck’. The quality of $\hat{Y}_I$ is studied by
bringing into play the random variable $z_k$. Imputation is mainly used to
treat partial non-response.
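A random hot deck within sub-populations, followed by the imputed estimator, can be sketched as follows. The function name `hot_deck_total`, the dictionary keys and the toy data are hypothetical, and drawing donors with replacement is an assumption of this sketch, since the text only says that the donor is selected at random among the respondents of the sub-population.

```python
import random


def hot_deck_total(sample, rng):
    """Imputed estimator: observed y_k for respondents, donor value otherwise.

    `sample` is a list of dicts with the (hypothetical) keys
      'cell': label of the sub-population used for imputation,
      'pi'  : inclusion probability pi_k,
      'y'   : observed value, or None when the value is missing.
    `rng` is a random.Random instance, passed in so runs are reproducible.
    """
    # Pool of donor values (respondents) in each sub-population.
    donors = {}
    for unit in sample:
        if unit["y"] is not None:
            donors.setdefault(unit["cell"], []).append(unit["y"])

    total = 0.0
    imputed = []
    for unit in sample:
        if unit["y"] is not None:
            value = unit["y"]
        else:
            if unit["cell"] not in donors:
                raise ValueError(f"no donor available in cell {unit['cell']!r}")
            # Hot deck: copy the value of a respondent drawn at random
            # (with replacement) from the same sub-population.
            value = rng.choice(donors[unit["cell"]])
            imputed.append(value)
        # The initial weight 1 / pi_k is kept for every sampled unit.
        total += value / unit["pi"]
    return total, imputed


# Toy data (invented for illustration).
toy_sample = [
    {"cell": "A", "pi": 0.5, "y": 12},
    {"cell": "A", "pi": 0.5, "y": 8},
    {"cell": "A", "pi": 0.5, "y": None},  # missing value, imputed from cell A
    {"cell": "B", "pi": 0.5, "y": 5},
    {"cell": "B", "pi": 0.5, "y": None},  # missing value, imputed from cell B
]
estimate, imputed_values = hot_deck_total(toy_sample, random.Random(2022))
print(estimate, imputed_values)
```

Because the imputed values are drawn at random, the estimate changes from one run to another unless the generator is seeded, which is one way of seeing why the quality of the estimator has to be studied through the random variable of the model.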

EXERCISES
Exercise 9.1 Weight of an aeroplane
We wish to estimate the total weight of the 250 passengers on a charter
flight. To do so, we select a simple random sample of 25 people, whom we
intend to ask for their height (in centimetres) and their weight (in
kilograms). Five people refuse to respond, but we can nevertheless record
their gender (1: male, 2: female). Among the others, five give their height
but do not want to state their weight. The data collected are presented in
Table 9.1.
1. What methods can we use to correct the effects of non-response? Justify
your choices precisely, explaining the models that you use.
Perform the numerical applications.
2. You learn that 130 passengers are men and 120 are women. Would you
modify your estimation method? Why?
3. Among the 10 non-respondents for weight, we select a simple random sample
consisting of individuals b, g, w, x. Using a particularly persuasive
interviewer, we get them to disclose their height and their weight. This
complementary information is given in Table 9.2. How can we take it into
account?


Table 9.1. Sample of 25 selected individuals: Exercise 9.1
(a dash marks a value that was not collected)
Individual  Gender  Height (cm)  Weight (kg)
a           1       170          60
b           1       170          -
c           1       180          70
d           1       190          80
e           1       190          80
f           1       170          70
g           1       170          -
h           1       180          80
i           1       180          80
j           1       180          80
k           1       180          -
l           1       190          -
m           1       190          90
n           2       150          40
o           2       160          50
p           2       170          60
q           2       150          50
r           2       160          60
s           2       180          70
t           2       180          -
u           1       -            -
v           1       -            -
w           2       -            -
x           2       -            -
y           2       -            -

Table 9.2. Complementary information for four individuals: Exercise 9.1
Individual  Gender  Weight (kg)  Height (cm)
b           1       80           170
g           1       100          170
w           2       90           180
x           2       60           150

Solution

1. Two types of non-response appear: total non-response for individuals u to
y, and partial non-response for b, g, k, l and t. Total non-response is
generally treated by modifying the weights of the respondents (the
‘reweighting’ technique). Since only the gender variable is known, the best
we can do is to construct cells based on gender. This practice can be
justified from two points of view:


• A ‘probabilistic’ point of view, which postulates that the respondents
of a given gender form a simple random subsample of the initially
selected sample (gender by gender), whose size equals the number of
respondents of that gender. A second approach, equivalent in terms of
the estimator, relies on a Bernoulli-type response model: all
individuals of a given gender have the same probability of response,
estimated by the response rate of that gender (maximum likelihood
estimator). A third way of adopting this point of view, again
equivalent in terms of the estimator, is to say that, conditionally on
gender, the weight variable and the ‘response’ variable are independent
(the decision not to respond does not depend on weight). With these
three approaches, the reweighting estimator of the mean weight is:

$$\hat{\bar{Y}}_\phi = \sum_{h=1,2} \frac{n_h}{n} \bar{Y}_{hr},$$

where $n$ is the sample size, $n_h$ is the number of selected people of
gender $h$ ($h = 1, 2$) and $\bar{Y}_{hr}$ is the average weight of the
respondents of gender $h$. If we treat the partial non-responses as
total non-responses, this estimator is theoretically unbiased provided
the probabilistic model is exact.
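• A more model-based point of view, which is less concerned with the
process that selects the non-respondents but postulates a statistical
model of the type:

$$y_{hi} = \mu_h + \varepsilon_{hi},$$

where $y_{hi}$ is the weight of individual $i$ of gender $h$, $\mu_h$
is the mean weight characteristic of gender $h$ and $\varepsilon_{hi}$
is a random variable with expected value 0 (a classical approach in
statistics: everything happens as if a random process had generated
$y_{hi}$ according to this model). The estimator is still
$\hat{\bar{Y}}_\phi$, but this time we are interested in its expected
value under the model, written here $E_m$:

$$E_m(\hat{\bar{Y}}_\phi) = \sum_{h=1,2} \frac{n_h}{n} E_m(\bar{Y}_{hr}) = \sum_{h=1,2} \frac{n_h}{n} \mu_h.$$

Therefore, taking the expected value $E_p$ under the sampling design,
for which $E_p(n_h/n) = N_h/N$ under simple random sampling ($N_h$
being the number of passengers of gender $h$ in the population),

$$E_p E_m(\hat{\bar{Y}}_\phi) = \sum_{h=1,2} \frac{N_h}{N} \mu_h = E_m(\bar{Y}) = E_p E_m(\bar{Y}).$$

We have $E_p E_m(\hat{\bar{Y}}_\phi - \bar{Y}) = 0$, and therefore
$\hat{\bar{Y}}_\phi$ remains ‘unbiased’ if we bring into play the
expected value under the model.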
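As a numerical illustration, the gender-cell reweighting estimator can be applied to the weights of Table 9.1, treating the partial non-responses as total non-responses. This is a sketch rather than the book's own numerical application, which is not visible in this excerpt: the names `weights_by_gender` and `reweighted_mean` are hypothetical, and the step from the estimated mean to the estimated total simply multiplies by the 250 passengers of the exercise.

```python
N = 250  # number of passengers on the flight (from the exercise statement)

# Weights in kg from Table 9.1, keyed by gender; None marks a missing weight.
weights_by_gender = {
    # individuals a to m, then u and v
    1: [60, None, 70, 80, 80, 70, None, 80, 80, 80, None, None, 90, None, None],
    # individuals n to t, then w, x and y
    2: [40, 50, 60, 50, 60, 70, None, None, None, None],
}


def reweighted_mean(cells):
    """Sum over cells h of (n_h / n) times the respondent mean of cell h."""
    n = sum(len(values) for values in cells.values())
    mean = 0.0
    for values in cells.values():
        observed = [w for w in values if w is not None]
        mean += (len(values) / n) * (sum(observed) / len(observed))
    return mean


mean_hat = reweighted_mean(weights_by_gender)
total_hat = N * mean_hat
print(f"estimated mean weight:  {mean_hat:.1f} kg")
print(f"estimated total weight: {total_hat:.0f} kg")
```

With 15 men selected (9 weights observed, mean 690/9 kg) and 10 women selected (6 weights observed, mean 55 kg), the estimated mean is 0.6 × 76.67 + 0.4 × 55 = 68 kg, that is about 17 000 kg for the 250 passengers.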
Partial non-response is generally treated by imputation, using a
behaviour model. In every case, we use the auxiliary information given by
the ‘height’ variable, which is strongly linked to weight. To treat the partial


Written for

Institution: Universidad De Santiago De Chile
Course: Muestreo

Document information

Uploaded on: 26 January 2022
Number of pages: 42
Written in: 2021/2022
Type: Exam (worked solutions)
Contains: Questions and answers

Topics

sampling
statistics

Seller: juanitomakambe, Universidad de Santiago de Chile
