Sampling Solved Exercises 8 (Treatment of Non-response)

Pages: 42
Grade: A+
Uploaded on: 26-01-2022
Written in: 2021/2022

9 Treatment of Non-response

Non-response is an inevitable phenomenon in surveys. We distinguish total
non-response, which affects individuals for whom no usable information was
collected, from partial non-response, which corresponds to ‘holes’ in the
information collected for a given individual (certain variables y_k are known,
but others are not). In all cases, non-response generates a bias and increases
the variance, which varies more or less explicitly as the inverse of the
number of respondents. There are two broad classes of methods to correct for
non-response: reweighting and imputation.


9.1 Reweighting methods
We denote by φ_k the probability of response of individual k. The entire
approach rests on the idea that the decision whether or not to respond is
random and is formalised by a probability which, to simplify, we assume here
depends only on individual k (in reality, it could well depend on the whole
set of sampled units). If φ_k is known, then, before any calibration, we
estimate the total Y without bias by:
\[ \hat{Y}_\phi = \sum_{k \in r} \frac{y_k}{\pi_k \phi_k}, \]

where π_k is the usual inclusion probability, and r denotes the sample of
respondents (r ⊂ S). In practice, φ_k is unknown, so we try to model it in
order to estimate it. Several approaches are possible, but often we partition
the population U into sub-populations U_c within which the φ_k are assumed
constant:

\[ \phi_k = \phi_c \quad \text{when } k \in U_c. \]

This is called a homogeneous response model. We can also model φ_k by a
logistic function (for example) if sufficiently reliable quantitative or
qualitative auxiliary information is available. Reweighting is essentially
used to treat total non-response.
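
The homogeneous response model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy values and the choice of the unweighted response rate as the estimate of φ_c are hypothetical and do not come from the text.

```python
from collections import defaultdict


def response_rates(classes, responded):
    """Estimate phi_c by the observed (unweighted) response rate in class c."""
    n_sampled = defaultdict(int)
    n_resp = defaultdict(int)
    for c, r in zip(classes, responded):
        n_sampled[c] += 1
        n_resp[c] += int(r)
    return {c: n_resp[c] / n_sampled[c] for c in n_sampled}


def reweighted_total(y, pi, classes, responded):
    """Sum of y_k / (pi_k * phi_hat_c) over the respondents only."""
    phi = response_rates(classes, responded)
    total = 0.0
    for yk, pik, c, r in zip(y, pi, classes, responded):
        if r:  # non-respondents contribute nothing; respondents are upweighted
            total += yk / (pik * phi[c])
    return total


# Hypothetical toy sample: 6 units, pi_k = 0.5, two response classes A and B.
y = [10.0, 12.0, None, 20.0, None, None]
pi = [0.5] * 6
classes = ["A", "A", "A", "B", "B", "B"]
responded = [value is not None for value in y]

estimate = reweighted_total(y, pi, classes, responded)
print(estimate)
```

Here the estimated response rates are 2/3 in class A and 1/3 in class B, so each respondent in class B stands in for three sampled units of its class.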


9.2 Imputation methods
Unlike reweighting, imputation directly models the behaviour of y_k by using
a vector of auxiliary information x_k. For example, we write (a so-called
‘superpopulation’ model):
\[ y_k = \psi(x_k b) + z_k, \]
where ψ is a known function and z_k is a random variable with zero expected
value and variance σ². We use the information on the respondents to estimate
b and σ², and we predict y_k, for each non-respondent k, by y_k^*. Lastly, we
calculate:
\[ \hat{Y}_I = \sum_{k \in r} \frac{y_k}{\pi_k} + \sum_{\substack{k \in S \\ k \notin r}} \frac{y_k^*}{\pi_k}, \]
which preserves the initial weights. If, within a given sub-population, we
believe in the model y_k = b + z_k, we can impute y_k^* = y_ℓ, where ℓ is an
identifier selected at random among the respondents of that sub-population:
this technique is called ‘hot deck’. The quality of \hat{Y}_I is studied by
bringing the random variable z_k into play. Imputation is essentially used
to treat partial non-response.
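
A minimal sketch of hot-deck imputation within sub-populations, followed by the imputed total with the initial weights 1/π_k, might look as follows. The function name, the seed argument and the toy data are assumptions made for illustration; they are not taken from the text.

```python
import random


def hot_deck_total(y, pi, classes, seed=0):
    """Impute each missing y_k with a random respondent value from the same
    class, then sum y_k / pi_k over the whole sample (weights unchanged)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Pool of donor values (respondents) in each class.
    donors = {}
    for yk, c in zip(y, classes):
        if yk is not None:
            donors.setdefault(c, []).append(yk)

    total = 0.0
    imputed = []
    for yk, pik, c in zip(y, pi, classes):
        if yk is None:
            # Raises KeyError if a class has no respondent at all.
            yk = rng.choice(donors[c])
            imputed.append(yk)
        total += yk / pik
    return total, imputed


# Hypothetical toy sample: pi_k = 0.5 for every unit, two classes A and B.
y = [10.0, 12.0, None, 20.0, None, 24.0]
pi = [0.5] * 6
classes = ["A", "A", "A", "B", "B", "B"]

total, imputed = hot_deck_total(y, pi, classes)
print(total, imputed)
```

Because the donor is drawn at random, the estimate depends on the draw; this extra randomness is part of what the study of the quality of the imputed estimator has to account for.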


EXERCISES
Exercise 9.1 Weight of an aeroplane
We wish to estimate the total weight of the 250 passengers on a charter
flight. To do so, we select a simple random sample of 25 people, whom we
intend to ask for their height (in centimetres) and their weight (in
kilograms). Five people refuse to respond, but we can nevertheless record
their gender (1: male and 2: female). Among the others, five gave their
height but did not want to state their weight. The collected data are
presented in Table 9.1.
1. What methods can we use to correct the effects of non-response? Justify
your decisions in a precise way, by explaining the models that you use.
Perform the numerical applications.
2. You learn that 130 passengers are men and 120 are women. Would you
modify your estimation method? Why?
3. Among the 10 non-responses for weight, we select a simple random sample
consisting of individuals b, g, w, x. Using a particularly persuasive
interviewer, we get them to disclose their height and their weight. This
complementary information is given in Table 9.2. How can we take this into
consideration?


Table 9.1. Sample of 25 selected individuals: Exercise 9.1 (-: value not collected)
Individual  Gender  Height (cm)  Weight (kg)
a           1       170          60
b           1       170          -
c           1       180          70
d           1       190          80
e           1       190          80
f           1       170          70
g           1       170          -
h           1       180          80
i           1       180          80
j           1       180          80
k           1       180          -
l           1       190          -
m           1       190          90
n           2       150          40
o           2       160          50
p           2       170          60
q           2       150          50
r           2       160          60
s           2       180          70
t           2       180          -
u           1       -            -
v           1       -            -
w           2       -            -
x           2       -            -
y           2       -            -


Table 9.2. Complementary information for four individuals: Exercise 9.1
Individual  Gender  Weight (kg)  Height (cm)
b           1       80           170
g           1       100          170
w           2       90           180
x           2       60           150



Solution

1. Two types of non-response appear: total non-response for individuals u to
y, and partial non-response for b, g, k, l and t. Total non-response is
generally treated by modifying the weights of the respondents (the
‘reweighting’ technique). Since only the gender variable is known for these
individuals, we can, at best, construct cells based on gender. To justify
this practice, two points of view are possible:


• A ‘probabilistic’ point of view, which postulates that the respondents of
a given gender in fact form a simple random sub-sample of the initially
selected sample (gender by gender), whose size is equal to the number of
respondents for the gender considered. A second approach, equivalent in
terms of the estimator, relies on a Bernoulli-type response model: all
individuals of a given gender have the same probability of response,
estimated by the response rate of that gender (maximum likelihood
estimator). A third way of adhering to this point of view, again equivalent
in terms of the estimator, consists of saying that, conditionally on gender,
the weight variable and the ‘response’ variable are independent (the
decision not to respond does not depend on the weight). With these three
approaches, the reweighting estimator of the mean weight is:
\[ \hat{\bar{Y}}_\phi = \sum_{h=1,2} \frac{n_h}{n} \bar{Y}_{hr}, \]
where n_h is the number of selected people of gender h (h = 1, 2), n is the
sample size and \bar{Y}_{hr} is the average weight of the respondents of
gender h. If we treat the partial non-responses as total non-responses, this
estimator is theoretically unbiased provided the probabilistic model is
exact.
• A more ‘model-based’ point of view, which is less interested in the process
of selecting the non-respondents but which postulates a statistical model of
the type:
\[ y_{hi} = \mu_h + \varepsilon_{hi}, \]
where y_{hi} is the weight of individual i of gender h, μ_h is the ‘mean’
weight characteristic of gender h and ε_{hi} is a random variable whose
expected value is 0 (this is a classical approach in statistics: everything
happens as if a random process had generated y_{hi} according to this
model). The estimator is still \hat{\bar{Y}}_\phi, but this time we are
interested in its expected value under the model, written E_M here to
distinguish it from the expected value E_p under the sampling design:
\[ E_M(\hat{\bar{Y}}_\phi) = \sum_{h=1,2} \frac{n_h}{n} E_M(\bar{Y}_{hr}) = \sum_{h=1,2} \frac{n_h}{n} \mu_h. \]
Therefore, since n_h/n is unbiased for N_h/N under simple random sampling,
\[ E_p E_M(\hat{\bar{Y}}_\phi) = \sum_{h=1,2} \frac{N_h}{N} \mu_h = E_M(\bar{Y}) = E_p E_M(\bar{Y}). \]
We have E_p E_M(\hat{\bar{Y}}_\phi − \bar{Y}) = 0, and therefore
\hat{\bar{Y}}_\phi remains ‘unbiased’ if we bring into play the expected
value under the model.
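
As a numerical application of the reweighting estimator, the following sketch uses the weights reported in Table 9.1 and treats the five partial non-respondents (b, g, k, l, t) as total non-respondents, as discussed above. The variable names are illustrative assumptions; the data are read from the table.

```python
# Weights (kg) of the individuals who reported a weight, from Table 9.1.
weights_by_gender = {
    1: [60, 70, 80, 80, 70, 80, 80, 80, 90],  # men a, c, d, e, f, h, i, j, m
    2: [40, 50, 60, 50, 60, 70],              # women n, o, p, q, r, s
}
# Number of selected people per gender, respondents or not (n_h).
n_selected = {1: 15, 2: 10}
N = 250  # number of passengers on the flight

n = sum(n_selected.values())

# Reweighting estimator of the mean: sum over h of (n_h / n) * mean of
# the respondents of gender h.
mean_estimate = sum(
    n_selected[h] / n * sum(w) / len(w) for h, w in weights_by_gender.items()
)
# Estimated total weight of all passengers.
total_estimate = N * mean_estimate

print(round(mean_estimate, 6), round(total_estimate, 3))
```

With these data the respondent means are 690/9 kg for men and 55 kg for women, which gives an estimated mean of about 68 kg and an estimated total of about 17 000 kg.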
The partial non-response is generally treated by imputation, using a
behaviour model. In every case, we use the auxiliary information given by
the variable ‘height’, which is strongly linked to weight. To treat the partial
