Sampling and Estimation from Finite Populations. Yves Tillé
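Let $a_k$ denote the number of times unit $k$ is selected in the sample, with $\mu_k = E_p(a_k) > 0$, $\Sigma_{kk} = \operatorname{var}_p(a_k)$ and $\Sigma_{k\ell} = \operatorname{cov}_p(a_k, a_\ell)$. The Hansen–Hurwitz estimator of the total $Y = \sum_{k \in U} y_k$ is
$$\widehat{Y}_{\mathrm{HH}} = \sum_{k \in U} \frac{y_k a_k}{\mu_k}$$
and is unbiased for $Y$.
The variance is
$$\operatorname{var}_p\left(\widehat{Y}_{\mathrm{HH}}\right) = \sum_{k \in U} \sum_{\ell \in U} \frac{y_k y_\ell}{\mu_k \mu_\ell} \Sigma_{k\ell}.$$
If $\Sigma_{k\ell} + \mu_k \mu_\ell > 0$ for all $k, \ell \in U$, this variance can be unbiasedly estimated by
$$\widehat{\operatorname{var}}\left(\widehat{Y}_{\mathrm{HH}}\right) = \sum_{k \in U} \sum_{\ell \in U} \frac{y_k a_k \, y_\ell a_\ell}{\mu_k \mu_\ell} \, \frac{\Sigma_{k\ell}}{\Sigma_{k\ell} + \mu_k \mu_\ell}.$$
Indeed, $E_p(a_k a_\ell) = \Sigma_{k\ell} + \mu_k \mu_\ell$, so the expectation of each term of the double sum equals the corresponding term of the variance.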
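There are two other possibilities for estimating the total without bias. To do this, we use the reduction function $r(\cdot)$ from $\mathbb{N}$ to $\{0, 1\}$, applied to the number of times $a_k$ that unit $k$ is selected:
$$r(a_k) = \begin{cases} 0 & \text{if } a_k = 0, \\ 1 & \text{if } a_k \geq 1. \end{cases} \tag{2.8}$$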
This function removes the multiplicity of units in the sense that units selected more than once in the sample are kept only once.
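As a small illustration of the reduction function, the sketch below builds the vector of multiplicities from a list of draws and reduces it to indicators of the distinct units. The population labels `0..4`, the list `draws`, and the helper name `reduce_counts` are hypothetical and chosen only for this example.

```python
from collections import Counter

def reduce_counts(a):
    """Apply the reduction function componentwise: 0 stays 0, any positive count becomes 1."""
    return [1 if a_k >= 1 else 0 for a_k in a]

N = 5                                    # hypothetical population size, units labelled 0..4
draws = [3, 0, 3, 1, 3]                  # hypothetical draws with replacement

counts = Counter(draws)
a = [counts.get(k, 0) for k in range(N)]  # number of times each unit is selected
r = reduce_counts(a)                      # 1 for each distinct selected unit, else 0

print(a)  # [1, 1, 0, 3, 0]
print(r)  # [1, 1, 0, 1, 0]
```

Unit 3 is drawn three times but counts only once after reduction, so the reduced sample has three units although five draws were made.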
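We then write
$$\pi_k = E_p[r(a_k)] = \Pr(a_k \geq 1)$$
for the probability that unit $k$ appears at least once in the sample, and
$$\pi_{k\ell} = E_p[r(a_k)\, r(a_\ell)] = \Pr(a_k \geq 1 \text{ and } a_\ell \geq 1)$$
for the probability that units $k$ and $\ell$ both appear at least once.
By keeping only the distinct units, we can then simply use the expansion estimator:
$$\widehat{Y} = \sum_{k \in U} \frac{y_k \, r(a_k)}{\pi_k}.$$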
Even if the design with replacement has a fixed sample size, in other words, if the total number of selections has zero variance,
$$\operatorname{var}_p\left(\sum_{k \in U} a_k\right) = 0,$$
the sample of distinct units does not necessarily have a fixed sample size. The expansion estimator is not necessarily more accurate than the Hansen–Hurwitz estimator.
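The two estimators can be compared exactly on a toy case. The sketch below assumes simple random sampling with replacement (`m` equal-probability draws from `N` units), so that each unit has expected count `m / N` and appears at least once with probability `1 - (1 - 1/N)**m`. The population values in `y` are hypothetical. All `N**m` ordered samples are enumerated, which gives exact means and variances rather than simulated ones.

```python
from fractions import Fraction
from itertools import product

y = [Fraction(v) for v in (1, 2, 6, 11)]   # hypothetical population values
N, m = len(y), 3                           # m draws with replacement (assumed design)
Y = sum(y)                                 # true total

mu = Fraction(m, N)                        # expected number of selections of each unit
pi = 1 - (1 - Fraction(1, N)) ** m         # probability of being selected at least once

def hansen_hurwitz(draws):
    # every draw contributes, so multiplicities are kept
    return sum(y[k] for k in draws) / mu

def expansion_distinct(draws):
    # only distinct units contribute, weighted by 1 / pi
    return sum(y[k] for k in set(draws)) / pi

samples = list(product(range(N), repeat=m))  # all ordered samples, equally likely
p = Fraction(1, len(samples))

def moments(estimator):
    mean = sum(p * estimator(s) for s in samples)
    var = sum(p * (estimator(s) - mean) ** 2 for s in samples)
    return mean, var

mean_hh, var_hh = moments(hansen_hurwitz)
mean_ex, var_ex = moments(expansion_distinct)
sizes = {len(set(s)) for s in samples}       # possible numbers of distinct units

print("total:", Y)
print("Hansen-Hurwitz mean, variance:", mean_hh, var_hh)
print("expansion (distinct units) mean, variance:", mean_ex, var_ex)
print("distinct-unit sample sizes:", sorted(sizes))
```

Both means equal the true total, and the number of distinct units varies from sample to sample even though the number of draws is fixed. Which of the two variances is smaller depends on the values in `y`, so the comparison should be rerun for other populations before drawing conclusions.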
A third solution consists of calculating the so‐called Rao–Blackwellized estimator. Without going into the technical details, it is possible to show that in the design‐based theory, the minimal sufficient statistic can be constructed by removing the information concerning the multiplicity of units. In other words, if a unit is selected several times in the sample with replacement, it is conserved only once (Basu & Ghosh, 1967; Basu, 1969; Cassel et al., 1977, 1993; Thompson & Seber, 1996, p. 35). Knowing a minimal sufficient statistic, one can then calculate the augmented estimator (also called the Rao–Blackwellized estimator) by conditioning an estimator with respect to the minimal sufficient statistic.
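Concretely, we calculate the conditional expectation
$$\nu_k = E_p[a_k \mid r(\mathbf{a})], \quad k \in U,$$
where $r(\mathbf{a}) = (r(a_1), \dots, r(a_N))^\top$. Since $a_k = 0$ implies $\nu_k = 0$, we can define the Rao–Blackwellized estimator
$$\widehat{Y}_{\mathrm{RB}} = \sum_{k \in U} \frac{y_k \nu_k}{\mu_k}. \tag{2.9}$$
This estimator is unbiased because
$$E_p(\nu_k) = E_p\{E_p[a_k \mid r(\mathbf{a})]\} = E_p(a_k) = \mu_k.$$
Moreover, since
$$\operatorname{var}_p\left(\widehat{Y}_{\mathrm{HH}}\right) = \operatorname{var}_p\left\{E_p\left[\widehat{Y}_{\mathrm{HH}} \mid r(\mathbf{a})\right]\right\} + E_p\left\{\operatorname{var}_p\left[\widehat{Y}_{\mathrm{HH}} \mid r(\mathbf{a})\right]\right\}$$
and
$$\operatorname{var}_p\left(\widehat{Y}_{\mathrm{RB}}\right) = \operatorname{var}_p\left\{E_p\left[\widehat{Y}_{\mathrm{HH}} \mid r(\mathbf{a})\right]\right\},$$
we have
$$\operatorname{var}_p\left(\widehat{Y}_{\mathrm{RB}}\right) \leq \operatorname{var}_p\left(\widehat{Y}_{\mathrm{HH}}\right).$$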
The Hansen–Hurwitz estimator should therefore in principle never be used. It is said that the Hansen–Hurwitz estimator is not admissible in the sense that it can always be improved by calculating its conditional expectation. However, this conditional expectation can sometimes be very complex to calculate. Rao–Blackwellization is at the heart of the theory of adaptive sampling, which can lead to multiple selections of the same unit in the sample (Thompson, 1990; Félix‐Medina, 2000; Thompson, 1991a; Thompson & Seber, 1996).
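For a design simple enough to enumerate, the conditional expectation can be computed by brute force. The sketch below assumes simple random sampling with replacement (`m` equal-probability draws from `N` units) and hypothetical population values `y`. It groups all ordered samples by their set of distinct units and averages the Hansen–Hurwitz estimator within each group, which is the conditional expectation here because all ordered samples are equally likely.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

y = [Fraction(v) for v in (1, 2, 6, 11)]   # hypothetical population values
N, m = len(y), 3                           # m draws with replacement (assumed design)
Y = sum(y)
mu = Fraction(m, N)                        # expected number of selections of each unit

def hansen_hurwitz(draws):
    return sum(y[k] for k in draws) / mu

samples = list(product(range(N), repeat=m))  # all ordered samples, equally likely
p = Fraction(1, len(samples))

# Group the Hansen-Hurwitz values by the set of distinct units (the reduced sample).
groups = defaultdict(list)
for s in samples:
    groups[frozenset(s)].append(hansen_hurwitz(s))

# Rao-Blackwellized value for each reduced sample: the within-group average.
rb = {d: sum(vals) / len(vals) for d, vals in groups.items()}

def moments(values):
    mean = sum(p * v for v in values)
    var = sum(p * (v - mean) ** 2 for v in values)
    return mean, var

mean_hh, var_hh = moments([hansen_hurwitz(s) for s in samples])
mean_rb, var_rb = moments([rb[frozenset(s)] for s in samples])

print("Hansen-Hurwitz mean, variance:", mean_hh, var_hh)
print("Rao-Blackwellized mean, variance:", mean_rb, var_rb)
```

Under this particular design the conditional expectation has a closed form: by symmetry each distinct unit has the same conditional expected count, so the Rao–Blackwellized estimate is `N` times the mean of the distinct sampled values. For unequal-probability or adaptive designs no such shortcut is generally available, and enumeration quickly becomes infeasible.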
Exercises
2.1 Show that
2.2 Let be a population with the following sampling design: Give the first‐order inclusion probabilities. Give the variance–covariance matrix of the indicator variables.
2.3 Let and have the following sampling design: Give the probability distributions of the expansion estimator and the Hájek estimator of the mean. Give the probability distributions of the two variance estimators of the expansion estimator and calculate their bias. Give the probability distributions of the two variance estimators of the expansion estimator of the mean in the case where .
2.4 Let $p(\cdot)$ be a sampling design without replacement applied to a population of size $N$. Let $\pi_k$ and $\pi_{k\ell}$ denote the first‐ and second‐order inclusion probabilities, respectively, and let $S$ be the random sample. Consider the following estimator: For which function of interest is this estimator unbiased?
2.5 For a design without replacement with strictly positive inclusion probabilities, construct an unbiased estimator for .
2.6 Let $U$ be a finite population and let $S$ be the random sample of $U$ obtained by means of a design with inclusion probabilities $\pi_k$ and $\pi_{k\ell}$. We suppose that this design is balanced on a variable $x$. In other words,
$$\sum_{k \in S} \frac{x_k}{\pi_k} = \sum_{k \in U} x_k = X. \tag{2.10}$$
The total of the variable of interest is
$$Y = \sum_{k \in U} y_k$$
and can be unbiasedly estimated by
$$\widehat{Y} = \sum_{k \in S} \frac{y_k}{\pi_k}.$$
Show that (2.11) What particular result do we obtain when ? Show that (2.12) What result is generalized by Expression (2.12)? Construct