The sample spaces considered so far were constructed in such a way that their points were observable. Thus, for any event, we were always able to tell whether it occurred or not.
The following examples show experiments and corresponding sample spaces with sample points that are only partially observable:
Example 1.8 Selection
Candidates for a certain job are characterized by their level $z$ of skills required for the job. The actual value of $z$ is not observable, though; what we observe is the candidate's score $x$ on a certain test. Thus, the sample point $s$ in the sample space $S$ is a pair $s = (z, x)$, and only one coordinate of $s$, namely $x$, is observable.
The objective might be to find selection thresholds $z_0$ and $x_0$ such that the rule “accept all candidates whose score $x$ exceeds $x_0$” would lead to maximizing the (unobservable) number of persons accepted whose true level of skill $z$ exceeds $z_0$. Naturally, to find such a solution, one needs to understand the statistical relation between the observable $x$ and the unobservable $z$.
Another example in which the points in the sample space are only partially observable concerns studies of the incidence of activities about which one may hesitate to respond truthfully, or even to respond at all. These are typically studies related to sexual habits or preferences, abortion, law and tax violations, drug use, and so on.
Let $Q$ be the activity analyzed, and assume that the researcher is interested in the frequency of persons who have ever participated in activity $Q$ (for simplicity, we will call them $Q$-persons). It ought to be stressed that the objective is not to identify the $Q$-persons, but only to find the proportion of such persons in the population.
A direct question such as “Are you a $Q$-person?” is not likely to be answered truthfully, if at all. It is therefore necessary to make the respondents feel safe by guaranteeing that their responses will reveal nothing about them as regards $Q$. This can be accomplished as follows: The respondent is given a pair of distinguishable dice, for example, one green and one white. The respondent throws them both at the same time, in such a way that the experimenter does not know the results of the toss (e.g., the dice are in a box and only the respondent looks into the box after it is shaken). The instruction is the following: If the green die shows an odd face (1, 3, or 5), then respond to the question “Are you a $Q$-person?” If the green die shows an even face (2, 4, or 6), then respond to the question “Does the white die show an ace (i.e., a 1)?” The scheme of this response is summarized by the flowchart in Figure 1.4.
The interviewer knows the answer, “yes” or “no,” but does not know whether it is the answer to the question about $Q$ or to the question about the white die. Here a natural sample space consists of points $s = (i, j, x)$, where $i$ and $j$ are the outcomes on the green and white die, respectively, while $x$ is 1 or 0 depending on whether or not the respondent is a $Q$-person. The observed response is “yes” if $x = 1$ and $i = 1$, 3, or 5 for any $j$, or if $i = 2$, 4, or 6 and $j = 1$ for any $x$. In all other cases, the response is “no.”
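The response rule can be checked by listing the sample points one by one. The following is only a minimal sketch: the function name `response` and the variable names `i`, `j`, `x` (green die, white die, and an indicator equal to 1 when the respondent belongs to the group of interest) are our own labels chosen for illustration, and the respondent is assumed to answer truthfully.

```python
from itertools import product


def response(i: int, j: int, x: int) -> str:
    """Observed answer for green die i, white die j, and hidden status x."""
    if i % 2 == 1:
        # Green die odd: the respondent answers the sensitive question.
        return "yes" if x == 1 else "no"
    # Green die even: the respondent answers "Does the white die show an ace?"
    return "yes" if j == 1 else "no"


# Enumerate all 6 * 6 * 2 = 72 sample points (i, j, x).
sample_space = list(product(range(1, 7), range(1, 7), (0, 1)))
yes_points = [s for s in sample_space if response(*s) == "yes"]

# Split the "yes" points by the hidden coordinate x.
yes_given_q = sum(1 for (_, _, x) in yes_points if x == 1)
yes_given_not_q = sum(1 for (_, _, x) in yes_points if x == 0)

print(len(sample_space), yes_given_q, yes_given_not_q)  # prints: 72 21 3
```

Of the 36 dice outcomes, 21 lead to “yes” when the hidden indicator is 1 and only 3 when it is 0, so the answer carries information about the group as a whole without revealing which question any one respondent answered.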
One might wonder what advantage, if any, there is in not knowing which question was asked and observing only the answer. Such a scheme would make no sense if we needed to know the truth about each individual respondent. However, one should remember that we are only after the overall frequency of $Q$-persons.
We are in fact “contaminating” the question by making the respondent answer either the $Q$-question or some other auxiliary question. But this is a “controlled contamination”: we know how often (on average) the respondents answer the auxiliary question, and how often the answer to it is “yes.” Consequently, as we will see in Chapter 11, we can still make an inference about the proportion of $Q$-persons based on the observed responses.
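To see why the contamination is controlled, note that with fair dice the sensitive question is answered with probability 1/2 and the auxiliary question with probability 1/2, and the auxiliary answer is “yes” with probability 1/6. If p denotes the unknown proportion of interest and all answers are truthful, then P(yes) = p/2 + 1/12, so that p = 2 P(yes) - 1/6. The simulation below is a sketch of this simple inversion under those assumptions; it is not necessarily the treatment given in Chapter 11, and the function names and the value `true_p = 0.2` are hypothetical choices made for illustration.

```python
import random


def simulate_yes_fraction(p: float, n: int, rng: random.Random) -> float:
    """Fraction of "yes" answers among n simulated truthful respondents."""
    yes = 0
    for _ in range(n):
        x = 1 if rng.random() < p else 0  # hidden status, never observed
        i = rng.randint(1, 6)             # green die
        j = rng.randint(1, 6)             # white die
        if i % 2 == 1:
            answer_yes = (x == 1)         # sensitive question
        else:
            answer_yes = (j == 1)         # auxiliary question about the ace
        yes += answer_yes
    return yes / n


def estimate_proportion(yes_fraction: float) -> float:
    """Invert P(yes) = p/2 + 1/12 to obtain p = 2 * P(yes) - 1/6."""
    return 2.0 * yes_fraction - 1.0 / 6.0


rng = random.Random(0)
true_p = 0.2  # assumed value, used only to generate the simulated data
frac = simulate_yes_fraction(true_p, 100_000, rng)
print(round(estimate_proportion(frac), 3))
```

Only the fraction of “yes” answers enters the estimate. In small samples this raw inversion can fall outside the interval from 0 to 1, in which case it would need to be truncated.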