The Statistical Analysis of Doubly Truncated Data. Prof Carla Moreira
EqSRounded in the DTDA package contains this dataset, which is used in Chapter 2.
1.4.4 Quasar Data
A classical motivating example of doubly truncated data, introduced by Efron and Petrosian (1999), arises in cosmology when recording the luminosity of quasars. A quasar is observed only if its luminosity lies within a certain interval whose two ends are determined by the detection limits of the observation devices, so the data suffer from double truncation. The original dataset studied by Efron and Petrosian (1999) comprises triplets $(X_i, U_i, V_i)$, where $X_i$ is the luminosity in the log‐scale, obtained from a transformation model based on the redshift and the apparent magnitude of the $i$th quasar. See Efron and Petrosian (1999) for further details on the transformation model. Due to experimental constraints, the distribution of each luminosity in the log‐scale is truncated to a known interval $[U_i, V_i]$. Specifically, quasars with apparent magnitude above an upper limit were too dim to yield dependable redshifts, and hence they were excluded from the study. In addition, a lower limit on the apparent magnitude was used to avoid confusion with non‐quasar stellar objects. Some descriptive statistics are provided in Table 1.4.
Table 1.3 Years to failure and number of failing units for the Equipment‐S Rounded Failure Time Data.
Years: | 0–4 | 5–9 | 10–14 | 15–19 | 20–24 | 25–29 | 30–34 |
---|---|---|---|---|---|---|---|
N. units: | 1 | 26 | 26 | 51 | 44 | 14 | 12 |
Table 1.4 Descriptive statistics for the Quasar Data. Luminosity in log‐scale ($X$) and observation interval $[U, V]$.
Variable | Min | Q1 | Q2 | Mean | Q3 | Max |
---|---|---|---|---|---|---|
X | 0.39 | 0.24 | 0.71 | 2.08 | ||
U | 0.26 | 0.75 | ||||
V | 0.15 | 1.78 | 2.10 | 1.95 | 2.36 | 2.58 |
The Quasar Data are used in Chapter 3. This classical example is also included in the DTDA package (dataset Quasars).
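To make the sampling mechanism behind the Quasar Data concrete, the following minimal sketch simulates double truncation: a triplet is recorded only when the variable of interest falls inside its own observation interval. It is an illustration under stated assumptions, not the transformation model of Efron and Petrosian (1999). The function name `simulate_doubly_truncated` and the distributions chosen for $X$, $U$ and $V$ are hypothetical and are not taken from the Quasar Data.

```python
import random


def simulate_doubly_truncated(n_population, seed=0):
    """Draw (x, u, v) triplets and keep only those with u <= x <= v."""
    rng = random.Random(seed)
    observed = []
    for _ in range(n_population):
        x = rng.gauss(0.0, 1.0)        # hypothetical latent variable of interest
        u = rng.uniform(-2.0, 0.5)     # hypothetical lower truncation limit
        v = u + rng.uniform(0.5, 2.0)  # hypothetical upper limit, always above u
        if u <= x <= v:                # double truncation: otherwise nothing is recorded
            observed.append((x, u, v))
    return observed


sample = simulate_doubly_truncated(10_000)
print(f"observed {len(sample)} of 10000 simulated units")
```

Only the retained triplets would be available to the analyst. The units that fail the condition leave no trace at all, not even a count, which is what distinguishes truncation from censoring.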
1.4.5 Parkinson's Disease Data
Clark et al. (2011) investigated the association between candidate single nucleotide polymorphisms (SNPs) and the age of onset of Parkinson's disease (PD). For this purpose, genomic DNA samples derived from human blood were obtained from the National Institute of Neurological Disorders and Stroke (NINDS) Human Genetics DNA and Cell Line Repository at the Coriell Institute for Medical Research (Camden, New Jersey). More specifically, one aim of the study was to detect an association between the rs8192678 PGC‐1a and A10398G SNPs and the risk or age of onset of PD.
Table 1.5 Parkinson's Disease Data: age of onset for genetic groups. Early onset: 35–55 years; late onset: 63–87 years. Sample size ($n$) and mean (and standard deviation, SD) for the age of onset (years).
Group | SNP | Alleles | $n$ | Mean (SD) |
---|---|---|---|---|
Early onset | A10398G | A | 76 | 46.93 |
Early onset | A10398G | G | 21 | 47.14 |
Early onset | PGC‐1a | A | 8 | 50.00 |
|