Applied Regression Modeling. Iain Pardoe


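The sampling distribution of the sample mean $M_Y$ is normal with mean 280 and standard deviation $50/\sqrt{30} = 9.129$. Then the standardized $Z$‐value from $M_Y$, $Z = (M_Y - 280)/9.129$, is standard normal with mean 0 and standard deviation 1. From the normal table in Section 1.2, the 90th percentile of a standard normal random variable is 1.282 (since the horizontal axis value of 1.282 corresponds to an upper‐tail area of 0.1). Then

$$
\begin{aligned}
\Pr(Z > 1.282) &= 0.10 \\
\Pr\!\left(\frac{M_Y - 280}{9.129} > 1.282\right) &= 0.10 \\
\Pr(M_Y > 280 + 1.282 \times 9.129) &= 0.10 \\
\Pr(M_Y > 291.703) &= 0.10.
\end{aligned}
$$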

Thus, the 90th percentile of the sampling distribution of $M_Y$ is 291.703, or 292 to the nearest whole unit. In other words, under repeated sampling, $M_Y$ has a distribution with an area of 0.90 to the left of 292 (and an area of 0.10 to the right of 292). This illustrates a crucial distinction between the distribution of population $Y$‐values and the sampling distribution of $M_Y$: the latter is much less spread out. For example, suppose for the sake of argument that the population distribution of $Y$ is normal (although this is not actually required for the central limit theorem to work). Then we can do a similar calculation to the one above to find the 90th percentile of this distribution (normal with mean 280 and standard deviation 50). In particular,

$$
\begin{aligned}
\Pr(Z > 1.282) &= 0.10 \\
\Pr\!\left(\frac{Y - 280}{50} > 1.282\right) &= 0.10 \\
\Pr(Y > 280 + 1.282 \times 50) &= 0.10 \\
\Pr(Y > 344.100) &= 0.10.
\end{aligned}
$$

Thus, the 90th percentile of the population distribution of $Y$ is 344.100, or 344 to the nearest whole unit. This is much larger than the value we got above for the 90th percentile of the sampling distribution of $M_Y$ (291.703). This is because the sampling distribution of $M_Y$ is less spread out than the population distribution of $Y$: the standard deviations for our example are 9.129 for the former and 50 for the latter. Figure 1.5 illustrates this point.

Figure 1.5 The central limit theorem in action. The upper density curve (a) shows a normal population distribution for $Y$ with mean 280 and standard deviation 50: the shaded area is 0.10, which lies to the right of the 90th percentile, 344.100. The lower density curve (b) shows a normal sampling distribution for $M_Y$ with mean 280 and standard deviation 9.129: the shaded area is also 0.10, which lies to the right of the 90th percentile, 291.703. It is not necessary for the population distribution of $Y$ to be normal for the central limit theorem to work; we have used a normal population distribution here just for the sake of illustration.
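The two 90th percentiles can be checked numerically. The following is a minimal sketch using only Python's standard library (`statistics.NormalDist`). The sample size of 30 is an assumption inferred from the stated standard deviation 9.129, which matches $50/\sqrt{30}$; the variable names are illustrative only.

```python
import math
from statistics import NormalDist

POP_MEAN = 280.0
POP_SD = 50.0
SAMPLE_SIZE = 30  # assumption: inferred from 9.129 being 50 / sqrt(30)

# Population distribution of Y (taken as normal purely for illustration).
population = NormalDist(mu=POP_MEAN, sigma=POP_SD)

# Sampling distribution of the sample mean: same mean, smaller spread.
sampling = NormalDist(mu=POP_MEAN, sigma=POP_SD / math.sqrt(SAMPLE_SIZE))

# 90th percentile = value with area 0.90 to its left.
pct90_population = population.inv_cdf(0.90)
pct90_sample_mean = sampling.inv_cdf(0.90)

print(f"SD of sample mean:           {sampling.stdev:.3f}")
print(f"90th percentile, population: {pct90_population:.3f}")
print(f"90th percentile, mean:       {pct90_sample_mean:.3f}")
```

The printed percentiles may differ from 344.100 and 291.703 in the second decimal place, because the hand calculation uses the rounded table value 1.282 while `inv_cdf` uses a more precise one.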

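We can again turn these calculations around. For example, what is the probability that $M_Y$ is greater than 291.703? To answer this, consider the following calculation:

$$
\begin{aligned}
\Pr(M_Y > 291.703) &= \Pr\!\left(\frac{M_Y - 280}{9.129} > \frac{291.703 - 280}{9.129}\right) \\
&= \Pr(Z > 1.282) \\
&= 0.10.
\end{aligned}
$$

So, the probability that $M_Y$ is greater than 291.703 is 0.10.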
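The same tail probability can be computed directly, and also checked by simulation with a population that is not normal. The sketch below makes two assumptions that are not in the text: a sample size of 30 (inferred from 9.129 being $50/\sqrt{30}$) and a gamma-shaped population chosen only because it is skewed while still having mean 280 and standard deviation 50. The function name `simulate_tail_probability` is hypothetical.

```python
import random
from statistics import NormalDist, fmean

POP_MEAN = 280.0
POP_SD = 50.0
SAMPLE_SIZE = 30   # assumption: inferred from 9.129 being 50 / sqrt(30)
CUTOFF = 291.703

sd_of_mean = POP_SD / SAMPLE_SIZE ** 0.5

# Analytic answer: standardize the cutoff, then take the upper-tail area.
z_value = (CUTOFF - POP_MEAN) / sd_of_mean
prob_normal = 1 - NormalDist().cdf(z_value)


def simulate_tail_probability(reps: int = 20000, seed: int = 1) -> float:
    """Estimate Pr(sample mean > CUTOFF) by repeated sampling.

    The population here is a gamma distribution (an illustrative,
    skewed choice) with mean POP_MEAN and standard deviation POP_SD.
    """
    rng = random.Random(seed)
    shape = (POP_MEAN / POP_SD) ** 2   # gamma shape: mean^2 / variance
    scale = POP_SD ** 2 / POP_MEAN     # gamma scale: variance / mean
    exceed = 0
    for _ in range(reps):
        sample_mean = fmean(
            rng.gammavariate(shape, scale) for _ in range(SAMPLE_SIZE)
        )
        if sample_mean > CUTOFF:
            exceed += 1
    return exceed / reps


prob_simulated = simulate_tail_probability()
print(f"z-value:               {z_value:.3f}")
print(f"normal approximation:  {prob_normal:.4f}")
print(f"simulated (gamma pop): {prob_simulated:.4f}")
```

Both probabilities should come out close to 0.10. The simulated value will not match exactly, since it carries Monte Carlo error and the central limit theorem is only an approximation at a finite sample size.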

1.4.2 Central limit theorem—t‐version

One major drawback to the normal version of the central limit theorem is that to use it we have to assume that we know the value of the population standard deviation of $Y$. A generalization of the standard normal distribution called Student's t‐distribution solves this problem. The density curve for a t‐distribution looks very similar to a normal density curve, but the tails tend to be a little “thicker,” that is, t‐distributions are a little more spread out than the normal distribution. This “extra variability” is controlled by an integer number called the degrees of freedom. The smaller this number, the more spread out the t‐distribution density curve (conversely, the higher the degrees of freedom, the more like a normal density curve it looks).
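To see how the degrees of freedom control the "thickness" of the tails, one can compare upper-tail critical values for several t‐distributions against the standard normal value. Python's standard library has no t‐distribution, so the sketch below integrates the standard t density numerically with Simpson's rule and finds critical values by bisection. This is an illustrative approach, not the book's method, and the helper names `t_pdf`, `t_upper_tail`, and `t_critical` are hypothetical.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(x: float, df: int, steps: int = 2000) -> float:
    """Pr(T > x) for x >= 0: 0.5 minus the Simpson's-rule area over [0, x].

    steps must be even for Simpson's rule.
    """
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(upper_tail_area: float, df: int) -> float:
    """Critical value with the given upper-tail area (between 0 and 0.5).

    Bisection on [0, 100]; assumes the answer lies in that interval.
    """
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > upper_tail_area:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


z_crit = NormalDist().inv_cdf(0.90)
t_crits = {df: t_critical(0.10, df) for df in (2, 5, 29)}

for df, value in t_crits.items():
    print(f"t, df={df:>2}: upper-tail 0.10 critical value = {value:.3f}")
print(f"normal:   upper-tail 0.10 critical value = {z_crit:.3f}")
```

The critical values shrink toward the normal value as the degrees of freedom grow, which is the numerical counterpart of the thinner tails described above.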

For example, the following table shows critical values (i.e., horizontal axis values or percentiles) and tail areas for a t‐distribution with 29 degrees of freedom:

Probabilities (tail areas) and percentiles (critical values) for a t‐distribution with 29 degrees of freedom.

Upper‐tail area    0.1    0.05
