Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP. Bhisham C. Gupta
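The population variance, denoted by $\sigma^2$ (read as sigma squared), is defined as
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2 \tag{2.5.6}$$
where $N$ is the population size and $\mu$ is the population mean. Further, the sample variance, denoted by $S^2$, is defined as
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \tag{2.5.7}$$
where $n$ is the sample size and $\bar{X}$ is the sample mean.
For computational purposes, we give below the simplified forms for the population variance and the sample variance.
$$\sigma^2 = \frac{1}{N}\left[\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}\right] \tag{2.5.8}$$
$$S^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right] \tag{2.5.9}$$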
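As a quick numerical check, the definitional form of the variance (average squared deviation about the mean) and the computational form (sum of squares minus the squared sum divided by the number of values) can be compared on a small data set. The sketch below is illustrative only: the function names and the data values are hypothetical and do not come from the text.

```python
def variance_definitional(data, sample=True):
    """Variance from the defining formula: squared deviations about the mean."""
    n = len(data)
    mean = sum(data) / n
    ss = sum((x - mean) ** 2 for x in data)
    # Divisor is n - 1 for a sample, n for a population
    return ss / (n - 1) if sample else ss / n


def variance_computational(data, sample=True):
    """Variance from the shortcut formula: sum of squares minus (sum)^2 / n."""
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    ss = total_sq - total ** 2 / n
    return ss / (n - 1) if sample else ss / n


# Hypothetical data, chosen only for illustration
values = [2, 4, 4, 4, 5, 5, 7, 9]

pop_def = variance_definitional(values, sample=False)
pop_comp = variance_computational(values, sample=False)
samp_def = variance_definitional(values, sample=True)
samp_comp = variance_computational(values, sample=True)

print(pop_def, pop_comp)
print(samp_def, samp_comp)
```

Both forms agree on this data set. In floating-point arithmetic the shortcut form can lose accuracy when the data values are large relative to their spread, so the definitional form is often the safer choice in software.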
Note that one difficulty in using the variance as the measure of dispersion is that the units for measuring the variance are not the same as those for data values. Rather, variance is expressed as a square of the units used for the data values. For example, if the data values are dollar amounts, then the variance will be expressed in squared dollars. Therefore, for application purposes, we define another measure of dispersion, called the standard deviation, that is directly related to the variance. We note that the standard deviation is measured in the same units as used for the data values (see (2.5.10) and (2.5.11) given below).
Standard Deviation
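The standard deviation is obtained by taking the positive square root of the variance. The population standard deviation $\sigma$ and the sample standard deviation $S$ are given by
$$\sigma = +\sqrt{\frac{1}{N}\left[\sum_{i=1}^{N} X_i^2 - \frac{\left(\sum_{i=1}^{N} X_i\right)^2}{N}\right]} \tag{2.5.10}$$
$$S = +\sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n}\right]} \tag{2.5.11}$$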
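In software, the variance and the standard deviation are usually available as library functions. The following minimal sketch uses Python's standard `statistics` module on hypothetical data values (not taken from the text) and confirms that each standard deviation is the square root of the corresponding variance.

```python
import math
import statistics

# Hypothetical data values (say, dollar amounts), chosen only for illustration
amounts = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(amounts)   # population variance, in squared dollars
pop_sd = statistics.pstdev(amounts)       # population standard deviation, in dollars
samp_var = statistics.variance(amounts)   # sample variance (divisor n - 1)
samp_sd = statistics.stdev(amounts)       # sample standard deviation

print(pop_var, pop_sd)
print(samp_var, samp_sd)
```

Note that the sample versions divide by one less than the number of values, so they are slightly larger than the population versions for the same data.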
Example 2.5.10 (Lengths of certain chips) The following data give the length (in millimeters) of material chips removed during a machining operation:
4, 2, 5, 1, 3, 6, 2, 4, 3, 5
Determine the variance and the standard deviation for these data.
Solution: There are three simple steps to calculate the variance of any data set.
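Step 1. Calculate $\sum X_i$, the sum of all the data values, that is,
$$\sum X_i = 4 + 2 + 5 + 1 + 3 + 6 + 2 + 4 + 3 + 5 = 35$$
Step 2. Calculate $\sum X_i^2$, the sum of squares of all the observations, that is,
$$\sum X_i^2 = 4^2 + 2^2 + 5^2 + 1^2 + 3^2 + 6^2 + 2^2 + 4^2 + 3^2 + 5^2 = 145$$
Step 3. Since the sample size is $n = 10$, by inserting the values $\sum X_i = 35$ and $\sum X_i^2 = 145$, calculated in Step 1 and Step 2, in formula (2.5.9), the sample variance is given by
$$S^2 = \frac{1}{10 - 1}\left[145 - \frac{(35)^2}{10}\right] = \frac{1}{9}(145 - 122.5) = 2.5$$
The standard deviation is obtained by taking the square root of the variance, that is,
$$S = \sqrt{2.5} \approx 1.58$$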
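Note: It is important to remember that the value of $S^2$, and therefore of $S$, is never negative; it equals zero only when all the data values are equal.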
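The three steps of Example 2.5.10 can be reproduced in a few lines of Python. This is a sketch for checking the arithmetic, not a procedure from the text; the variable names are illustrative. It applies the shortcut formula for the sample variance: the sum of squares minus the squared sum over the sample size, all divided by one less than the sample size.

```python
import math

# Chip lengths (mm) from Example 2.5.10
lengths = [4, 2, 5, 1, 3, 6, 2, 4, 3, 5]

n = len(lengths)                            # sample size
sum_x = sum(lengths)                        # Step 1: sum of the data values
sum_x2 = sum(x ** 2 for x in lengths)       # Step 2: sum of the squared data values
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)    # Step 3: shortcut formula for the sample variance
s = math.sqrt(s2)                           # standard deviation, in mm

print(n, sum_x, sum_x2)
print(s2, round(s, 2))
```

The variance is in squared millimeters, while the standard deviation is in millimeters, the same unit as the data.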
Empirical Rule
We now illustrate how the standard deviation of a data set helps us measure the variability of the data. If the data have a distribution that is approximately bell‐shaped, the following rule, known as the empirical rule, can be used to compute the percentage of data that will fall within $k$ standard deviations of the mean ($k = 1, 2, 3$):
1. About 68% of the data will fall within one standard deviation of the mean, that is, between $\mu - \sigma$ and $\mu + \sigma$.
2. About 95% of the data will fall within two standard deviations of the mean, that is, between $\mu - 2\sigma$ and $\mu + 2\sigma$.
3. About 99.7% of the data will fall within three standard deviations of the mean, that is, between $\mu - 3\sigma$ and $\mu + 3\sigma$.
Figure 2.5.3 illustrates these features of the empirical rule.
Figure 2.5.3 Application of the empirical rule.
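The percentages in the empirical rule are rounded areas under a normal curve. The sketch below computes them with `NormalDist` from Python's standard `statistics` module. It assumes an exactly normal distribution, which is stronger than the approximately bell‐shaped condition stated in the rule, so real data will match these figures only roughly.

```python
from statistics import NormalDist

# Standard normal distribution as the idealized bell-shaped model (an assumption)
z = NormalDist(mu=0.0, sigma=1.0)

# Proportion of a normal population within k standard deviations of the mean
coverage = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} standard deviation(s): {p:.4f}")
```

Rounded to the nearest tenth of a percent, these proportions give the 68%, 95%, and 99.7% figures quoted in the rule.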
For the case where μ and σ are unknown, the empirical rule is of the same form, but μ is replaced by the sample mean $\bar{X}$ and σ is replaced by the sample standard deviation $S$.
Example 2.5.11 (Soft drinks) A soft‐drink filling machine is used to fill 16‐oz soft‐drink bottles. The amount of beverage varies slightly from bottle to bottle, and it is assumed that the actual amount of beverage in a bottle follows a bell‐shaped distribution with a mean of 15.8 oz and a standard deviation of 0.15 oz. Use the empirical rule to find what percentage of bottles contain between 15.5 and 16.1 oz of beverage.
Solution: From the information provided to us in this problem, we have $\mu = 15.8$ oz and $\sigma = 0.15$ oz.
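As a quick check of this example, the limits 15.5 oz and 16.1 oz can be expressed in standard deviations from the mean, using the mean of 15.8 oz and the standard deviation of 0.15 oz stated in the problem. The sketch below is illustrative; the variable names and the lookup table of empirical‐rule percentages are conveniences introduced here, not part of the text.

```python
mu = 15.8      # mean fill (oz), as stated in the example
sigma = 0.15   # standard deviation (oz), as stated in the example
low, high = 15.5, 16.1

# Distance of each limit from the mean, in units of the standard deviation
k_low = (mu - low) / sigma
k_high = (high - mu) / sigma

# Empirical-rule percentages for k = 1, 2, 3 standard deviations
empirical = {1: 68.0, 2: 95.0, 3: 99.7}

# Both limits lie the same whole number of standard deviations from the mean,
# so the rule applies directly (rounding absorbs floating-point noise)
percent = empirical[round(k_low)]

print(round(k_low, 6), round(k_high, 6))
print(percent)
```

Both limits lie two standard deviations from the mean, so by the empirical rule about 95% of the bottles contain between 15.5 and 16.1 oz.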