Introduction to Linear Regression Analysis. Douglas C. Montgomery
TABLE 2.12 MINITAB Output for Soft Drink Delivery Time Data
Regression Analysis: Time versus Cases

The regression equation is
Time = 3.32 + 2.18 Cases

Predictor    Coef      SE Coef       T        P
Constant     3.321     1.371       2.42    0.024
Cases        2.1762    0.1240     17.55    0.000

S = 4.18140    R-Sq = 93.0%    R-Sq(adj) = 92.7%

Analysis of Variance

Source            DF        SS        MS         F        P
Regression         1    5382.4    5382.4    307.85    0.000
Residual Error    23     402.1      17.5
Total             24    5784.5
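The derived entries in Table 2.12 follow from a few ratios of the tabulated values. The sketch below recomputes them with plain Python, assuming the usual definitions (t = coefficient / standard error, F = MS regression / MS residual, R-Sq = SS regression / SS total). The variable names are illustrative, and because the inputs are the rounded values printed in the table, the results agree with Minitab only up to rounding.

```python
# Values copied from Table 2.12 (rounded as printed there).
coef = {"Constant": 3.321, "Cases": 2.1762}
se_coef = {"Constant": 1.371, "Cases": 0.1240}
ss_regression, ss_residual, ss_total = 5382.4, 402.1, 5784.5
df_regression, df_residual = 1, 23
n = df_regression + df_residual + 1  # 25 observations

# t statistic for each coefficient: estimate divided by its standard error.
t_stats = {name: coef[name] / se_coef[name] for name in coef}

# Mean squares and the F statistic for significance of regression.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Coefficient of determination and its adjusted version.
r_sq = ss_regression / ss_total
r_sq_adj = 1 - ms_residual / (ss_total / (n - 1))

# Residual standard deviation S is the square root of the residual mean square.
s = ms_residual ** 0.5

for name, value in t_stats.items():
    print(f"T({name}) = {value:.2f}")
print(f"F = {f_stat:.2f}, R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}, S = {s:.4f}")
```

In simple linear regression the F statistic equals the square of the slope's t statistic, which is why both tests of the slope in the table report the same P value.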
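If we assume that delivery time and delivery volume are jointly normally distributed, we may test the hypotheses

\[ H_0\colon \rho = 0, \qquad H_1\colon \rho \neq 0 \]

using the test statistic

\[ t_0 = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.9646\sqrt{23}}{\sqrt{1-0.9305}} = 17.55 \]

Since $t_{0.025,23} = 2.069$, we reject $H_0$ and conclude that the correlation coefficient $\rho \neq 0$. Note from the Minitab output in Table 2.12 that this is identical to the t-test statistic for $H_0\colon \beta_1 = 0$. Finally, we may construct an approximate 95% CI on $\rho$ from (2.72). Since $\operatorname{arctanh} r = \operatorname{arctanh} 0.9646 = 2.0082$, Eq. (2.72) becomes

\[ \tanh\left(2.0082 - \frac{1.96}{\sqrt{22}}\right) \le \rho \le \tanh\left(2.0082 + \frac{1.96}{\sqrt{22}}\right) \]

which reduces to

\[ 0.9202 \le \rho \le 0.9845 \]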
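The correlation test and the approximate interval on ρ can be checked numerically. The sketch below uses only the Python standard library and assumes the standard forms: the t statistic r√(n−2)/√(1−r²) and the Fisher arctanh interval with the normal critical value 1.96. The variable names are illustrative. Because it squares the rounded r = 0.9646 directly, the computed t statistic can differ from the tabulated 17.55 in the second decimal place.

```python
import math

r = 0.9646       # sample correlation quoted in the text
n = 25           # number of observations (23 residual degrees of freedom + 2)
t_crit = 2.069   # t_{0.025,23}, as quoted in the text
z_crit = 1.96    # standard normal upper 2.5% point (assumed for a 95% CI)

# Test statistic for H0: rho = 0 against H1: rho != 0.
t0 = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
reject_h0 = abs(t0) > t_crit

# Approximate CI on rho: transform r with arctanh, add and subtract
# z / sqrt(n - 3), then map the endpoints back with tanh.
z = math.atanh(r)
half_width = z_crit / math.sqrt(n - 3)
lower = math.tanh(z - half_width)
upper = math.tanh(z + half_width)

print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
print(f"arctanh(r) = {z:.4f}")
print(f"95% CI on rho: [{lower:.4f}, {upper:.4f}]")
```

The interval is not symmetric about r because the tanh back-transformation compresses values near 1.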
Although we know that delivery time and delivery volume are highly correlated, this information is of little use in predicting, for example, delivery time as a function of the number of cases of product delivered. This would require a regression model. The straight-line fit (shown graphically in Figure 1.1b) relating delivery time to delivery volume is

\[ \hat{y} = 3.321 + 2.1762x \]
Further analysis would be required to determine if this equation is an adequate fit to the data and if it is likely to be a successful predictor.
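As a small illustration of what the regression model provides that the correlation coefficient does not, the sketch below evaluates the fitted line using the coefficients reported in Table 2.12. The function name and the case counts are hypothetical, chosen only for illustration, and the results are point estimates with no statement about their uncertainty.

```python
# Coefficients from Table 2.12: intercept and slope of the fitted line.
INTERCEPT = 3.321
SLOPE = 2.1762


def predict_time(cases: float) -> float:
    """Point estimate of delivery time for a given number of cases."""
    return INTERCEPT + SLOPE * cases


# Hypothetical delivery volumes, chosen only for illustration.
for cases in (5, 10, 20):
    print(f"{cases} cases -> predicted time {predict_time(cases):.2f}")
```

Each additional case adds the slope, 2.1762, to the predicted delivery time. Whether such predictions are trustworthy depends on the model adequacy checks mentioned above.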
PROBLEMS
2.1 Table B.1 gives data concerning the performance of the 26 National Football League teams in 1976. It is suspected that the number of yards gained rushing by opponents ($x_8$) has an effect on the number of games won by a team ($y$).
a. Fit a simple linear regression model relating games won $y$ to yards gained rushing by opponents $x_8$.
b. Construct the analysis-of-variance table and test for significance of regression.
c. Find a 95% CI on the slope.
d. What percent of the total variability in $y$ is explained by this model?
e. Find a 95% CI on the mean number of games won if opponents’ yards rushing is limited to 2000 yards.

2.2 Suppose we would like to use the model developed in Problem 2.1 to predict the number of games a team will win if it can limit opponents’ yards rushing to 1800 yards. Find a point estimate of the number of games won when $x_8 = 1800$. Find a 90% prediction interval on the number of games won.

2.3 Table B.2 presents data collected during a solar energy project at Georgia Tech.
a. Fit a simple linear regression model relating total heat flux $y$ (kilowatts) to the radial deflection of the deflected rays $x_4$ (milliradians).
b. Construct the analysis-of-variance table and test for significance of regression.
c. Find a 99% CI on the slope.
d. Calculate $R^2$.
e. Find a 95% CI on the mean heat flux when the radial deflection is 16.5 milliradians.

2.4 Table B.3 presents data on the gasoline mileage performance of 32 different automobiles.
a. Fit a simple linear regression model relating gasoline mileage $y$ (miles per gallon) to engine displacement $x_1$ (cubic inches).
b. Construct the analysis-of-variance table and test for significance of regression.
c. What percent of the total variability in gasoline mileage is accounted for by the linear relationship with engine displacement?
d. Find a 95% CI on the mean gasoline mileage if the engine displacement is 275 in.$^3$
e. Suppose that we wish to predict the gasoline mileage obtained from a car with a 275-in.$^3$ engine. Give a point estimate of mileage. Find a 95% prediction interval on the mileage.
f. Compare the two intervals obtained in parts d and e. Explain the difference between them. Which one is wider, and why?

2.5 Consider