Handbook of Regression Analysis With Applications in R. Samprit Chatterjee

(average) sale price for all homes of that type in the area, so that they can give a justifiable interval estimate conveying the precision of the estimate of the true expected value of the house; for this purpose, a confidence interval for the fitted value is desired.

      Exact […] intervals for a house with these characteristics can be obtained from statistical software, and turn out to be […] for the prediction interval and […] for the confidence interval. As expected, the prediction interval is much wider than the confidence interval, since it reflects the inherent variability in sale prices in the population of houses. Indeed, it is probably too wide to be of any practical value in this case, but an interval with smaller coverage (one that is expected to include the actual price only […] of the time, say) might be useful. A […] interval in this case would be […], so a seller could be told that there is a […] chance that their house will sell for a value in this range.
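      The contrast between the two intervals can be illustrated with a minimal Python sketch for a single predictor. Everything in it is an assumption made for illustration: the data are made up (they are not the home price data), the function names `simple_ols` and `intervals` are hypothetical, and the critical values are read from a standard $t$ table for 6 degrees of freedom rather than computed.

```python
import math

# Made-up illustrative data (NOT the home price data from the text):
# x = living area in hundreds of square feet, y = sale price in $1000s.
x = [10, 12, 14, 15, 18, 20, 22, 25]
y = [150, 172, 181, 200, 228, 239, 270, 290]


def simple_ols(x, y):
    """Least squares fit of y = b0 + b1 * x for a single predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma_hat = math.sqrt(rss / (n - 2))  # standard error of the estimate
    return b0, b1, sigma_hat, x_bar, sxx


def intervals(x, y, x0, t_crit):
    """Fitted value at x0 with its confidence and prediction intervals."""
    n = len(x)
    b0, b1, sigma_hat, x_bar, sxx = simple_ols(x, y)
    fit = b0 + b1 * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_fit = sigma_hat * math.sqrt(leverage)       # for the expected value
    se_pred = sigma_hat * math.sqrt(1 + leverage)  # for a new observation
    conf = (fit - t_crit * se_fit, fit + t_crit * se_fit)
    pred = (fit - t_crit * se_pred, fit + t_crit * se_pred)
    return fit, conf, pred


# Two-sided critical values for n - 2 = 6 degrees of freedom (t table).
T_95 = 2.447
T_50 = 0.718

fit, conf95, pred95 = intervals(x, y, x0=16, t_crit=T_95)
_, _, pred50 = intervals(x, y, x0=16, t_crit=T_50)

print("fitted value:", round(fit, 1))
print("95% confidence interval:", [round(v, 1) for v in conf95])
print("95% prediction interval:", [round(v, 1) for v in pred95])
print("50% prediction interval:", [round(v, 1) for v in pred50])
```

      The prediction standard error adds the error variance to the variance of the fitted value, which is why the prediction interval is always the wider of the two. Lowering the coverage shrinks the interval through the smaller critical value only.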

[Figure: Scatter plots of the residuals for the home price data. (a) Plot of residuals versus fitted values. (b) Normal plot of the residuals, which form a roughly straight line.]

[Figure: Scatter plots of residuals versus each of the predictors for the home price data; none shows any apparent pattern.]

      In this chapter we have laid out the basic structure of the linear regression model, including the assumptions that justify the use of least squares estimation. The three main goals of regression noted at the beginning of the chapter provide a framework for an organization of the topics covered.

      1 Modeling the relationship between $x$ and $y$: the least squares estimates $\hat{\beta}$ summarize the expected change in $y$ for a given change in an $x$, accounting for all of the variables in the model; the standard error of the estimate $\hat{\sigma}$ estimates the standard deviation of the errors; $R^2$ and the adjusted $R^2$ estimate the proportion of variability in $y$ accounted for by $x$; and the confidence interval for a fitted value provides a measure of the precision in estimating the expected target for a given set of predictor values.

      2 Prediction of the target variable: substituting specified values of $x$ into the fitted regression model gives an estimate of the value of the target for a new observation; the rough prediction interval provides a quick measure of the limits of the ability to predict a new observation; and the exact prediction interval provides a more precise measure of those limits.

      3 Testing of hypotheses: the $F$-test provides a test of the statistical significance of the overall relationship; the $t$-test for each slope coefficient, testing whether the true value is zero, provides a test of whether the variable provides additional predictive power given the other variables; and the $t$-tests can be generalized to test other hypotheses of interest about the coefficients as well.
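      The quantities named in the three items above can be computed by hand for the single-predictor case, as in the following Python sketch. It rests on stated assumptions: the data are made up, the function name `regression_summary` is hypothetical, and the rough prediction interval is taken to be the fitted value plus or minus two standard errors of the estimate.

```python
import math

# Made-up illustrative data (NOT from the text): one predictor, one target.
x = [10, 12, 14, 15, 18, 20, 22, 25]
y = [150, 172, 181, 200, 228, 239, 270, 290]


def regression_summary(x, y, x0):
    """Summary quantities for the least squares fit of y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # 1. Modeling: coefficient estimates, sigma-hat, R^2, adjusted R^2.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    sigma_hat = math.sqrt(rss / (n - 2))
    r2 = 1 - rss / tss
    adj_r2 = 1 - (rss / (n - 2)) / (tss / (n - 1))

    # 2. Prediction: fitted value at x0 and the rough prediction interval.
    fit = b0 + b1 * x0
    rough_pi = (fit - 2 * sigma_hat, fit + 2 * sigma_hat)

    # 3. Testing: t-statistic for the slope and the overall F-statistic.
    se_b1 = sigma_hat / math.sqrt(sxx)
    t_stat = b1 / se_b1
    f_stat = (tss - rss) / (rss / (n - 2))

    return {
        "b0": b0, "b1": b1, "sigma_hat": sigma_hat,
        "r2": r2, "adj_r2": adj_r2,
        "fit": fit, "rough_pi": rough_pi,
        "t_stat": t_stat, "f_stat": f_stat,
    }


summary = regression_summary(x, y, x0=16)
for name, value in summary.items():
    print(name, value)
```

      With a single predictor the overall $F$-statistic equals the square of the slope's $t$-statistic, so the two tests agree; with several predictors they answer different questions, as item 3 describes.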

      Since all of these methods depend on the assumptions holding, a fundamental part of any regression analysis is to check those assumptions. The residual plots discussed in this chapter are a key part of that process, and other diagnostics and tests will be discussed in future chapters that provide additional support for that task.
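      As a rough illustration of what goes into the residual plots, the Python sketch below computes the coordinates for a plot of residuals versus fitted values and for a normal plot. The data are made up, the function names are hypothetical, and the plotting positions $(i - 0.5)/n$ are one common convention among several; statistical software may use a slightly different one.

```python
import math
from statistics import NormalDist

# Made-up illustrative data (NOT from the text).
x = [10, 12, 14, 15, 18, 20, 22, 25]
y = [150, 172, 181, 200, 228, 239, 270, 290]


def fitted_and_residuals(x, y):
    """Fitted values and residuals from the least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid


def normal_plot_points(resid):
    """(theoretical normal quantile, ordered residual) pairs."""
    n = len(resid)
    ordered = sorted(resid)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n for i = 1..n (an assumed convention).
    quantiles = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(quantiles, ordered))


def correlation(u, v):
    """Sample correlation between two equal-length sequences."""
    n = len(u)
    u_bar = sum(u) / n
    v_bar = sum(v) / n
    suv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    suu = sum((a - u_bar) ** 2 for a in u)
    svv = sum((b - v_bar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


fitted, resid = fitted_and_residuals(x, y)
points = normal_plot_points(resid)

# Least squares with an intercept forces the residuals to average zero and
# to be linearly uncorrelated with the fitted values, so the first plot is
# read for curvature or changing spread, not for a linear trend.
print("mean residual:", sum(resid) / len(resid))
print("corr(fitted, residuals):", correlation(fitted, resid))

# A normal plot close to a straight line gives a correlation near 1.
normal_corr = correlation([q for q, _ in points], [r for _, r in points])
print("normal plot correlation:", normal_corr)
```

      The numerical summaries are only a supplement; the plots themselves remain the primary tool, since patterns such as curvature or nonconstant variance are easier to see than to summarize in a single number.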

      KEY TERMS