Handbook of Regression Analysis With Applications in R. Samprit Chatterjee

[...] ($VIF$) values are given for each predictor. It is apparent that there is virtually no collinearity among these predictors (recall that 1 is the minimum possible value of the $VIF$), which should make model selection more straightforward. The following output summarizes a best subsets fitting:

      Candidate predictors (X columns, in order): Bedrooms, Bathrooms, Living.area, Lot.size, Year.built, Property.tax

      Vars   R-Sq   R-Sq(adj)   Mallows Cp     AICc       S   Included
         1   35.3        34.6         21.2   1849.9   52576   X
         1   29.4        28.6         30.6   1857.3   54932   X
         1   10.6         9.5         60.3   1877.4   61828   X
         2   46.6        45.2          5.5   1835.7   48091   X X
         2   38.9        37.5         17.5   1847.0   51397   X X
         2   37.8        36.3         19.3   1848.6   51870   X X
         3   49.4        47.5          3.0   1833.1   47092   X X X
         3   48.2        46.3          4.9   1835.0   47635   X X X
         3   46.6        44.7          7.3   1837.5   48346   X X X
         4   50.4        48.0          3.3   1833.3   46885   X X X X
         4   49.5        47.0          4.7   1834.8   47304   X X X X
         4   49.4        46.9          5.0   1835.1   47380   X X X X
         5   50.6        47.5          5.0   1835.0   47094   X X X X X
         5   50.5        47.3          5.3   1835.2   47162   X X X X X
         5   49.6        46.4          6.7   1836.8   47599   X X X X X
         6   50.6        46.9          7.0   1836.9   47381   X X X X X X

      1. Increase the number of predictors until the $R^2$ value levels off. Clearly, the highest $R^2$ for a given $p$ cannot be smaller than that for a smaller value of $p$. If $R^2$ levels off, that implies that additional variables are not providing much additional fit. In this case, the largest $R^2$ values go from roughly $35\%$ to $47\%$ from $p = 1$ to $p = 2$, which is clearly a large gain in fit, but beyond that more complex models do not provide much additional fit (particularly past $p = 3$). Thus, this guideline suggests choosing either $p = 2$ or $p = 3$.

      2. Choose the model that maximizes the adjusted $R^2$. Recall from equation (1.7) that the adjusted $R^2$ equals $R_a^2 = R^2 - \frac{p}{n-p-1}\,(1 - R^2)$. It is apparent that $R_a^2$ explicitly trades off strength of fit ($R^2$) versus simplicity [the multiplier $p/(n-p-1)$], and can decrease if predictors that do not add any predictive power are added to a model. Thus, it is reasonable not to complicate a model beyond the point where its adjusted $R^2$ stops increasing. For these data, $R_a^2$ is maximized at $p = 4$.
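      As a rough numerical check of these two guidelines, the following Python sketch recomputes the gain in $R^2$ and the adjusted $R^2$ for the best model of each size, using the rounded $R^2$ values from the output above. The sample size `N_OBS = 85` is an assumption: it is not stated in this passage, and was chosen because it reproduces the tabulated adjusted $R^2$ values to within rounding. All names in the code are illustrative.

```python
# Best R-squared (as a proportion) for each number of predictors p,
# copied from the best subsets output.
BEST_R2 = {1: 0.353, 2: 0.466, 3: 0.494, 4: 0.504, 5: 0.506, 6: 0.506}

# ASSUMPTION: the sample size is not given in this passage; 85 is a guess
# that matches the tabulated adjusted R-squared values to within rounding.
N_OBS = 85


def adjusted_r2(r2: float, p: int, n: int) -> float:
    """Adjusted R^2 = R^2 - [p / (n - p - 1)] * (1 - R^2)."""
    return r2 - (p / (n - p - 1)) * (1.0 - r2)


adj = {p: adjusted_r2(r2, p, N_OBS) for p, r2 in BEST_R2.items()}
best_p = max(adj, key=adj.get)

for p in sorted(BEST_R2):
    # Gain in R^2 over the best model with one fewer predictor
    # (for p = 1 the comparison is with R^2 = 0).
    gain = BEST_R2[p] - BEST_R2.get(p - 1, 0.0)
    print(f"p={p}  R2={100 * BEST_R2[p]:.1f}  "
          f"gain={100 * gain:.1f}  adj R2={100 * adj[p]:.1f}")

print("adjusted R2 is largest at p =", best_p)
```

      Because the inputs are rounded to one decimal place, the recomputed adjusted $R^2$ values can differ from the tabulated ones in the last digit, but the ordering of the models is unaffected.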

      The fourth column in the output refers to a criterion called Mallows' $C_p$ (Mallows, 1973). This criterion equals

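$$C_p = \frac{\text{Residual SS}_p}{\hat\sigma^{*2}} - n + 2p + 2,$$

      where $\text{Residual SS}_p$ is the residual sum of squares for the model being examined, $p$ is the number of predictors in that model, and $\hat\sigma^{*2}$ is the residual mean square based on all of the candidate predictors.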

      3. Choose the model that minimizes $C_p$. In case of tied values, the simplest model (smallest $p$) would be chosen. In these data, this rule implies choosing $p = 3$.
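      The $C_p$ column can be approximately recovered from the $S$ column of the output. The sketch below takes Mallows' criterion in its usual form, $C_p = \text{Residual SS}_p / \hat\sigma^{*2} - n + 2p + 2$, with $\text{Residual SS}_p = S_p^2 (n - p - 1)$ and $\hat\sigma^{*2}$ equal to the squared $S$ of the six-predictor model. The sample size `N_OBS = 85` is an assumption (it is not stated in this passage), and the function and variable names are illustrative only.

```python
# Residual standard error S of the best model of each size p, copied from
# the best subsets output; p = 6 is the model with all candidate predictors.
BEST_S = {1: 52576, 2: 48091, 3: 47092, 4: 46885, 5: 47094, 6: 47381}

# ASSUMPTION: the sample size is not stated in this passage; 85 is a guess
# that reproduces the tabulated Cp values.
N_OBS = 85
P_MAX = max(BEST_S)


def mallows_cp(s_p: float, p: int, s_full: float, n: int) -> float:
    """Cp = RSS_p / s_full^2 - n + 2p + 2, where RSS_p = s_p^2 * (n - p - 1)."""
    rss_p = s_p ** 2 * (n - p - 1)
    return rss_p / s_full ** 2 - n + 2 * p + 2


cp = {p: mallows_cp(s, p, BEST_S[P_MAX], N_OBS) for p, s in BEST_S.items()}

# min() returns the first minimizer it meets, so scanning p in increasing
# order breaks ties in favor of the simplest model.
best_p = min(sorted(cp), key=cp.get)

for p in sorted(cp):
    print(f"p={p}  Cp={cp[p]:.2f}")
print("Cp is smallest at p =", best_p)
```

      Under the stated assumption, the recomputed values agree with the tabulated $C_p$ column to about the precision shown there, and for the full model the formula reduces to $p + 1 = 7$.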

      An additional operational rule for the use of $C_p$ has been suggested. When a particular model contains all of the necessary predictors, the residual mean square for the model should be roughly equal to $\sigma^2$. Since the model that includes all of the predictors should also include all of the necessary ones, $\hat\sigma^{*2}$ should also be roughly equal to $\sigma^2$. This implies that if a model includes all of the necessary predictors, then

$$C_p \approx \frac{(n - p - 1)\,\sigma^2}{\sigma^2} - n + 2p + 2 = p + 1.$$

      This suggests the following model selection rule:

      4. Choose the simplest model such that $C_p \approx p + 1$ or smaller. In these data, this rule implies choosing $p = 3$.
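      A minimal Python sketch of this rule, applied to the smallest $C_p$ value of each model size in the output, is given below. Reading "$C_p \approx p + 1$ or smaller" as the inequality $C_p \le p + 1$ is an interpretive assumption; the optional `slack` argument (a hypothetical parameter, not part of the original rule) loosens it if a model slightly above $p + 1$ should still qualify.

```python
# Smallest Mallows' Cp for each number of predictors p, from the output.
BEST_CP = {1: 21.2, 2: 5.5, 3: 3.0, 4: 3.3, 5: 5.0, 6: 7.0}


def simplest_adequate_size(best_cp, slack=0.0):
    """Return the smallest p with Cp <= p + 1 + slack, or None if none qualifies."""
    for p in sorted(best_cp):
        if best_cp[p] <= p + 1 + slack:
            return p
    return None


chosen_p = simplest_adequate_size(BEST_CP)
print("simplest model with Cp <= p + 1 has p =", chosen_p)
```

      With $p = 2$ the best model has $C_p = 5.5$, well above $p + 1 = 3$, whereas with $p = 3$ it has $C_p = 3.0$, below $p + 1 = 4$, so the scan stops there.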

      A weakness of the $C_p$ criterion is that its value depends on the largest set of candidate predictors (through $\hat\sigma^{*2}$), which means that adding predictors that provide no predictive power to the set of candidate models can change the choice of best model. A general approach that avoids this is through the use of statistical information. A detailed discussion of the determination of information measures is beyond the scope of this book, but Burnham and Anderson (2002) provide extensive discussion of the topic. The Akaike Information Criterion $AIC$, introduced by Akaike (1973),

