Applied Regression Modeling. Iain Pardoe
As discussed at the beginning of this section, the 95% prediction interval for an individual value of
Unlike for confidence intervals for the population mean, statistical software does not generally provide an automated method to calculate prediction intervals for an individual
We derived the formula for a confidence interval for a univariate population mean from the t‐version of the central limit theorem, which does not require the data
1.8 Chapter Summary
We spent some time in this chapter coming to grips with summarizing data (graphically and numerically) and understanding sampling distributions, but the four major concepts that will carry us through the rest of the book are as follows:
1 Statistical thinking is the process of analyzing quantitative information about a random sample of $n$ observations and drawing conclusions (statistical inferences) about the population from which the sample was drawn. An example is using a univariate sample mean, $m_Y$, as an estimate of the corresponding population mean and calculating the sample standard deviation, $s_Y$, to evaluate the precision of this estimate.
2 Confidence intervals are one method for calculating the sample estimate of a parameter (such as the population mean) and its associated uncertainty. An example is the confidence interval for a univariate population mean, which takes the form $m_Y \pm \text{t-percentile} \times (s_Y/\sqrt{n})$.
3 Hypothesis testing provides another means of making decisions about the likely values of a population parameter. An example is hypothesis testing for a univariate population mean, whereby the magnitude of a calculated sample test statistic, $\text{t-statistic} = (m_Y - \mathrm{E}(Y))/(s_Y/\sqrt{n})$, where $\mathrm{E}(Y)$ is the value of the population mean in the null hypothesis, indicates which of two hypotheses (about likely values for the population mean) we should favor.
4 Prediction intervals, while similar in spirit to confidence intervals, tackle the different problem of predicting the value of an individual observation picked at random from the population. An example is the prediction interval for an individual univariate $Y$‐value, which takes the form $m_Y \pm \text{t-percentile} \times \left(s_Y\sqrt{1 + 1/n}\right)$.
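The three calculations summarized in items 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration rather than code from the book: the function names and the sample values are hypothetical, and the t-percentile of 2.262 is an assumed input (the 97.5th percentile of a t-distribution with 9 degrees of freedom, as read from a standard t-table, which gives 95% intervals for a sample of size 10). The sketch assumes the usual forms: sample mean ± t-percentile × (sample standard deviation / √n) for the confidence interval, and the same expression with √(1 + 1/n) in place of 1/√n for the prediction interval.

```python
import math
import statistics


def confidence_interval(sample, t_percentile):
    """Interval for the population mean: mean +/- t-percentile * s / sqrt(n)."""
    n = len(sample)
    m_y = statistics.mean(sample)
    s_y = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    half_width = t_percentile * s_y / math.sqrt(n)
    return m_y - half_width, m_y + half_width


def prediction_interval(sample, t_percentile):
    """Interval for one new observation: mean +/- t-percentile * s * sqrt(1 + 1/n)."""
    n = len(sample)
    m_y = statistics.mean(sample)
    s_y = statistics.stdev(sample)
    half_width = t_percentile * s_y * math.sqrt(1 + 1 / n)
    return m_y - half_width, m_y + half_width


def t_statistic(sample, null_mean):
    """Test statistic: (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    m_y = statistics.mean(sample)
    s_y = statistics.stdev(sample)
    return (m_y - null_mean) / (s_y / math.sqrt(n))


# Hypothetical sample of n = 10 observations (illustrative values only).
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

# Assumed table value: 97.5th percentile of the t-distribution with 9 df.
T_975_DF9 = 2.262

ci = confidence_interval(sample, T_975_DF9)
pi = prediction_interval(sample, T_975_DF9)
t_stat = t_statistic(sample, null_mean=12)

print("95% confidence interval for the mean:", ci)
print("95% prediction interval for one value:", pi)
print("t-statistic for a null mean of 12:", t_stat)
```

With any sample, the prediction interval is wider than the confidence interval for the same t-percentile, because it has to account for the variation of an individual observation as well as the uncertainty in the estimated mean.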
Problems
“Computer help” refers to the numbered items in the software information files available from the book website. There are brief answers to the even‐numbered problems in Appendix F (www.wiley.com/go/pardoe/AppliedRegressionModeling3e).
1.1 Assume that weekly orders of a popular mobile phone at a local store follow a normal distribution with mean $\mathrm{E}(Y)$ and standard deviation $\mathrm{SD}(Y)$. Find the scores, $Y$, that correspond to the:
(a) 95th percentile (i.e., find $Y_{95}$ such that $\Pr(Y < Y_{95}) = 0.95$);
(b) 50th percentile (i.e., find $Y_{50}$ such that $\Pr(Y < Y_{50}) = 0.50$);
(c) 2.5th percentile (i.e., find $Y_{2.5}$ such that $\Pr(Y < Y_{2.5}) = 0.025$).
Suppose $M_Y$ represents potential values of repeated sample means from this population for samples of size $n$. Use the normal version of the central limit theorem to find the mean scores, $M_Y$, that correspond to the:
(d) 95th percentile (i.e., find $M_{95}$ such that $\Pr(M_Y < M_{95}) = 0.95$);
(e) 50th percentile (i.e., find $M_{50}$ such that $\Pr(M_Y < M_{50}) = 0.50$);
(f) 2.5th percentile (i.e., find $M_{2.5}$ such that $\Pr(M_Y < M_{2.5}) = 0.025$).
(g) How many phones should the store order to be 95% confident they can meet demand for a particular week?
1.2 Assume that final scores in a statistics course follow a normal distribution with mean $\mathrm{E}(Y)$ and standard deviation $\mathrm{SD}(Y)$. Find the scores, $Y$, that correspond to the:
(a) 90th percentile (i.e., find $Y_{90}$ such that $\Pr(Y < Y_{90}) = 0.90$);
(b) 99th percentile (i.e., find $Y_{99}$ such that $\Pr(Y < Y_{99}) = 0.99$);
(c) 5th percentile (i.e., find $Y_{5}$ such that $\Pr(Y < Y_{5}) = 0.05$).
Suppose $M_Y$ represents potential values of repeated sample means from this population for samples of size $n$ (e.g., average class scores). Use the normal version of the central limit theorem to find the mean scores, $M_Y$, that correspond to the:
(d) 90th percentile (i.e., find $M_{90}$ such that $\Pr(M_Y < M_{90}) = 0.90$);
(e) 99th percentile (i.e., find $M_{99}$ such that $\Pr(M_Y < M_{99}) = 0.99$);
(f) 5th percentile (i.e., find $M_{5}$ such that $\Pr(M_Y < M_{5}) = 0.05$).
(g) If the bottom 5% of the class fail, what is the cut‐off percentage to pass the class?
(h) The university requires the long‐term average class score for this course to be no higher than 75%. Does this requirement seem feasible?
1.3 The NBASALARY data file contains salary information for 214 guards in the National Basketball Association (NBA) for 2009–2010 (obtained from the online USA Today NBA Salaries Database).
(a) Construct a histogram of the salary variable, representing 2009–2010 salaries in thousands of dollars [computer help #14].
(b) What would we expect the histogram to look like if the data were normal?
(c) Construct a QQ‐plot of the salary variable [computer help #22].
(d) What would we expect the QQ‐plot to look like if the data were normal?
(e) Compute the natural logarithm of guard salaries as a new variable [computer help #6], and construct a histogram of this variable [computer help #14]. Hint: The “natural logarithm” transformation (also known as “log to base‐e,” or by the symbols $\log_e$ or ln) is a way to transform (rescale) skewed data to make them more symmetric and normal.
(f) Construct a QQ‐plot of the log‐salary variable [computer help #22].
(g) Based on the plots in parts (a), (c), (e), and (f), say whether salaries or log‐salaries more closely follow a normal curve, and justify your response.
1.4 A company's pension plan includes 50 mutual funds, with each fund expected to earn a mean, $\mathrm{E}(Y)$, of 3% over the risk‐free rate with a standard deviation of %. Based on the assumption that the funds are randomly selected from a population of funds with normally distributed returns in excess of the risk‐free rate, find the probability that an individual fund's return in excess of the risk‐free rate is, respectively, greater than 34.1%, greater than 15.7%, or less than %. In other words, if $Y$ represents potential values of individual fund returns, find: $\Pr(Y > 34.1\%)$; $\Pr(Y > 15.7\%)$; and the corresponding probability for the lower threshold. Use the normal version of the central limit theorem