Applied Regression Modeling. Iain Pardoe
return in excess of the risk‐free rate is, respectively, greater than 7.4%, greater than 4.8%, or less than 0.7%. In other words, find the three corresponding probabilities for the potential values of repeated sample means.
1.5 Consider the data on 2009–2010 salaries of 214 NBA guards from Problem 1.3.
(a) Calculate a 95% confidence interval for the population mean salary in thousands of dollars [computer help #23]. Hint: Calculate by hand (using the fact that the sample mean of the salaries is 3980.318, the sample standard deviation is 4525.378, and the 97.5th percentile of the t‐distribution with 213 degrees of freedom is approximately 1.971) and check your answer using statistical software.
(b) Consider the natural logarithms of the salaries. The sample mean of the log salaries is 7.664386. Re‐express this number in thousands of dollars (the original units of salary). Hint: To back‐transform a number in natural logarithms to its original scale, use the “exponentiation” function on a calculator [denoted exp(X) or $e^X$, where X is the variable expressed in natural logarithms]. This is because $\exp(\log_e(Y)) = Y$.
(c) Compute a 95% confidence interval for the population mean of the log salaries, in natural logarithms of thousands of dollars [computer help #23]. Hint: Calculate by hand (using the fact that the sample mean of the log salaries is 7.664386, the sample standard deviation of the log salaries is 1.197118, and the 97.5th percentile of the t‐distribution with 213 degrees of freedom is approximately 1.971) and check your answer using statistical software.
(d) Re‐express each interval endpoint of your 95% confidence interval computed in part (c) in thousands of dollars and say what this interval means in words.
(e) The confidence interval computed in part (a) is exactly symmetric about the sample mean of the salaries. Is the confidence interval computed in part (d) exactly symmetric about the sample mean of the log salaries back‐transformed to thousands of dollars that you computed in part (b)? How does this relate to quantifying our uncertainty about the population mean salary? Hint: Looking at the histogram from Problem 1.3 part (a), if someone asked you to give lower and upper bounds on the population mean salary using your intuition rather than statistics, would you give a symmetric or an asymmetric interval?
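The hand calculations suggested in the hints of the problem above can be sketched in a few lines of Python. This is only an illustrative sketch using the summary statistics quoted in the hints; the function name `t_interval` and the variable names are hypothetical and do not come from the book or its computer help.

```python
import math

def t_interval(mean, sd, n, t_crit):
    """Two-sided interval for a population mean: mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Summary statistics quoted in the hints (214 guards, t percentile for 213 df).
n, t_crit = 214, 1.971
salary_ci = t_interval(3980.318, 4525.378, n, t_crit)  # thousands of dollars
log_ci = t_interval(7.664386, 1.197118, n, t_crit)     # natural-log scale

# Back-transform the log-scale mean and interval endpoints with exp().
log_mean_back = math.exp(7.664386)
back_ci = tuple(math.exp(endpoint) for endpoint in log_ci)

# Distance from the back-transformed mean to each back-transformed endpoint.
lower_gap = log_mean_back - back_ci[0]
upper_gap = back_ci[1] - log_mean_back

print("CI on the salary scale:", salary_ci)
print("CI on the log scale:", log_ci)
print("Back-transformed mean and CI:", log_mean_back, back_ci)
print("Gaps below and above the back-transformed mean:", lower_gap, upper_gap)
```

Comparing `lower_gap` with `upper_gap` gives a direct numerical check on the symmetry question; the interpretation is still left to the reader.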
1.6 The FINALSCORES data file contains final scores in a statistics course.
(a) Calculate the sample mean and sample standard deviation of the final scores [computer help #10].
(b) Calculate a 90% confidence interval for the population mean of the final scores [computer help #23]. Hint: Calculate by hand (using the sample mean and sample standard deviation from part (a), and the 95th percentile of the t‐distribution with 99 degrees of freedom, which is approximately 1.660) and check your answer using statistical software.
1.7 Gapminder is a “non‐profit venture promoting sustainable global development and achievement of the United Nations Millennium Development Goals.” It provides related time series data for all countries in the world at the website www.gapminder.org. For example, the COUNTRIES data file contains the 2010 population count (in millions) of the 55 most populous countries together with 2010 life expectancy at birth (in years).
(a) Calculate the sample mean and sample standard deviation of the population counts [computer help #10].
(b) Briefly say why calculating a confidence interval for the population mean would not be useful for understanding mean population counts for all countries in the world.
(c) Consider life expectancy at birth, which represents the average number of years a newborn child would live if current mortality patterns were to stay the same. Suppose that for this variable, these 55 countries could be considered a random sample from the population of all countries in the world. Calculate a 95% confidence interval for the population mean life expectancy [computer help #23]. Hint: Calculate by hand (using the fact that the sample mean of life expectancy is 69.787, the sample standard deviation is 9.2504, and the 97.5th percentile of the t‐distribution with 54 degrees of freedom is approximately 2.005) and check your answer using statistical software.
1.8 Consider the FINALSCORES data file from Problem 1.6.
(a) Do a hypothesis test to determine whether there is sufficient evidence at a significance level of 5% to conclude that the population mean of the final scores is greater than 66 [computer help #24].
(b) Repeat part (a) but test whether the population mean of the final scores is less than 73.
(c) Repeat part (a) but test whether the population mean of the final scores is not equal to 66.
1.9 Consider the COUNTRIES data file from Problem 1.7. A journalist speculates that the population mean life expectancy is greater than 68 years. Based on the sample of 55 countries, a smart statistics student thinks that there is insufficient evidence to conclude this. Do a hypothesis test to show who is correct based on a significance level of 5% [computer help #24]. Hint: Make sure that you lay out all the steps involved (as in Section 1.6.1) and include a short sentence summarizing your conclusion; that is, who do you think is correct, the journalist or the student?
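As a rough illustration of the mechanics of an upper‐tail test for the problem above, the sketch below computes the t‐statistic from the life expectancy summary statistics quoted in the hints of the other COUNTRIES problems (sample mean 69.787, sample standard deviation 9.2504, 55 countries). The critical value 1.674 is an assumption: it is an approximate 95th percentile of the t‐distribution with 54 degrees of freedom and is not given in the text, so check it with statistical software. The function and variable names are hypothetical.

```python
import math

def t_statistic(sample_mean, sample_sd, n, null_value):
    """t-statistic for a test about a population mean."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - null_value) / standard_error

# Summary statistics quoted in the hints of the other COUNTRIES problems.
sample_mean, sample_sd, n = 69.787, 9.2504, 55
null_value = 68.0  # the journalist claims the population mean exceeds 68 years

# ASSUMPTION: approximate 95th percentile of t with 54 df (not stated in the text).
t_crit_95 = 1.674

t_stat = t_statistic(sample_mean, sample_sd, n, null_value)
reject_null = t_stat > t_crit_95  # upper-tail test at the 5% significance level

print("t-statistic:", round(t_stat, 3))
print("Reject the null hypothesis at 5%?", reject_null)
```

The code only covers the arithmetic; the written steps and the one‐sentence conclusion requested in the hint still need to be laid out separately.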
1.10 Consider the housing market represented by the sale prices in the HOMES1 data file.
(a) As suggested in Section 1.3, calculate the probability of finding an affordable home (less than ) in this housing market. Assume that the population of sale prices is normal, with mean and standard deviation .
(b) As suggested in Section 1.5, calculate a 90% confidence interval for the population mean in this housing market. Recall that the sample mean , the sample standard deviation , and the sample size . Check your answer using statistical software [computer help #23].
(c) Practice the mechanics of hypothesis tests by conducting the following tests using a significance level of 5%. : versus : ; : versus : ; : versus : ; : versus : .
(d) As suggested in Section 1.7, calculate a 90% prediction interval for an individual sale price in this market.
1.11 Consider the COUNTRIES data file from Problem 1.7. Calculate a 95% prediction interval for life expectancy. Discuss why this interval is so much wider than the confidence interval calculated in Problem 1.7 part (c). Hint: Calculate by hand (using the fact that the sample mean of life expectancy is 69.787, the sample standard deviation is 9.2504, and the 97.5th percentile of the t‐distribution with 54 degrees of freedom is approximately 2.005) and check your answer using statistical software (if possible; see the discussion of the “ones trick” in Section 1.7).
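To see how the two intervals in the problem above differ, here is a minimal sketch that computes both from the quoted summary statistics. It assumes the usual one‐sample formulas: the confidence interval uses the multiplier sd·sqrt(1/n), while the prediction interval for an individual value uses sd·sqrt(1 + 1/n). The function names are hypothetical.

```python
import math

def confidence_interval(mean, sd, n, t_crit):
    """Interval for the population mean: mean +/- t_crit * sd * sqrt(1/n)."""
    margin = t_crit * sd * math.sqrt(1 / n)
    return mean - margin, mean + margin

def prediction_interval(mean, sd, n, t_crit):
    """Interval for one individual value: mean +/- t_crit * sd * sqrt(1 + 1/n)."""
    margin = t_crit * sd * math.sqrt(1 + 1 / n)
    return mean - margin, mean + margin

# Summary statistics quoted in the hint (55 countries, t percentile for 54 df).
mean, sd, n, t_crit = 69.787, 9.2504, 55, 2.005

ci = confidence_interval(mean, sd, n, t_crit)
pi = prediction_interval(mean, sd, n, t_crit)
width_ratio = (pi[1] - pi[0]) / (ci[1] - ci[0])

print("95% confidence interval:", ci)
print("95% prediction interval:", pi)
print("Prediction width / confidence width:", width_ratio)
```

With these formulas the width ratio equals sqrt(n + 1), which is one way to start the discussion of why the prediction interval is so much wider.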
1.12 This problem is adapted from one in Frees (1995). The HOSP data file contains data on charges for patients at a Wisconsin hospital in 1989, as analyzed by Frees (1994). Managers wish to estimate health care costs and to measure how reliable their estimates are. Suppose that a risk manager for a large corporation is trying to understand the cost of one aspect of health care, hospital costs for a small, homogeneous group of claims: the charges (in thousands of dollars) for female patients aged 30–49 who were admitted to the hospital for circulatory disorders.
(a) Calculate a 95% confidence interval for the population mean claim. Use the following in your calculation: the sample mean is 2.9554, the sample standard deviation is 1.48104, and the 97.5th percentile of the t‐distribution with 32 degrees of freedom is 2.037. Check your answer using statistical software [computer help #23].
(b) Also calculate a 95% prediction interval for an individual claim. Does this interval seem reasonable given the range of values in the data?
(c) Transform the data by taking the reciprocal of the claim values (i.e., one divided by each claim value). Calculate a 95% confidence interval for the population mean of the reciprocal‐transformed claims. Use the following sample statistics: the sample mean of the reciprocals is 0.3956 and the sample standard deviation of the reciprocals is 0.12764. Check your answer using statistical software [computer help #23].
(d) Back‐transform the endpoints of the interval you just calculated into the original units of the claims (thousands of dollars).
(e) Do the same for a 95% prediction interval; that is, calculate the reciprocal‐transformed interval and back‐transform to the original units. Does this interval seem reasonable given the range of values in the data? If so, why did transforming the data help here?
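The reciprocal‐scale calculations in the problem above can be sketched as follows. Two points are assumptions rather than statements from the text: the sample size n = 33 is inferred from the 32 degrees of freedom, and the helper name `back_transform` is hypothetical. Because the reciprocal is a decreasing function for positive values, the lower and upper endpoints swap roles after back‐transformation.

```python
import math

# ASSUMPTION: n = 33, inferred from the 32 degrees of freedom quoted in the text.
n = 33
t_crit = 2.037                           # 97.5th percentile of t with 32 df (from the text)
mean_recip, sd_recip = 0.3956, 0.12764   # statistics of the reciprocal-transformed claims

# Margins on the reciprocal scale for the mean (CI) and an individual claim (PI).
ci_margin = t_crit * sd_recip * math.sqrt(1 / n)
pi_margin = t_crit * sd_recip * math.sqrt(1 + 1 / n)

def back_transform(lower, upper):
    """Map a positive reciprocal-scale interval back to the original units.

    Taking reciprocals reverses order, so the endpoints swap places.
    """
    return 1 / upper, 1 / lower

ci_back = back_transform(mean_recip - ci_margin, mean_recip + ci_margin)
pi_back = back_transform(mean_recip - pi_margin, mean_recip + pi_margin)

print("Back-transformed 95% confidence interval:", ci_back)
print("Back-transformed 95% prediction interval:", pi_back)
```

Whether the resulting prediction interval looks reasonable still has to be judged against the range of claims in the HOSP data file, which this sketch does not read.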
1.13 The following questions allow you to practice important concepts from Chapter 1 without having to use a computer.
(a) In the construction of confidence intervals, will an increase in the sample size lead to a wider or narrower interval (if all other quantities are unchanged)?
(b) Suppose that a 95% confidence interval for the population mean turns out to be . Give a definition of what it means to be “95% confident” here.
(c) A government department is supposed to respond to requests for information within 5 days of receiving the request. Studies show a mean time to respond of 5.28 days and a standard deviation of 0.40 day for a sample of requests. Construct a 90% confidence interval for the mean time to respond. Then do an appropriate hypothesis test at significance level 5% to determine if