Probability with R. Jane M. Horgan
quantitative statistics; on the contrary, calculations of quantitative statistics should come after the exploratory data analysis using graphical displays.
Exercises 3.1
1 Use the data in “results.txt” to develop boxplots of all the subjects on the same graph.
2 Obtain a stem-and-leaf display of each subject in “results.txt.” Are there patterns emerging?
3 For the class of 50 students of computing detailed in Exercise 1.1, use R to form the stem-and-leaf display for each gender, and discuss the advantages of this representation compared to the traditional histogram; construct a boxplot for each gender and discuss the findings.
4 Plot the marks in Architecture 1 against those in Architecture 2 and obtain the line of best fit. In your opinion, is it a suitable model for predicting the results obtained in Architecture 2 from those obtained in Architecture 1?
5 The following table gives the number of hours spent studying for the probability examination and the result obtained (%) by each of 10 students.

Study hours        5   4   8   7  10   6  10   4   0   0
Exam results (%)  73  64  80  70  85  50  86  50  20  25

Plot the data and decide if there is a linear trend. If there is, use R to obtain the line of best fit.
6 The percentage of households with access to the Internet in Ireland in each of the years 2010–2017 is given in the following table:

Year                 2010  2011  2012  2013  2014  2015  2016  2017
Internet access (%)    72    78    81    82    82    85    87    89

This set of data is to be used as a training set to estimate Internet access in the future. Plot the data and decide if there is a linear trend. If there is, obtain the line of best fit. Can you predict what the Internet access will be in 2019?
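As a possible starting point for Exercises 5 and 6, the tabulated values can be typed in as vectors, plotted, and passed to lm. The sketch below uses the Exercise 6 data. The variable names (year, access, fit, pred_2019) and the plot labels are our own choices, not names from the text, and whether a straight line is a sensible model is still to be judged from the plot.

```r
# Exercise 6 data, typed in from the table (variable names are illustrative)
year   <- 2010:2017
access <- c(72, 78, 81, 82, 82, 85, 87, 89)

# Scatter plot to inspect the trend before fitting anything
plot(year, access,
     xlab = "Year", ylab = "Internet access (%)",
     main = "Household Internet access", font.main = 1)

# Least-squares line of best fit, added to the plot
fit <- lm(access ~ year)
abline(fit)
coef(fit)   # intercept and slope

# Extrapolate beyond the training years; treat the result with caution
pred_2019 <- predict(fit, newdata = data.frame(year = 2019))
pred_2019
```

The same pattern applies to Exercise 5, with the study hours as the explanatory variable and the exam results as the response.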
3.8 Projects
1 In Appendix B, we derive the values of the intercept and the slope that give the line of best fit. Write a program in R to calculate the intercept and the slope, and use it to obtain the line that best fits the data in Exercise 5 above. Check your results using the lm(y ~ x) function given in R.
2 When plotting in Fig. 3.10, we used font.main = 1 to ensure the main titles are in plain font. Alternative fonts available are 2 = bold, 3 = italic, 4 = bold italic, and 5 = symbol. Fonts may also be changed on the x- and y-axis labels, with font.lab. Explore the effect of changing the fonts in Fig. 3.7.
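For Project 1, one possible outline is sketched below. It assumes that the Appendix B result is the usual least-squares solution, slope = Sxy/Sxx and intercept = mean(y) - slope * mean(x). The function name best_fit and the variable names are our own labels rather than notation from the text.

```r
# Hypothetical helper: least-squares intercept and slope computed directly
# (assumes the standard formulas; not necessarily the book's notation)
best_fit <- function(x, y) {
  sxy <- sum((x - mean(x)) * (y - mean(y)))  # sum of cross-products
  sxx <- sum((x - mean(x))^2)                # sum of squares of x
  slope <- sxy / sxx
  intercept <- mean(y) - slope * mean(x)
  c(intercept = intercept, slope = slope)
}

# Exercise 5 data, typed in from the table
hours  <- c(5, 4, 8, 7, 10, 6, 10, 4, 0, 0)
result <- c(73, 64, 80, 70, 85, 50, 86, 50, 20, 25)

by_hand <- best_fit(hours, result)
by_lm   <- coef(lm(result ~ hours))

by_hand
by_lm

# The two sets of coefficients should agree up to rounding error
all.equal(unname(by_hand), unname(by_lm))
```

Comparing with all.equal rather than == avoids spurious mismatches caused by floating-point rounding.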
References
1 Anscombe, F.J. (1973), Graphs in statistical analysis, American Statistician, 27, 17–21.
2 Girolami, M. (2015), A First Course in Machine Learning, CRC Press.