The Big R-Book. Philippe J. S. De Brouwer
Table 8.2: Summary information based on the dataset mtcars.
brand | avgDSP | avgCYL | minMPG | medMPG | avgMPG | maxMPG |
Fiat | 78.9 | 4.0 | 27.3 | 29.85 | 29.85 | 32.4 |
Horn | 309.0 | 7.0 | 18.7 | 20.05 | 20.05 | 21.4 |
Mazd | 160.0 | 6.0 | 21.0 | 21.00 | 21.00 | 21.0 |
Merc | 207.2 | 6.3 | 15.2 | 17.80 | 19.01 | 24.4 |
Toyo | 95.6 | 4.0 | 21.5 | 27.70 | 27.70 | 33.9 |
##   <chr> <int>
## 1 Fiat      2
## 2 Horn      2
## 3 Mazd      2
## 4 Merc      7
## 5 Toyo      2

grouped_cars <- t %>%                      # start with cars
  filter(brand %in% top_brands$brand) %>%  # only top-brands
  group_by(brand) %>%
  summarise(
    avgDSP = round(mean(disp), 1),
    avgCYL = round(mean(cyl), 1),
    minMPG = min(mpg),
    medMPG = median(mpg),
    avgMPG = round(mean(mpg), 2),
    maxMPG = max(mpg)
  )
print(grouped_cars)
## # A tibble: 5 x 7
##   brand avgDSP avgCYL minMPG medMPG avgMPG maxMPG
##   <chr>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
## 1 Fiat    78.8    4     27.3   29.8   29.8   32.4
## 2 Horn   309      7     18.7   20.0   20.0   21.4
## 3 Mazd   160      6     21     21     21     21
## 4 Merc   207.     6.3   15.2   17.8   19.0   24.4
## 5 Toyo    95.6    4     21.5   27.7   27.7   33.9
The sections on knitr and rmarkdown (Chapter 33 on page 703 and Chapter 32 on page 699, respectively) will explain how to convert this output via the function kable() into Table 8.2.
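As a rough preview of that conversion, the following self-contained sketch rebuilds the summary and hands it to kable(). The objects t and top_brands are defined earlier in the book and are not visible in this excerpt, so the sketch makes two assumptions: that brand is the first four characters of the model name, and that the "top brands" are those with at least two cars. The name cars is only illustrative.

```r
library(dplyr)
library(knitr)

# Assumption: brand = first four characters of the model name (row name).
cars <- mtcars %>%
  mutate(brand = substr(rownames(mtcars), 1, 4))

# Assumption: a "top brand" has at least two cars in the dataset.
top_brands <- cars %>%
  count(brand) %>%
  filter(n >= 2)

grouped_cars <- cars %>%
  filter(brand %in% top_brands$brand) %>%  # only top-brands
  group_by(brand) %>%
  summarise(
    avgDSP = round(mean(disp), 1),
    avgCYL = round(mean(cyl), 1),
    minMPG = min(mpg),
    medMPG = median(mpg),
    avgMPG = round(mean(mpg), 2),
    maxMPG = max(mpg)
  )

# kable() turns the tibble into a formatted table for a report.
kable(grouped_cars)
```

Inside an R Markdown document the call to kable() renders as a table in the output; at the console it prints a plain-text version of the same table.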
There are a few things about group_by() and summarise() that should be noted in order to make working with them easier. For example, summarise() works in the opposite direction to group_by() and hence peels back one layer of existing grouping with each call; it is possible to use expressions in group_by(); and new groups replace existing ones by default. These aspects are illustrated in the following code.
# Each call to summarise() removes a layer of grouping:
by_vs_am <- mtcars %>% group_by(vs, am)
by_vs <- by_vs_am %>% summarise(n = n())
by_vs
## # A tibble: 4 x 3
## # Groups:   vs [2]
##      vs    am     n
##   <dbl> <dbl> <int>
## 1     0     0    12
## 2     0     1     6
## 3     1     0     7
## 4     1     1     7
by_vs %>% summarise(n = sum(n))
## # A tibble: 2 x 2
##      vs     n
##   <dbl> <int>
## 1     0    18
## 2     1    14

# To remove grouping, use ungroup():
by_vs %>% ungroup() %>% summarise(n = sum(n))
## # A tibble: 1 x 1
##       n
##   <int>
## 1    32

# You can group by expressions: this is just short-hand for
# a mutate/rename followed by a simple group_by:
mtcars %>% group_by(vsam = vs + am)
## # A tibble: 32 x 12
## # Groups:   vsam [3]
##      mpg   cyl  disp    hp  drat    wt  qsec    vs    am
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1  21       6  160    110  3.9   2.62  16.5     0     1
##  2  21       6  160    110  3.9   2.88  17.0     0     1
##  3  22.8     4  108     93  3.85  2.32  18.6     1     1
##  4  21.4     6  258    110  3.08  3.22  19.4     1     0
##  5  18.7     8  360    175  3.15  3.44  17.0     0     0
##  6  18.1     6  225    105  2.76  3.46  20.2     1     0
##  7  14.3     8  360    245  3.21  3.57  15.8     0     0
##  8  24.4     4  147.    62  3.69  3.19  20       1     0
##  9  22.8     4  141.    95  3.92  3.15  22.9     1     0
## 10  19.2     6  168.   123  3.92  3.44  18.3     1     0
## # … with 22 more rows, and 3 more variables:
## #   gear <dbl>, carb <dbl>, vsam <dbl>

# By default, group_by overrides existing grouping:
mtcars %>%
  group_by(cyl) %>%
  group_by(vs, am) %>%
  group_vars()
## [1] "vs" "am"

# Use add = TRUE to append grouping levels
# (from dplyr 1.0.0 on, this argument is named .add and add is deprecated):
mtcars %>%
  group_by(cyl) %>%
  group_by(vs, am, add = TRUE) %>%
  group_vars()
## [1] "cyl" "vs"  "am"
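The peeling behaviour can also be checked directly by asking for the grouping variables after each step. The sketch below assumes dplyr 1.0.0 or later, where summarise() has a .groups argument that makes the choice explicit; the object name flat is only illustrative.

```r
library(dplyr)

# Two grouping layers: vs, then am.
by_vs_am <- mtcars %>% group_by(vs, am)
group_vars(by_vs_am)

# Assumption: dplyr >= 1.0.0, where summarise() accepts .groups.
# "drop_last" is the behaviour described in the text:
# only the innermost layer (am) is removed.
by_vs <- by_vs_am %>% summarise(n = n(), .groups = "drop_last")
group_vars(by_vs)

# "drop" removes all grouping at once, like a following ungroup().
flat <- by_vs_am %>% summarise(n = n(), .groups = "drop")
group_vars(flat)
```

Stating .groups explicitly also silences the informational message that recent versions of dplyr print when summarise() is called on data with more than one grouping variable.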
Notes
1 More information about the concept “dispatcher function” is in Chapter 6 “The Implementation of OO” on page 87.
2 Chapter 32 “R Markdown” on page 699 and Chapter 33 “knitr and LaTeX” on page 703 explain how these results from R can be used directly in reports without the need to copy and paste.