The Big R-Book. Philippe J. S. De Brouwer

      ## 3 Paula Female    92  26
      ## 4  Lisa Female    89  30
      ## 5 Laura Female    84  35

      # Get the last rows:
      tail(data_test)
      ##    Name Gender Score Age
      ## 1 Piotr   Male    78  42
      ## 2 Pawel   Male    88  38
      ## 3 Paula Female    92  26
      ## 4  Lisa Female    89  30
      ## 5 Laura Female    84  35

      # Extract columns 2 and 4 and keep all rows
      data_test.1 <- data_test[,c(2,4)]
      print(data_test.1)
      ##   Gender Age
      ## 1   Male  42
      ## 2   Male  38
      ## 3 Female  26
      ## 4 Female  30
      ## 5 Female  35

      # Extract columns 2 and 4 by index and keep only rows 2 to 4
      data_test[c(2:4),c(2,4)]
      ##   Gender Age
      ## 2   Male  38
      ## 3 Female  26
      ## 4 Female  30

      Warning – Avoiding conversion to factors

      d <- data.frame(
        Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
        Gender = c("Male", "Male", "Female", "Female", "Female"),
        Score  = c(78, 88, 92, 89, 84),
        Age    = c(42, 38, 26, 30, 35),
        stringsAsFactors = FALSE
        )
      d$Gender <- factor(d$Gender)  # manually factorize gender
      str(d)
      ## 'data.frame': 5 obs. of 4 variables:
      ##  $ Name  : chr "Piotr" "Pawel" "Paula" "Lisa" ...
      ##  $ Gender: Factor w/ 2 levels "Female","Male": 2 2 1 1 1
      ##  $ Score : num 78 88 92 89 84
      ##  $ Age   : num 42 38 26 30 35
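The effect of stringsAsFactors can be checked directly. The following is a minimal sketch that sets the argument explicitly in both directions, so that it does not depend on the default of the installed R version (the default changed to FALSE in R 4.0.0). The names d_fac and d_chr and the value "Peter" are only illustrative.

```r
name <- c("Piotr", "Pawel", "Paula")

# The same data, once with and once without conversion to factors
d_fac <- data.frame(Name = name, stringsAsFactors = TRUE)
d_chr <- data.frame(Name = name, stringsAsFactors = FALSE)

class(d_fac$Name)   # a factor
class(d_chr$Name)   # a character vector

# Assigning a value that is not an existing level yields NA (with a warning),
# while the character column simply takes the new value
suppressWarnings(d_fac$Name[1] <- "Peter")
d_chr$Name[1] <- "Peter"

is.na(d_fac$Name[1])
d_chr$Name[1]
```

This is the reason why a factor column can silently end up with <NA> after an assignment, while a character column cannot.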

      4.3.8.3 Editing Data in a Data Frame

      While one usually reads in large amounts of data and uses an IDE such as RStudio that facilitates the visualization and manual modification of data frames, it is useful to know how this is done when no graphical interface is available. The functions below are part of a standard R installation, so they are also at hand when working on a server.

       de()

       data.entry()

       edit()

      de(x)                # fails if x is not defined
      de(x <- c(NA))       # works
      x <- de(x <- c(NA))  # will also save the changes
      data.entry(x)        # de is short for data.entry
      x <- edit(x)         # use the standard editor (vi in *nix)

      Of course, there are also multiple ways to address data directly in R.

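      # Two ways to address a single cell:
      data_test$Score[1] <- 80  # column Score, row 1 (the same cell as data_test[1,3])
      data_test[3,1] <- 80      # row 3, column 1: this is the Name, not the Score
                                # (Name is a factor here, so the result is <NA>)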
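Since the index order is easy to mix up, here is a short sketch of the equivalent ways to address one cell. It rebuilds the example data under the name d and assumes character columns (stringsAsFactors = FALSE), so that it behaves the same in every R version.

```r
# Rebuild the example data (assumption: character columns, not factors)
d <- data.frame(
  Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 35),
  stringsAsFactors = FALSE
)

# Three equivalent ways to set row 1 of the column Score (column 3):
d$Score[1]    <- 80
d[1, 3]       <- 80
d[1, "Score"] <- 80

# The index order is [row, column], so [3, 1] is a different cell:
d[3, 1]   # the Name in row 3
```

Addressing the column by name, as in d[1, "Score"], is usually the most robust choice, because it keeps working when the column order changes.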

      4.3.8.4 Modifying Data Frames

      Adding Columns to a Data-frame

      Typically, the variables are in the columns and adding a column corresponds to adding a new, observed variable. This is done via the function cbind().

       cbind()
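The listing for this step is not part of this excerpt. As a minimal sketch, the column End_date that appears in the output of the rbind() example could be added as follows; the dates are copied from that output, while the exact code used by the author is an assumption.

```r
data_test <- data.frame(
  Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 35)
)

# The new variable: one value per existing row
End_date <- as.Date(c("2014-03-01", "2017-02-13", "2014-10-10",
                      "2015-05-10", "2010-08-25"))

# cbind() appends it as a new column, named after the variable
data_test <- cbind(data_test, End_date)
str(data_test)
```

The assignment data_test$End_date <- End_date has the same effect for a single column.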

      Adding Rows to a Data-frame

      Adding rows corresponds to adding observations. This is done via the function rbind().

       rbind()

      # To add a row, we need the rbind() function:
      data_test.to.add <- data.frame(
        Name     = c("Ricardo", "Anna"),
        Gender   = c("Male", "Female"),
        Score    = c(66, 80),
        Age      = c(70, 36),
        End_date = as.Date(c("2016-05-05", "2016-07-07"))
        )
      data_test.new <- rbind(data_test, data_test.to.add)
      print(data_test.new)
      ##      Name Gender Score Age   End_date
      ## 1   Piotr   Male    80  42 2014-03-01
      ## 2   Pawel   Male    88  38 2017-02-13
      ## 3    <NA> Female    92  26 2014-10-10
      ## 4    Lisa Female    89  30 2015-05-10
      ## 5   Laura Female    84  35 2010-08-25
      ## 6 Ricardo   Male    66  70 2016-05-05
      ## 7    Anna Female    80  36 2016-07-07

      Merging data frames

      data_test.1 <- data.frame(
        Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
        Gender = c("Male", "Male", "Female", "Female", "Female"),
        Score  = c(78, 88, 92, 89, 84),
        Age    = c(42, 38, 26, 30, 35)
        )
      data_test.2 <- data.frame(
        Name   = c("Piotr", "Pawel", "notPaula", "notLisa", "Laura"),
        Gender = c("Male", "Male", "Female", "Female", "Female"),
        Score  = c(78, 88, 92, 89, 84),
        Age    = c(42, 38, 26, 30, 135)
        )
      data_test.merged <- merge(x = data_test.1, y = data_test.2,
                                by.x = c("Name", "Age"), by.y = c("Name", "Age"))

      # Only records that match in name and age are in the merged table:
      print(data_test.merged)
      ##    Name Age Gender.x Score.x Gender.y Score.y
      ## 1 Pawel  38     Male      88     Male      88
      ## 2 Piotr  42     Male      78     Male      78

       merge()
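By default merge() keeps only the matching records. When the unmatched records should be kept as well, the arguments all, all.x and all.y of merge() can be used. The sketch below reuses the two data frames from the example above; the names keys, inner, left and outer are only illustrative.

```r
data_test.1 <- data.frame(
  Name   = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 35)
)
data_test.2 <- data.frame(
  Name   = c("Piotr", "Pawel", "notPaula", "notLisa", "Laura"),
  Gender = c("Male", "Male", "Female", "Female", "Female"),
  Score  = c(78, 88, 92, 89, 84),
  Age    = c(42, 38, 26, 30, 135)
)
keys <- c("Name", "Age")

# Default: only records that match on all key columns
inner <- merge(data_test.1, data_test.2, by = keys)

# Keep all rows of the first data frame; unmatched rows get NA in the .y columns
left  <- merge(data_test.1, data_test.2, by = keys, all.x = TRUE)

# Keep all rows of both data frames
outer <- merge(data_test.1, data_test.2, by = keys, all = TRUE)

nrow(inner)
nrow(left)
nrow(outer)
```

Comparing the row counts is a quick check on how many records failed to match on either side.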

      Short-cuts

      R will allow the use of short-cuts, provided that they are unique. For example, in the data-frame data_test there is a column Name. There are no other columns whose names start with the letter “N”; hence, this one letter is enough to address this column.

       short-cut

      data_test$N
      ## [1] Piotr Pawel Paula Lisa Laura
      ## Levels: Laura Lisa Paula Pawel Piotr
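A short-cut stops working as soon as it is no longer unique, and it does so silently. The following sketch illustrates this on a reduced copy of the data; the data frame d and the extra column Number are hypothetical and serve only to make the prefix "N" ambiguous.

```r
d <- data.frame(
  Name  = c("Piotr", "Pawel", "Paula", "Lisa", "Laura"),
  Score = c(78, 88, 92, 89, 84),
  stringsAsFactors = FALSE
)

by_prefix <- d$N                      # partial matching: resolves to d$Name
exact     <- d[["N"]]                 # NULL: [[ ]] matches exactly by default
relaxed   <- d[["N", exact = FALSE]]  # partial matching on request

# Hypothetical second column starting with "N": the prefix is now ambiguous
d$Number  <- 1:5
ambiguous <- d$N                      # NULL, without any error or warning
```

Code that relied on d$N therefore breaks without notice once a second matching column is added.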

      Warning – Short-cuts can be dangerous

      Use “short-cuts” sparingly and only when working

