The Big R-Book. Philippe J. S. De Brouwer

hard to predict and it is even harder to spot the programming error in a part of your code that previously worked fine.

Naming Rows and Columns

In the preceding code, we named the columns when we created the data-frame. It is also possible to do that later or to change column names, and it is even possible to name each row individually.

# Get the rownames.
colnames(data_test)
## [1] "Name"     "Gender"   "Score"    "Age"      "End_date"
rownames(data_test)
## [1] "1" "2" "3" "4" "5"
colnames(data_test)[2]
## [1] "Gender"
rownames(data_test)[3]
## [1] "3"

# assign new names
colnames(data_test)[1] <- "first_name"
rownames(data_test) <- LETTERS[1:nrow(data_test)]
print(data_test)
##   first_name Gender Score Age   End_date
## A      Piotr   Male    80  42 2014-03-01
## B      Pawel   Male    88  38 2017-02-13
## C       <NA> Female    92  26 2014-10-10
## D       Lisa Female    89  30 2015-05-10
## E      Laura Female    84  35 2010-08-25
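The same renaming steps can be tried on a tiny, self-contained data frame. This is only a sketch: the object df_demo and its contents are invented for illustration and are not the book's data_test.

```r
# Hypothetical toy data, invented for this illustration.
df_demo <- data.frame(
  Name  = c("Ann", "Bob", "Cyd"),
  Score = c(80, 88, 92)
)

# Rename the first column only; the other columns keep their names.
colnames(df_demo)[1] <- "first_name"

# Name every row with a letter; seq_len() also behaves well for zero rows.
rownames(df_demo) <- LETTERS[seq_len(nrow(df_demo))]

colnames(df_demo)
## [1] "first_name" "Score"
rownames(df_demo)
## [1] "A" "B" "C"

# Once rows are named, the names can be used for indexing.
df_demo["B", "Score"]
## [1] 88
```

Using seq_len(nrow(x)) instead of 1:nrow(x) is a small defensive choice: for an empty data frame 1:0 would yield c(1, 0), whereas seq_len(0) yields an empty vector.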

Question #7

1 Create a 3 by 3 matrix with the numbers 1 to 9,

2 Convert it to a data-frame,

3 Add names for the columns and rows,

4 Add a column with the column-totals,

5 Drop the second column.

4.3.9 Strings or the Character-type

Strings are called the “character-type” in R. They follow some simple rules:

strings must start and end with single or double quotes,

a string ends when the same quotes are encountered the next time,

until then it can contain the other type of quotes.

Example: Using strings

a <- "Hello"
b <- "world"
paste(a, b, sep = ", ")
## [1] "Hello, world"
c <- "A 'valid' string"

Note – Paste

In many cases we do not need anything between strings that are concatenated. We can of course supply an empty string as separator (sep = ""), but it is also possible to use the shorthand function paste0():

paste0(12, '%')
## [1] "12%"
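As a brief sketch of how the two functions relate, the following lines show that paste() with an empty separator and paste0() give the same result, and that both work element-wise on vectors. The example strings are arbitrary.

```r
# Both calls concatenate without a separator.
paste("R", "Book", sep = "")
## [1] "RBook"
paste0("R", "Book")
## [1] "RBook"

# Both functions are vectorised: shorter arguments are recycled.
paste0("Q", 1:3)
## [1] "Q1" "Q2" "Q3"

# collapse joins the resulting elements into one single string.
paste0("Q", 1:3, collapse = "-")
## [1] "Q1-Q2-Q3"
```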

Formatting with format()

In many cases, it will be useful to format a date or number consistently and neatly in plots and tables. The function format() is a great tool to start formatting.

Function use for format()

format(x, trim = FALSE, digits = NULL, nsmall = 0L,
       justify = c("left", "right", "centre", "none"),
       width = NULL, na.encode = TRUE, scientific = NA,
       big.mark = "", big.interval = 3L,
       small.mark = "", small.interval = 5L,
       decimal.mark = getOption("OutDec"),
       zero.print = NULL, drop0trailing = FALSE, ...)

x is the vector input.

digits is the suggested number of significant digits to display.

nsmall is the minimum number of digits to the right of the decimal point.

scientific is set to TRUE to display scientific notation.

width is the minimum field width; shorter results are padded with blanks to reach it.

justify sets the alignment of character strings: left, right, centre, or none.

Formatting examples

a <- format(100000000, big.mark = " ", nsmall = 3, width = 20,
            scientific = FALSE, justify = "r")
print(a)
## [1] "     100 000 000.000"
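To see what each argument does on its own, the following sketch varies one argument at a time. The input values are arbitrary, and the date example relies on the Date method of format(), which takes a format string as its second argument.

```r
# nsmall guarantees at least two digits after the decimal point.
format(3.5, nsmall = 2)
## [1] "3.50"

# digits suggests the number of significant digits.
format(3.14159, digits = 3)
## [1] "3.14"

# big.mark groups the digits of the integer part.
format(1234567, big.mark = ",")
## [1] "1,234,567"

# width pads numbers with blanks on the left ...
format(42, width = 6)
## [1] "    42"

# ... while justify decides on which side character strings are padded.
format("abc", width = 6)
## [1] "abc   "
format("abc", width = 6, justify = "right")
## [1] "   abc"

# scientific = TRUE forces scientific notation.
format(123456, scientific = TRUE)
## [1] "1.23456e+05"

# Dates are formatted with a format string instead.
format(as.Date("2014-03-01"), "%d/%m/%Y")
## [1] "01/03/2014"
```

Note that justify only affects character input; numbers are always padded on the left, so in a call on a number the justify argument has no visible effect.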

Further information – format()

More information about the format-function can be obtained via ?format or help(format).

Other string functions

nchar(): returns the number of characters in a string

toupper(): puts the string in uppercase

tolower(): puts the string in lowercase

substring(x, first, last): returns a substring from x starting at position “first” and ending at position “last”

strsplit(x, split): splits the elements of a vector into substrings according to matches of the substring “split”

There is also a family of search functions: grep(), grepl(), regexpr(), gregexpr(), and regexec() that supply powerful search capabilities. For replacing, sub() will replace only the first match and gsub() will replace all matches.
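The functions listed above can be illustrated with a short sketch. The strings s and v are arbitrary examples chosen for this illustration; the patterns are plain substrings, although these functions also accept regular expressions.

```r
s <- "The Big R-Book"

nchar(s)
## [1] 14
toupper(s)
## [1] "THE BIG R-BOOK"
tolower(s)
## [1] "the big r-book"

# Characters 5 to 7 (positions are 1-based and inclusive).
substring(s, 5, 7)
## [1] "Big"

# strsplit() returns a list with one element per input string.
strsplit(s, " ")
## [[1]]
## [1] "The"    "Big"    "R-Book"

v <- c("apple", "banana", "cherry")

# grep() returns the indices of the matching elements ...
grep("an", v)
## [1] 2
# ... and grepl() returns one logical value per element.
grepl("an", v)
## [1] FALSE  TRUE FALSE

# regexpr() gives the position of the first match,
# gregexpr() the positions of all matches (as a list).
as.integer(regexpr("an", "banana"))
## [1] 2
as.integer(gregexpr("an", "banana")[[1]])
## [1] 2 4

# sub() replaces only the first match, gsub() replaces all of them.
sub("a", "A", "banana")
## [1] "bAnana"
gsub("a", "A", "banana")
## [1] "bAnAnA"
```

The as.integer() calls are only there to strip the extra attributes (such as match lengths) that regexpr() and gregexpr() attach to their results.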

4.4 Operators

While we already encountered operators in previous sections when we introduced the data types, here we give a systematic overview of operators on base types.
