The Big R-Book. Philippe J. S. De Brouwer
v1 <- c(1,2,3)
v2 <- c(4,5,6)

# Standard arithmetic
v1 + v2
## [1] 5 7 9
v1 - v2
## [1] -3 -3 -3
v1 * v2
## [1]  4 10 18
The dot product and other operations that are not element-wise are available via specialized operators such as %*%: see Section 4.4.1 “Arithmetic Operators” on page 75.
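As a minimal sketch of the difference between element-wise and non-element-wise arithmetic: in base R, the inner product of two numeric vectors can be computed with the matrix-product operator %*%, which returns a 1 x 1 matrix, or equivalently by summing the element-wise product. The vectors repeat the values used above; the names dot_matrix and dot_scalar are chosen for illustration only.

```r
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)

# Matrix product of two vectors of equal length: a 1 x 1 matrix
dot_matrix <- v1 %*% v2

# The same number as a plain scalar: sum of the element-wise product
dot_scalar <- sum(v1 * v2)

print(dot_matrix)
print(dot_scalar)
```

Here 1*4 + 2*5 + 3*6 = 32, so both expressions hold the value 32; only the shape of the result differs.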
4.3.3.2 Vector Recycling
Vector recycling refers to the fact that when an operation is requested on two vectors of different length, the shorter vector is concatenated with itself until it has the required length.
# Define a short and long vector:
v1 <- c(1, 2, 3, 4, 5)
v2 <- c(1, 2)

# Note that R 'recycles' v2 to match the length of v1:
v1 + v2
## Warning in v1 + v2: longer object length is not a multiple of shorter object length
## [1] 2 4 4 6 6
This behaviour is most probably different from what the experienced programmer will expect. Not only can we add or multiply vectors of different types (e.g. integer and double), but we can also do arithmetic on vectors of different sizes. This is usually not what you have in mind, and it does lead to programming mistakes. Do make an effort to avoid vector recycling by explicitly building vectors of the right size.
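One possible way to follow this advice is sketched below: the short vector is extended explicitly with rep() and its length.out argument, and the lengths are checked before the arithmetic is done. The name v2_full is an assumption made for this illustration.

```r
v1 <- c(1, 2, 3, 4, 5)
v2 <- c(1, 2)

# Build a vector of the right size explicitly instead of relying on recycling
v2_full <- rep(v2, length.out = length(v1))

# Fail early if the lengths do not match
stopifnot(length(v1) == length(v2_full))

# No recycling (and hence no warning) is involved in this addition
v_sum <- v1 + v2_full
print(v_sum)
```

The result is the same as with implicit recycling, but the intent is now visible in the code, and a length mismatch stops the script instead of producing a warning that is easy to overlook.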
4.3.3.3 Reordering and Sorting
To sort a vector, we can use the function sort().
# Example 1:
v1 <- c(1, -4, 2, 0, pi)
sort(v1)
## [1] -4.000000  0.000000  1.000000  2.000000  3.141593

# Example 2: To make sorting meaningful, all variables are coerced to
# the most complex type:
v1 <- c(1:3, 2 + 2i)
sort(v1)
## [1] 1+0i 2+0i 2+2i 3+0i

# Sorting is per increasing numerical or alphabetical order:
v3 <- c("January", "February", "March", "April")
sort(v3)
## [1] "April"    "February" "January"  "March"

# Sort order can be reversed:
sort(v3, decreasing = TRUE)
## [1] "March"    "January"  "February" "April"
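The examples above only sort. For reordering, a small sketch using the base R functions order() and rev() may help: order() returns the index permutation that sorts a vector, so the same permutation can be reused on a second vector of the same length. The vector labels is hypothetical and only serves the illustration.

```r
v1     <- c(1, -4, 2, 0, pi)
labels <- c("a", "b", "c", "d", "e")   # hypothetical companion vector

# order() returns the positions of the elements in increasing order
idx <- order(v1)

# Indexing with idx gives the same result as sort(v1)
v1_sorted <- v1[idx]

# The same permutation reorders the companion vector consistently
labels_sorted <- labels[idx]

# rev() simply reverses the current order, without sorting
v1_reversed <- rev(v1)

print(idx)
print(labels_sorted)
```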
The time series nottem (from the package “datasets” that is usually loaded when R starts) contains the average monthly temperatures in Nottingham from 1920 to 1939 in Fahrenheit. Create a new object that contains a list of all temperatures in Celsius.
Note that nottem is a time series object (see Chapter 10 “Time Series Analysis” on page 255) and not a matrix. Its elements are addressed with nottem[n], where n is between 1 and length(nottem). However, when printed it will look like a matrix with months in the columns and years in the rows. This is because the print function uses functionality specific to the time series object. Remember that a temperature in degrees Celsius is obtained from one in degrees Fahrenheit as C = (F - 32) × 5/9.
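One possible approach to this exercise is sketched below; it is not necessarily the solution the book has in mind. It assumes the usual conversion C = (F - 32) × 5/9 and relies on arithmetic being applied element by element. The names nottem_celsius and celsius_values are chosen for this illustration.

```r
# nottem ships with the package "datasets", normally attached at start-up
library(datasets)

# Element-wise conversion; the result is still a time series object
nottem_celsius <- (nottem - 32) * 5 / 9

# Individual elements are addressed by position, as for a vector
first_value <- nottem_celsius[1]

# A plain numeric vector of all temperatures, without time series attributes
celsius_values <- as.numeric(nottem_celsius)

print(class(nottem_celsius))
print(length(celsius_values))
```

Because the conversion keeps the time series class, printing nottem_celsius still shows the month-by-year layout, while celsius_values prints as an ordinary vector.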
4.3.4 Matrices
Matrices are a very important class of objects. They appear in all sorts of practical problems: investment portfolios, landscape rendering in games, image processing in the medical sector, fitting of neural networks, etc.
4.3.4.1 Creating Matrices
A matrix is a two-dimensional data set where all elements are of the same type. The matrix() function offers a convenient way to define it:
# Create a matrix.
M <- matrix(c(1:6), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6

M <- matrix(c(1:6), nrow = 2, ncol = 3, byrow = FALSE)
print(M)
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
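To illustrate what "all elements are of the same type" means in practice, the following sketch inspects the type and dimensions of two small matrices. The names M_num and M_chr are assumptions for this example; the point is that mixing a number with a string coerces every element to character.

```r
# An integer matrix: every element has the same type
M_num <- matrix(1:6, nrow = 2)
type_num <- typeof(M_num)

# Mixing numbers and a string: c() coerces everything to character first
M_chr <- matrix(c(1, "a", 3, 4), nrow = 2)
type_chr <- typeof(M_chr)

# dim() returns the number of rows and columns
dims_num <- dim(M_num)

print(type_num)
print(type_chr)
print(dims_num)
```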
It is also possible to create a unit or zero vector with the same function. If we supply a single scalar instead of a vector to the first argument of the function matrix(), it will be recycled as much as necessary.
# Unit vector:
matrix(1, 2, 1)
##      [,1]
## [1,]    1
## [2,]    1

# Zero matrix or vector:
matrix(0, 2, 2)
##      [,1] [,2]
## [1,]    0    0
## [2,]    0    0

# Recycling also works for shorter vectors:
matrix(1:2, 4, 4)
##      [,1] [,2] [,3] [,4]
## [1,]    1    1    1    1
## [2,]    2    2    2    2
## [3,]    1    1    1    1
## [4,]    2    2    2    2

# Fortunately, R expects that the vector fits exactly n times in the matrix:
matrix(1:3, 4, 4)
## Warning in matrix(1:3, 4, 4): data length [3] is not a sub-multiple or multiple of the number of rows [4]
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    1
## [2,]    2    3    1    2
## [3,]    3    1    2    3
## [4,]    1    2    3    1

# So, the previous call was bound to trigger a warning
# (the matrix is still created).
4.3.4.2 Naming Rows and Columns
While naming rows and/or columns is in general more relevant for datasets than for matrices, it is possible to use a matrix to store data if it contains only one type of variable.
row_names <- c("row1", "row2", "row3", "row4")
col_names <- c("col1", "col2", "col3")
M <- matrix(c(10:21), nrow = 4, byrow = TRUE,
            dimnames = list(row_names, col_names))
print(M)
##      col1 col2 col3
## row1   10   11   12
## row2   13   14   15
## row3   16   17   18
## row4   19   20   21
Once the matrix exists, the columns and rows can be renamed with the functions colnames() and rownames().
colnames(M) <- c('C1', 'C2', 'C3')
rownames(M) <- c('R1', 'R2', 'R3', 'R4')
M
##    C1 C2 C3
## R1 10 11 12
## R2 13 14 15
## R3 16 17 18
## R4 19 20 21
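As a complementary sketch, the base R function dimnames() can read or replace the row and column names in one step, and the names can then be used to address elements. The matrix below is rebuilt so that the example is self-contained; the name element is an assumption for this illustration.

```r
# Rebuild the 4 x 3 matrix used above, without names
M <- matrix(10:21, nrow = 4, byrow = TRUE)

# Set row names and column names together: rows first, then columns
dimnames(M) <- list(c("R1", "R2", "R3", "R4"),
                    c("C1", "C2", "C3"))

# Names can be used instead of positions to address an element
element <- M["R2", "C3"]

print(dimnames(M))
print(element)
```

Since the matrix is filled by row, the second row holds 13, 14 and 15, so M["R2", "C3"] is 15.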
4.3.4.3 Access Subsets of a Matrix
It might be obvious that we can access one element of a matrix by using its row and column number. That is not all: R has a very flexible, but logical, model implemented. Let us consider a few examples that speak for themselves.
M <- matrix(c(10:21),