The Big R-Book. Philippe J. S. De Brouwer

    and 'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    [Previously saved workspace restored]

    >

Listing 3.2: Another example of command line instructions: factor, calc, and pi. This example only has CLI code and does not start R.

    $ factor 1492
    1492: 2 2 373
    $ calc 2*2*373
    1492
    $ pi 60
    3.14159265358979323846264338327950288419716939937510582097494

Note that in these environments, we do not “comment out” the output. We promise to avoid mixing input and output, but in some cases the output will simply be there. In general, the commands can therefore only be copied line by line to see the output on the screen. Copying the whole block and pasting it at the command prompt leads to error messages rather than to the code being executed. This is unlike the R code, which can be copied as a whole and pasted at the R command prompt, where it should all work fine.

Questions or tasks look as follows:

Question #1 Histogram

Consider Figure 3.1 on page 13. Now, imagine that you did not generate the data, but someone gave it to you, so that you do not know how it was generated. Then what could this data represent? Or, rephrased, what could xs be? Does it look familiar?

Figure 3.1: An example showing the histogram of data generated from the normal distribution.
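A figure of this kind could be produced along the following lines. This is only a minimal sketch: the sample size, the seed, and the variable name x are assumptions made for illustration, not the settings used for Figure 3.1.

```r
# Assumed settings: 1000 draws and an arbitrary seed for reproducibility
set.seed(1)
x <- rnorm(1000)   # data from the standard normal distribution

# hist() draws the histogram and (invisibly) returns its bins and counts
h <- hist(x)
```

The returned object h holds the break points and the counts per bin, which can be handy when the plot itself is not enough.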

Questions or tasks can be answered with the tools and methods explained previously. Note that this might require some research of your own, such as looking into the help files or other documentation (we will of course explain how to access these). If you are using this book to prepare for an exam or test, working through them is probably good preparation, but if you are in a hurry they can be skipped (in this book they do not add to the material explained). In general, however, thinking about the issues presented will help you to solve your data problems more efficiently.

Note that the answers to most questions can be found in Appendix E “Answers to Selected Questions” on page 1061. The answer might not always be detailed, but it should be enough to solve the issue raised.

Definition: This is a definition

This is not a book about exact mathematics. This is a pragmatic book with a focus on practical applications. Therefore, we use the word “definition” also in a practical sense.

Definitions are not always as rigorous as a mathematician would be used to. We rather use practical definitions (e.g. how a function is implemented).

The use of a function is – mainly at the beginning of the book – highlighted as follows. For example:

Function use for mean()

mean(x, na.rm = FALSE, trim = 0, ...)

Where

x is an R-object,

na.rm is a boolean (setting this to TRUE will remove missing values),

trim is the fraction of observations to be removed from each end of the sorted data before the mean is computed – the default is 0 and it cannot be higher than 0.5.

From this example, it should be clear how the function mean() is used. Note the following:

The name of the function is in the title.

On the first line we repeat the function with its most important parameters.

The parameter x is an obligatory parameter.

The parameter na.rm can be omitted. When it is not provided, the default FALSE is used. A parameter with a default can be recognised in the first line via the equal sign.

The three dots indicate that other parameters can be passed that will not be used by the function mean(), but they are passed on to further methods.

Some functions can take a lot of parameters. In some cases, we only show the most important ones.
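The points above can be tried out directly. The following is a minimal sketch: the vector y and the wrapper mean_rounded() are hypothetical and introduced only for illustration.

```r
# Hypothetical data: one missing value (NA) and one outlier (100)
y <- c(1, 2, 3, 4, 100, NA)

# x is obligatory: calling mean() without it raises an error
no_x <- try(mean(), silent = TRUE)

# With the default na.rm = FALSE, the missing value propagates
m_default <- mean(y)                            # NA

# na.rm = TRUE drops the NA first: (1 + 2 + 3 + 4 + 100) / 5
m_no_na <- mean(y, na.rm = TRUE)                # 22

# trim = 0.2 also drops 20% of the observations at each end
# (here the smallest and the largest value), leaving 2, 3, 4
m_trimmed <- mean(y, na.rm = TRUE, trim = 0.2)  # 3

# Hypothetical wrapper: it does not use na.rm itself, but forwards
# whatever arrives via the three dots to mean()
mean_rounded <- function(x, digits = 1, ...) {
  round(mean(x, ...), digits)
}
m_wrapped <- mean_rounded(y, na.rm = TRUE)      # 22
```

The wrapper shows the same mechanism from the other side: a function that declares the three dots can hand unknown parameters on to the functions it calls.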

Later on in the book, we assume that the reader is able to use the examples and find more about the function in its documentation. For example, ?mean will display information about the function mean.
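Besides the help page, base R can also list a function's parameters and defaults in the console. A minimal sketch using only base R functions follows; note that mean() is a generic function, so the defaults shown in the box above belong to its default method, mean.default().

```r
# ?mean is shorthand for help("mean"); both open the documentation page
# help("mean")

# The generic itself only declares x and the three dots
generic_args <- names(formals(mean))   # "x" "..."

# The parameters with defaults are defined by the default method
default_args <- formals(mean.default)
default_names <- names(default_args)   # "x" "trim" "na.rm" "..."
default_args$na.rm                     # FALSE
default_args$trim                      # 0
```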

When a new concept or idea is built up with examples, they generally appear just like that in the text. Additional examples after the point is made are usually highlighted as follows:

Example: Mean

An example of a function is mean(). As the name suggests, it calculates the arithmetic mean (average) of the data fed into the function.

    # First, generate some data:
    x <- c(1,2,3)
    # Then calculate the mean:
    mean(x)
    ## [1] 2

Some example environments are split in two parts: the question and the solution as follows:

Example: Mean

What is the mean of all integer numbers from one to 100? Use the function mean().

    mean(1:100)
    ## [1] 50.5
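The result can be cross-checked by hand. The sketch below compares the call from the example with the sum divided by the count and with the closed form (n + 1) / 2, which follows from 1 + 2 + ... + n = n(n + 1)/2. The variable names are chosen here for illustration only.

```r
n <- 100

by_function <- mean(1:n)      # the call from the example
by_hand     <- sum(1:n) / n   # sum of the numbers divided by their count
closed_form <- (n + 1) / 2    # from 1 + 2 + ... + n = n * (n + 1) / 2

# All three approaches give 50.5
c(by_function, by_hand, closed_form)
```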

There are a few more special features in the layout that might be of interest.

A hint is something that adds additional practical information that is not part of the normal flow of the text.

Hint – Using the hint boxes

When first studying a section, skip the hints, and when reading it a second time pay more attention to the hints.

When we want to draw attention to something that might or might not be clear from the normal flow of the text, we put it in a “notice environment.” This looks as follows:

Note – Layout details

Note that hints, notes and warnings all look similar, but for your convenience they are distinguished by colour and layout details.

There are more such environments and we let them speak for themselves.

Digression – This is good to know

A digression does what you would expect from it. It is not necessary to read it in order to understand the rest of the chapter. However, it provides further background that is useful for gaining a deeper insight into the subject discussed.

Warning – Read comments in code