R For Dummies. Vries Andrie de
55443 83382 110359 138801 167130 138808
9 10 11 12
110920 83389 55816 27945
Each line of R code in this example is preceded by one of two symbols:
✔ >: The prompt symbol, >, is not part of your code, and you should not type this when you try the code yourself.
✔ +: The continuation symbol, +, indicates that this line of code still belongs to the previous line of code. In fact, you don’t have to break a line of code into two, but we do this frequently, because it improves the readability of code and helps it fit into the pages of a book.
Lines that start without either the prompt or the continuation symbol are output produced by R. In this case, you get the total number of throws where the dice added up to the numbers 2 through 12. For example, out of 1 million throws of the dice, on 28,007 occasions the numbers on the dice added to 2.
You can copy these code snippets and run them in R, but you have to type them exactly as shown. There are only three exceptions:
✔ Don’t type the prompt symbol, >.
✔ Don’t type the continuation symbol, +.
✔ Where you put spaces or tabs isn’t critical, as long as it isn’t in the middle of a keyword. Pay attention to new lines, though.
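As a minimal sketch of these rules (the variable names here are our own illustration, not part of the original example), the following snippet writes the same call on one line, split over two lines, and with extra spaces. It is shown without prompt or continuation symbols, so you can paste it as is; if you type the two-line version at the console, R displays + at the start of the second line.

```r
# The same call written on one line...
one_line <- paste("Hello", "world!")

# ...and split across two lines. R knows the call is unfinished
# because the closing parenthesis has not appeared yet.
two_lines <- paste("Hello",
                   "world!")

# Extra spaces between arguments do not change the result either.
spaced <- paste( "Hello" ,   "world!" )

identical(one_line, two_lines)
identical(one_line, spaced)
```

Both comparisons should return TRUE. Note that the line break in the second version falls after a comma, where the expression is clearly incomplete; a break after a complete expression would end the command instead.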
Instructions to type code into the R console have the > symbol to the left:
> print("Hello world!")
If you type this into a console and press Enter, R responds with:
[1] "Hello world!"
For convenience, we collapse these two events into a single block, like this:
> print("Hello world!")
[1] "Hello world!"
Functions, arguments, and other R keywords appear in monofont. For example, to create a plot, you use the plot() function. Function names are followed by parentheses – for example, plot(). We don’t add arguments to the function names mentioned in the text, unless it’s really important.
On some occasions we talk about menu commands, such as File⇒Save. This just means that you open the File menu and choose the Save option.
What You’re Not to Read
You can use this book however works best for you, but if you’re pressed for time (or just not interested in the nitty-gritty details), you can safely skip anything marked with a Technical Stuff icon. You also can skip sidebars (text in gray boxes); they contain interesting information, but nothing critical to your understanding of the subject at hand.
Foolish Assumptions
This book makes the following assumptions about you and your computer:
✔ You know your way around a computer. You know how to download and install software. You know how to find information on the Internet and you have Internet access.
✔ You’re not necessarily a programmer. If you are a programmer, and you’re used to coding in other languages, you may want to read the notes marked by the Technical Stuff icon – there, we fill you in on how R is similar to, or different from, other common languages.
✔ You’re not a statistician, but you understand the very basics of statistics. R For Dummies isn’t a statistics book, although we do show you how to do some basic statistics using R. If you want to understand the statistical stuff in more depth, we recommend Statistics For Dummies, 2nd Edition, by Deborah J. Rumsey, PhD (Wiley).
✔ You want to explore new stuff. You like to solve problems and aren’t afraid of trying things out in the R console.
How This Book Is Organized
The book is organized in six parts. Here’s what each of the six parts covers.
In this part, you write your first script. You use the powerful concept of vectors to make simultaneous calculations on many variables at once. You work with the R workspace (in other words, how to create, modify, or remove variables). You find out how to save your work and retrieve and modify script files that you wrote in previous sessions. We also introduce some fundamentals of R (for example, how to install packages).
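To give a flavor of what calculating on a whole vector at once looks like, here is a small sketch. The values and variable names are made up for illustration and do not come from a later chapter.

```r
# A vector of five numbers (illustrative values only).
x <- c(1, 2, 3, 4, 5)

# Arithmetic is applied to every element at once; no loop is needed.
doubled <- x * 2
shifted <- x + 10

doubled   # 2 4 6 8 10
shifted   # 11 12 13 14 15
```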
In this part, we fill you in on the three R’s: reading, ’riting, and ’rithmetic – in other words, working with text and numbers (and dates for good measure). You also get to use the very important data structures of lists and data frames.
R is a programming language, so you need to know how to write and understand functions. In this part, we show you how to do this, how to control the logic flow of your scripts by making choices using if statements, and how to loop through your code to perform repetitive actions. We explain how to make sense of and deal with warnings and errors that you may experience in your code. Finally, we show you some tools to debug any issues that you may experience.
In this part, we introduce the different data structures that you can use in R, such as lists and data frames. You find out how to get your data in and out of R (for example, by reading data from files or the Clipboard). You also see how to interact with other applications, such as Microsoft Excel.
Then you discover how easy it is to do some advanced data reshaping and manipulation in R. We show you how to select a subset of your data and how to sort and order it. We explain how to merge different datasets based on columns they may have in common. Finally, we show you a very powerful generic strategy of splitting and combining data and applying functions over subsets of your data. When you understand this strategy, you can use it over and over again to do sophisticated data analyses in only a few small steps.
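The splitting, applying, and combining strategy can be sketched in a few lines of base R. This is only one possible illustration, assuming a small made-up data frame called scores; it is not necessarily the approach used later in the book.

```r
# Hypothetical data: values recorded for two groups.
scores <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Split the values by group, apply mean() to each piece,
# and combine the results into one named vector.
pieces <- split(scores$value, scores$group)
group_means <- sapply(pieces, mean)

# tapply() performs the same three steps in a single call.
group_means_short <- tapply(scores$value, scores$group, mean)

group_means
```

Here group a has a mean of 2 and group b a mean of 4. Swapping mean() for another summary function, such as max() or length(), reuses the same pattern without any other changes.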
After reading this part, you’ll know how to describe and summarize your variables and data using R. You’ll be able to do some classical tests (for example, calculating a t-test). And you’ll know how to use random numbers to simulate some distributions.
Finally, we show you some of the basics of using linear models (for example, linear regression and analysis of variance). We also show you how to use R to predict the values of new data using models that you’ve fitted to your data.
They say that a picture is worth a thousand words. This is certainly the case when you want to share your results with other people. In this part, you discover how to create basic and more sophisticated plots to visualize your data. We move on from bar charts and line charts, and show you how to present cuts of your data using facets.
In this part, we show you how to do ten things in R that you probably use Microsoft Excel for at the moment (for example, how to do the equivalent of pivot tables and lookup tables). We also give you ten tips for working with packages that are not part of base R.
Icons Used in This Book
As you read this book, you’ll find little pictures in the margins. These pictures, or icons, mark certain types of text:
When you see the Tip icon, you can be sure to find a way to do something more easily or quickly.
You don’t have to memorize this book, but the Remember icon points out some useful things that you really should remember. Usually this indicates a design pattern