Probability with R. Jane M. Horgan

      The mean of a data set $x$ with $n$ values is given by

      $$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

      In standard programming languages, implementing this formula would require initialization and loops, but in R, statistical calculations such as these are much easier to implement. For example,

      sum(downtime)

      gives

      [1] 576

      which is the sum of the elements in downtime, and

      length(downtime)

      gives

      [1] 23

      which is the number of elements in downtime.

      To calculate the mean, write

      meandown <- sum(downtime)/length(downtime)
      meandown
      [1] 25.04348
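      To make the contrast with loop-based languages concrete, here is a minimal sketch that computes the same mean twice, once with an explicit accumulator loop and once with the vectorised expression. The downtime values are an assumption: they are recovered from the differences printed in the standard deviation calculation in this section (each difference plus the mean 576/23), and the names total, loop_mean and vector_mean are illustrative only.

```r
# Assumed data: recovered from the printed values of downtime - meandown.
downtime <- c(0, 1, 2, 12, 12, 14, 18, 21, 21, 23, 24, 25,
              29, 28, 30, 30, 30, 33, 36, 44, 45, 47, 51)

# Loop style: initialise an accumulator, then add one element at a time.
total <- 0
for (value in downtime) {
  total <- total + value
}
loop_mean <- total / length(downtime)

# Vectorised style: sum() and length() operate on the whole vector at once.
vector_mean <- sum(downtime) / length(downtime)

# The two approaches should agree.
all.equal(loop_mean, vector_mean)
```

      Both versions give the same answer; the vectorised form simply removes the bookkeeping.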

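      Let us also look at how to calculate the standard deviation of the data in downtime.

      The formula for the standard deviation of $n$ data points stored in a vector $x$ is

      $$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}}$$

      where $\bar{x}$ is the mean of the data. We illustrate step by step how this is calculated for downtime. First, subtract the mean from each data point.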

      downtime - meandown
       [1] -25.04347826 -24.04347826 -23.04347826 -13.04347826 -13.04347826
       [6] -11.04347826  -7.04347826  -4.04347826  -4.04347826  -2.04347826
      [11]  -1.04347826  -0.04347826   3.95652174   2.95652174   4.95652174
      [16]   4.95652174   4.95652174   7.95652174  10.95652174  18.95652174
      [21]  19.95652174  21.95652174  25.95652174

      Then, obtain the squares of these differences.

      (downtime - meandown)^2
       [1] 6.271758e+02 5.780888e+02 5.310019e+02 1.701323e+02 1.701323e+02
       [6] 1.219584e+02 4.961059e+01 1.634972e+01 1.634972e+01 4.175803e+00
      [11] 1.088847e+00 1.890359e-03 1.565406e+01 8.741021e+00 2.456711e+01
      [16] 2.456711e+01 2.456711e+01 6.330624e+01 1.200454e+02 3.593497e+02
      [21] 3.982628e+02 4.820888e+02 6.737410e+02

      Sum the squared differences.

      sum((downtime - meandown)^2)
      [1] 4480.957

      Finally, divide this sum by length(downtime)‐1 and take the square root.

      sqrt(sum((downtime - meandown)^2)/(length(downtime)-1))
      [1] 14.27164

      You will recall that R has built-in functions to calculate the most commonly used statistical measures; in particular, the mean and the standard deviation can be obtained directly with

      mean(downtime)
      [1] 25.04348
      sd(downtime)
      [1] 14.27164

      We took you through the calculations to illustrate how easy it is to program in R.
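      The individual steps can be gathered into a short script. The sketch below is one possible arrangement, not the book's own listing: the downtime values are an assumption (recovered from the printed differences), and the intermediate names deviations, sum_squares and manual_sd are illustrative.

```r
# Assumed data: recovered from the printed values of downtime - meandown.
downtime <- c(0, 1, 2, 12, 12, 14, 18, 21, 21, 23, 24, 25,
              29, 28, 30, 30, 30, 33, 36, 44, 45, 47, 51)

n <- length(downtime)

# Step 1: subtract the mean from every element.
deviations <- downtime - mean(downtime)

# Step 2: square the differences and add them up.
sum_squares <- sum(deviations^2)

# Step 3: divide by n - 1 and take the square root.
manual_sd <- sqrt(sum_squares / (n - 1))

# Check the hand-built result against the built-in function.
all.equal(manual_sd, sd(downtime))
```

      Using all.equal() rather than == avoids spurious mismatches caused by floating-point rounding.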

      2.4.1 Creating Functions

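      The skewness coefficient is defined as

      $$\text{skew} = \frac{\sqrt{n}\,\sum_{i=1}^{n}(x_i - \bar{x})^3}{\left(\sum_{i=1}^{n}(x_i - \bar{x})^2\right)^{3/2}}$$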

      A perfectly symmetrical set of data will have a skewness of 0. When the skewness coefficient is substantially greater than 0, the data are positively asymmetric with a long tail to the right; a negative skewness coefficient means that the data are negatively asymmetric with a long tail to the left. As a rule of thumb, if the skewness is outside the interval $(-1, 1)$, the data are considered to be highly skewed. If it is between $-1$ and $-0.5$ or between 0.5 and 1, the data are moderately skewed.
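      The rule of thumb translates directly into a small helper. This is a sketch only: classify_skew is a hypothetical name, the label "approximately symmetric" for values between $-0.5$ and 0.5 is an assumption (the rule above does not name that case), and the handling of values exactly on a boundary is a design choice. The inputs are the three skewness values printed in this section.

```r
# Hypothetical helper: label one skewness value using the rule of thumb.
classify_skew <- function(g) {
  if (abs(g) > 1) {
    "highly skewed"            # outside the interval (-1, 1)
  } else if (abs(g) >= 0.5) {
    "moderately skewed"        # between -1 and -0.5, or between 0.5 and 1
  } else {
    "approximately symmetric"  # assumed label for the remaining case
  }
}

# Skewness values quoted in this section.
values <- c(-0.04818095, 1.322147, 0.4651059)

# vapply() applies the scalar helper to each value in turn.
labels <- vapply(values, classify_skew, character(1))
labels
```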

      Example 2.2 A program to calculate skewness

      The following code defines a function called skew, with one argument x, that calculates the skewness coefficient of a set of data.

      skew <- function(x) {
        xbar <- mean(x)
        sum2 <- sum((x - xbar)^2, na.rm = TRUE)   # sum of squared deviations
        sum3 <- sum((x - xbar)^3, na.rm = TRUE)   # sum of cubed deviations
        skew <- (sqrt(length(x)) * sum3)/(sum2^(1.5))
        skew                                      # value returned by the function
      }

      You will agree that the conventions of vector calculations make it very easy to calculate statistical functions.

      When skew has been defined, you can calculate the skewness on any data set. For example,

      skew(downtime)

      gives

      [1] -0.04818095

      which indicates that the downtime data are slightly negatively skewed.

      skew(usage)
      [1] 1.322147

      skew(usage[3:9])
      [1] 0.4651059

      which is very much smaller than that obtained with the full set.
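      One point to watch in Example 2.2: na.rm affects only the two sums, while mean(x) and length(x) still see any missing values, so a single NA in x prevents a usable result. A possible variant, under the hypothetical name skew_complete, drops missing values before doing anything else. The downtime values are again an assumption, recovered from the printed differences.

```r
# Hypothetical variant of skew(): remove missing values up front so that
# the mean, the sums and the count all use the same observations.
skew_complete <- function(x) {
  x <- x[!is.na(x)]
  dev <- x - mean(x)
  sqrt(length(x)) * sum(dev^3) / sum(dev^2)^1.5
}

# Assumed data: recovered from the printed values of downtime - meandown.
downtime <- c(0, 1, 2, 12, 12, 14, 18, 21, 21, 23, 24, 25,
              29, 28, 30, 30, 30, 33, 36, 44, 45, 47, 51)

skew_complete(downtime)           # complete data
skew_complete(c(downtime, NA))    # same data with one missing value appended
```

      With complete data the variant follows the same formula as skew, so the two calls above should return the same value.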

      2.4.2 Scripts

      There are various ways of developing programs in R.

      The most useful way of writing programs is by means of R's own built-in script editor. From File at the toolbar, click on New Script (File/New Script). You are then presented with a blank screen in which to develop your program. When done, you may save and retrieve this program as you wish. File/Save causes the file to be saved. You may choose the name you want to give it, and it will be given a .R extension. In subsequent sessions, File/Open Script brings up all the .R files that you have saved. You can select the one you wish to use.

      When you want to execute a line or group of lines, highlight them and press Ctrl/R, that is, Ctrl and the letter R simultaneously. The commands are then transferred to the console window and executed.
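      Besides running highlighted lines, a saved script can be executed in one go with base R's source() function. The sketch below uses a temporary file as a stand-in for a script saved from the editor; the file contents and the names script_path, x and xbar are purely illustrative.

```r
# Stand-in for a script saved with a .R extension.
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "x <- c(2, 4, 6, 8)",
  "xbar <- mean(x)"
), script_path)

# Run every line of the file; echo = TRUE prints each command as it runs.
source(script_path, echo = TRUE)

# Objects created by the script are now available in the session.
xbar
```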

      Alternatively, if the program is short, it may be developed interactively while working at your computer.

      Programs may also be developed in a text editor, like Notepad, saved with the .R extension and retrieved using the source

