Probability with R. Jane M. Horgan
href="#fb3_img_img_1d5524f8-0129-580f-b9a1-6a417959714b.png" alt="images"/> from
Often, the data for supervised learning are randomly divided into two parts, one for training and the other for testing. In machine learning, we derive the line of best fit from the training set
The testing set is used to see how well the line actually fits. Usually, an
Example 3.1
Suppose there are 50 pairs
TABLE 3.1 The Training Set
Observation Numbers |
|
|
Observation Numbers |
|
|
1 | 11.8 | 31.3 | 21 | 15.1 | 80.1 |
2 | 10.8 | 59.9 | 22 | 14.7 | 66.9 |
3 | 8.6 | 27.6 | 23 | 10.5 | 42.0 |
4 | 10.3 | 57.7 | 24 | 10.9 | 72.9 |
5 | 8.5 | 50.2 | 25 | 11.6 | 67.8 |
6 | 11.6 | 52.1 | 26 | 9.1 | 45.3 |
7 | 14.4 | 79.1 | 27 | 5.4 | 30.2 |
8 | 8.6 | 32.3 | 28 | 8.8 | 49.6 |
9 | 12.4 | 58.8 | 29 | 11.2 | 44.3 |
10 | 14.9 | 79.5 | 30 | 7.4 | 46.1 |
11 | 8.9 | 57.0 | 31 | 7.9 | 45.1 |
12 | 8.7 | 35.1 | 32 | 12.2 | 46.5 |
13 | 11.7 | 68.2 | 33 | 8.5 | 42.7 |
14 | 11.4 | 60.1 | 34 | 9.3 | 56.3 |
15 | 8.8 | 44.5 | 35 | 10.0 | 27.4 |
16 | 5.9 | 28.9 | 36 | 3.8 | 20.2 |
17 | 13.5 | 75.8 | 37 | 14.9 | 68.5 |
18 | 8.7 | 48.7 | 38 | 12.4 | 72.6 |
19 | 11.0 | 54.7 | 39 |
|