Applied Biostatistics for the Health Sciences. Richard J. Rossi
Frontal and vertex balding areas merge into one and increase in size.
7 VII All hair is lost along the front hairline and crown.
Clearly, the values of the variable Baldness indicate an increasing degree of hair loss, and thus, Baldness as measured on the Norwood–Hamilton scale is an ordinal variable. This variable is also measured on the Offspring Cohort in the Framingham Heart Study.
2.1.2 Quantitative Variables
A quantitative variable is a variable that takes only numeric values. The values of a quantitative variable are said to be measured on an interval scale when the difference between two values is meaningful; they are said to be measured on a ratio scale when the ratio of two values is also meaningful. The key difference between the two scales is that a ratio scale has a “natural zero” representing the absence of the attribute being measured, while an interval scale has no natural zero. When a measurement scale has a natural zero, the ratio of two measurements is a meaningful measure of how many times larger one value is than the other. For example, the variable Fat that represents the grams of fat in a food product is measured on a ratio scale because the value Fat = 0 indicates that the unit contained absolutely no fat. When a scale of measurement does not have a natural zero, only the difference between two measurements is a meaningful comparison of the two values. For example, the variable Body Temperature is measured on a scale that has no natural zero since Body Temperature = 0 does not indicate that the body has no temperature.
Since interval scales are ordered, the difference between two values measures how much larger one value is than another. A ratio scale is also an interval scale but has the additional property that the ratio of two values is meaningful. Thus, for a variable measured on an interval scale the difference of two values is the meaningful way to compare the values, and for a variable measured on a ratio scale both the difference and the ratio of two values are meaningful ways to compare different values of the variable. For example, body temperature in degrees Fahrenheit is a variable that is measured on an interval scale so that it is meaningful to say that a body temperature of 98.6 and a body temperature of 102.3 differ by 3.7 degrees; however, it would not be meaningful to say that a temperature of 102.3 is 1.04 times as much as a temperature of 98.6. On the other hand, the variable weight in pounds is measured on a ratio scale, and therefore, it would be proper to say that a weight of 210 lb is 1.4 times a weight of 150 lb; it would also be meaningful to say that a weight of 210 lb is 60 lb more than a weight of 150 lb.
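One way to see why the ratio is not meaningful on an interval scale is to change the units of measurement. The short Python sketch below is an illustration only, not part of the original text: the helper name f_to_c and the choice of Celsius and kilograms as the second units are assumptions made for the example. It re-expresses the two body temperatures in degrees Celsius and the two weights in kilograms, then compares the ratios before and after the conversion.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius (hypothetical helper)."""
    return (deg_f - 32.0) * 5.0 / 9.0

LB_TO_KG = 0.45359237  # kilograms per pound

# Interval scale: the two body temperatures used in the text.
t1_f, t2_f = 98.6, 102.3
t1_c, t2_c = f_to_c(t1_f), f_to_c(t2_f)

temp_diff_f = t2_f - t1_f    # difference in Fahrenheit degrees
temp_diff_c = t2_c - t1_c    # same difference in Celsius degrees
temp_ratio_f = t2_f / t1_f   # ratio computed in Fahrenheit
temp_ratio_c = t2_c / t1_c   # ratio computed in Celsius

# Ratio scale: the two weights used in the text.
w1_lb, w2_lb = 150.0, 210.0
w1_kg, w2_kg = w1_lb * LB_TO_KG, w2_lb * LB_TO_KG

weight_ratio_lb = w2_lb / w1_lb
weight_ratio_kg = w2_kg / w1_kg

print(f"temperature ratio (F): {temp_ratio_f:.4f}")
print(f"temperature ratio (C): {temp_ratio_c:.4f}")
print(f"weight ratio (lb):     {weight_ratio_lb:.4f}")
print(f"weight ratio (kg):     {weight_ratio_kg:.4f}")
```

The temperature ratio changes when the units change, because the zero points of the Fahrenheit and Celsius scales are arbitrary, whereas the temperature difference is simply rescaled by the factor 9/5. The weight ratio is the same in pounds and in kilograms, since both scales share the natural zero of no weight.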
Example 2.4
The following questions were asked in the Framingham Heart Study on the Offspring Cohort and the corresponding variables recorded. The variables are listed in parentheses after each question. Determine which of these variables are qualitative and which are quantitative. For the qualitative variables, determine whether they are nominal or ordinal variables.
1 What is your gender? (Gender)
2 Systolic blood pressure (Systolic Blood Pressure)
3 Do you smoke? (Smoke)
4 How many cigarettes do you smoke per day? (No. Cigarettes)
5 What is your age? (Age)
6 How many times per week do you engage in intense physical activity? (No. Physical Activity)
7 How is your health now? (Health)
Solutions
1 Gender is a nominal qualitative variable.
2 Systolic Blood Pressure is a quantitative variable.
3 Smoke is a nominal qualitative variable.
4 No. Cigarettes is a quantitative variable.
5 Age is a quantitative variable.
6 No. Physical Activity is a quantitative variable.
7 Health is an ordinal qualitative variable.
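When a data set is prepared for analysis, it can help to record the type of each variable explicitly. The following Python sketch stores the classifications from the solutions above in a small lookup table; the names VariableType, example_2_4, and variables_of_type are hypothetical and were chosen only for this illustration.

```python
from enum import Enum


class VariableType(Enum):
    """Variable types used in this example (hypothetical names)."""
    NOMINAL = "nominal qualitative"
    ORDINAL = "ordinal qualitative"
    QUANTITATIVE = "quantitative"


# Classifications taken directly from the solutions to the example.
example_2_4 = {
    "Gender": VariableType.NOMINAL,
    "Systolic Blood Pressure": VariableType.QUANTITATIVE,
    "Smoke": VariableType.NOMINAL,
    "No. Cigarettes": VariableType.QUANTITATIVE,
    "Age": VariableType.QUANTITATIVE,
    "No. Physical Activity": VariableType.QUANTITATIVE,
    "Health": VariableType.ORDINAL,
}


def variables_of_type(classification, variable_type):
    """Return the names of the variables having the given type, in table order."""
    return [name for name, vtype in classification.items()
            if vtype is variable_type]


for vtype in VariableType:
    names = variables_of_type(example_2_4, vtype)
    print(f"{vtype.value}: {', '.join(names)}")
```

A table of this kind makes it easy to check, before an analysis is run, that a summary such as a mean or a ratio is requested only for the quantitative variables.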
A quantitative variable can also be classified as either a discrete variable or a continuous variable. A quantitative variable is a discrete variable when it can take on a finite or a countable number of values; a quantitative variable is a continuous variable when it can take on any value in one or more intervals. Note that the values that a discrete variable can take on are distinct, isolated, and can be counted. In the previous example, the variables Age, No. Cigarettes, and No. Physical Activity are discrete variables. A counting variable is a specialized discrete variable that simply counts how many times a particular event has occurred. The values a counting variable can take on are the nonnegative integers 0, 1, 2, 3, …. For example, in the Framingham Heart Study the variables No. Cigarettes and No. Physical Activity are counting variables.
Example 2.5
The following variables are all counting variables:
1 The number of cancer patients in remission following treatment at a hospital.
2 The number of laboratory mice that survive in an experiment.
3 The number of white blood cells in a 10 ml blood sample.
Continuous variables are variables that can take on any value in one or more intervals. Examples of continuous variables are the exact weight of a subject, the exact dose of a drug, and the exact height of a subject. In most problems, there will be variables of interest that are continuous variables, but because a variable can only be measured to a limited precision, a discrete version of the variable is used. For example, the weight of a subject is a continuous variable, but when it is recorded only to the nearest pound or tenth of a pound, the recorded variable is a discrete variable.
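The effect of limited measurement precision can be sketched in a few lines of Python. The function name record and the weights used below are made up for illustration; the function rounds an exact value to a fixed number of decimal places, which is one simple model of a measuring device.

```python
def record(exact_value, decimals):
    """Round an exact measurement to the given number of decimal places.

    This is a simplified, hypothetical model of a measuring device with
    limited precision.
    """
    return round(exact_value, decimals)


exact_weight = 154.3784  # hypothetical exact weight in pounds

weight_to_pound = record(exact_weight, 0)   # recorded to the nearest pound
weight_to_tenth = record(exact_weight, 1)   # recorded to the nearest tenth

print(weight_to_pound)
print(weight_to_tenth)

# Different exact weights can produce the same recorded value, so the
# recorded variable takes only isolated values and is therefore discrete.
same_recorded_value = record(154.36, 1) == record(154.44, 1)
print(same_recorded_value)
```

Every exact weight within roughly 0.05 lb of 154.4 is recorded as 154.4 when the scale reads in tenths of a pound, so the recorded values are separated by gaps of 0.1 lb even though the underlying weight can take any value in an interval.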
Example 2.6
The following variables are continuous variables that might be measured on a discrete measurement scale:
1 Body temperature since it is usually measured in tenths of degrees.
2 Lung capacity since it is a volume and is usually measured in cubic centimeters.
3 Tumor size since it is measured as a depth in tenths of centimeters.
It is important that a variable truly reflects the characteristic being studied. A variable is said to be a valid variable when the measurements on the variable truly represent the characteristic the variable is supposed to be measuring. The validity of a variable depends on the characteristic being measured and the measuring device being used to measure the characteristic. When a characteristic of a unit is subjective in nature, it will be difficult to measure the characteristic accurately, and in this case, the validity of any variables used to measure this subjective characteristic is usually questionable.
Example 2.7
The intelligence of an individual is a subjectively measured characteristic. There are many tests that have been developed to measure intelligence. For example, the Fagan test measures the amount of time an infant spends inspecting a new object and compares this time with the time spent inspecting a familiar object (Fagan and Detterman, 1992). The validity of the Fagan test as a measure of intelligence, however, has been questioned by several scientists who have studied the relationship between intelligence and the Fagan test scores.
The diagram in Figure 2.1 summarizes the different types of variables/data that can be observed.
Figure 2.1 Different types of classifications for variables.
2.1.3 Multivariate Data
In most research problems, there will be many variables that need to be measured.