Interpreting and Using Statistics in Psychological Research. Andrew N. Christopher
we are left to figure out how frequently each score occurred. Furthermore, we don’t know with much precision what “high” and “low” scores are at this point.
So let’s now consider Table 3.3, which contains a second type of table to organize data for a variable. This table contains a frequency distribution in which all of the burnout scores are listed along with how often (frequently) each score appears in the dataset. Importantly, we can discern the relative frequency (percentage of the sample that had a particular score on that variable) of a given score, allowing us to be more precise in what we mean by “high” and “low” scores on this variable. As you can see in Table 3.3, the score of 41 occurred three times, representing 2.78% of the total scores. Likewise, the scores of 28 and 31 each occurred more often than any other score in this dataset.
Frequency distribution: table that contains all scores in a dataset, along with each score’s frequency of occurrence.
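As a minimal sketch of how such a table could be built, the Python snippet below counts how often each score occurs and converts each count to a percentage of the sample. The scores and the function name `frequency_distribution` are made up for illustration; they are not Wendt’s (2013) data.

```python
from collections import Counter

def frequency_distribution(scores):
    """Return (score, frequency, percentage) rows, highest score first."""
    counts = Counter(scores)
    total = len(scores)
    return [
        (score, freq, round(100 * freq / total, 2))
        for score, freq in sorted(counts.items(), reverse=True)
    ]

# Hypothetical burnout scores (NOT the real dataset)
sample_scores = [28, 31, 28, 41, 35, 31, 28, 41, 22, 31]

print("Score  Frequency  Percentage")
for score, freq, pct in frequency_distribution(sample_scores):
    print(f"{score:>5}  {freq:>9}  {pct:>9.2f}%")
```

The same arithmetic applies to the example in the text: a score that occurs 3 times would make up 2.78% of the scores if the sample contained 108 scores (3 ÷ 108 ≈ 0.0278). The sample size of 108 is inferred from the percentage, not stated in the passage.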
Of course, similar to Table 3.2, the presentation in Table 3.3 is perhaps a bit much to understand, at least easily. Therefore, we have a third type of frequency table, what’s called a grouped frequency distribution table. You can see a grouped frequency distribution in Table 3.4. What we’ve done is combine scores into categories. We present these categories of scores and the frequency with which a score falls into each category. A grouped frequency distribution does not tell us about every score in the dataset, but in research, the reality is that we need to be parsimonious in how we present data. That is, we need to simplify our presentations, and a grouped frequency distribution allows us to do this.
Grouped frequency distribution table: table that contains all scores in a dataset, clustering the scores into categories, and presents the frequency of occurrence for each category rather than for individual scores.
Before we interpret the information in Table 3.4, let’s discuss two guidelines for making a grouped frequency distribution table.
1. We should have approximately 10 categories of scores (Hinkle, Wiersma, & Jurs, 1988). Although more categories are not necessarily “wrong,” remember we want to be parsimonious in our presentation. For most people, 10 categories are enough to be informative without being overwhelming.
2. The range of possible scores for each category should be the same. To determine this range, locate the highest and lowest scores in the dataset. Count the number of possible scores from the lowest to the highest, inclusive (that is, the difference between the two scores plus 1), and then divide that count by the number of categories you want to have. For example, in Wendt’s (2013) data, the lowest score was 16 and the highest score was 55. Including the scores of 16 and 55, we have a range of 40 possible scores (55 − 16 + 1 = 40). With 10 categories, each category covers four possible scores (40 ÷ 10 = 4).
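The two guidelines can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used to build Table 3.4: the function name `grouped_frequency_distribution` and the sample scores are hypothetical, scores are assumed to be whole numbers, and the width is rounded up whenever the division is not exact.

```python
import math

def grouped_frequency_distribution(scores, n_categories=10):
    """Group whole-number scores into equal-width categories and count each one."""
    lowest, highest = min(scores), max(scores)
    # Count both endpoints: 16 through 55 is 55 - 16 + 1 = 40 possible scores.
    n_possible = highest - lowest + 1
    # Round up so that the categories always reach the highest score.
    width = math.ceil(n_possible / n_categories)
    table = []
    for i in range(n_categories):
        start = lowest + i * width
        end = start + width - 1
        freq = sum(1 for s in scores if start <= s <= end)
        table.append((start, end, freq))
    return width, table

# Hypothetical scores spanning the 16-to-55 range described in the text
sample_scores = [16, 22, 27, 28, 30, 31, 33, 35, 38, 41, 42, 47, 55]

width, table = grouped_frequency_distribution(sample_scores)
print("Scores per category:", width)
for start, end, freq in table:
    print(f"{start}-{end}: {freq}")
```

With a lowest score of 16 and a highest score of 55, the sketch produces categories of four scores each, starting at 16–19 and ending at 52–55.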
Now let’s interpret the information in Table 3.4. As you might expect, most of the scores fall in the middle of the distribution. For example, approximately 66% of the burnout scores fall in the middle four categories (that is, scores between 28 and 31, between 32 and 35, between 36 and 39, and between 40 and 43). As we move away from those four middle categories, what do you notice? Indeed, the farther from the middle we go, the fewer scores we see. As noted earlier, the possible range of scores on this measure was 15 to 75. With no scores greater than 55 in this sample, it appears that we have a sample that is not terribly burned out when it comes to school. This kind of information that a frequency distribution provides is helpful in understanding the sample that we are studying.
Table 3.1
Table 3.2
Table 3.3
Table 3.4
Frequency Distribution Graphs
Frequency distribution tables, as I am sure you will agree, are great tools to organize data. But, of course, a picture is worth 1,000 words. So in addition to frequency distribution tables, frequency distribution graphs can be helpful in understanding a dataset. In fact, in psychological research, it is more common to see a frequency distribution graph than a similar type of table. Let’s now discuss the three major types of frequency distribution graphs.
A critical consideration in selecting a frequency distribution graph is the scale of measurement for a variable. In the previous chapter, we discussed nominal, ordinal, and scale measurements. If our data are nominal, we use what’s called a bar graph. If our data are ordinal or scale, we use what are called histograms and frequency polygons.
A bar graph uses vertical bars above each category listed on the x-axis to display the frequency for a category. There is a space between the bars because each category is distinctly different from the other categories. For example, Figure 3.1 contains a bar graph for a nominal variable in Wendt’s (2013) research, students’ year in college. This is a nominal variable because each participant was either a first-year or a senior. No person could be in both categories.
Bar graph: graphical representation of the frequency of nominal data in which each category appears on the x-axis and the frequency of occurrence for a given category appears on the y-axis.
Looking closely at Figure 3.1, notice that each category is listed on the x-axis, with the name of the variable beneath the category names. Each bar is centered above its category name. There is a space between the two bars because there can be no overlap between the categories; that is, a first-year cannot be a senior, and a senior cannot be a first-year. Finally, the y-axis provides the frequency numbers.
Figure 3.1 Displaying Frequency Nominal Data With a Bar Graph
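To make the logic of a bar graph concrete, here is a rough text-only sketch in Python. The responses are hypothetical (the real counts are not given in this passage), the function name `text_bar_graph` is made up, and the bars are drawn sideways rather than vertically to keep the code short.

```python
from collections import Counter

def text_bar_graph(categories):
    """Return one line per category: its name followed by a bar of '#' marks."""
    counts = Counter(categories)
    # Nominal categories have no inherent order; here they simply appear
    # in the order in which they are first encountered.
    return [f"{name:<12}{'#' * freq} ({freq})" for name, freq in counts.items()]

# Hypothetical year-in-college responses (NOT the real counts)
years = ["first-year", "senior", "first-year", "senior", "senior"]

print("\n".join(text_bar_graph(years)))
```

Each category gets its own separate bar, and the two bars could be listed in either order without changing what the graph communicates.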
Whereas a bar graph is used for nominal data, we use a histogram for ordinal and scale data. Once again we have the values of a variable on the x-axis and the frequencies on the y-axis. A histogram for Wendt’s (2013) burnout scores appears in Figure 3.2. Notice here how there is no space between the values along the x-axis as there was for a bar graph. Because we are dealing with scale data, we have values that inherently increase as we move from the left to the right side of the axis (the same would be true of ordinal data). If you look back at the bar graph in Figure 3.1, there is no reason that the category of “senior” could not appear before the category of “first-year.”1
Figure 3.2 Displaying Frequency Scale Data With a Histogram
Histogram: similar to a bar graph, except used for ordinal and scale data that are discrete; that is, each score is different from all other scores.
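The contrast with a bar graph can be sketched the same way. In the hypothetical Python example below, every whole-number value between the lowest and highest score gets a position on the axis, in increasing order, even when its frequency is zero. The scores and the function name `histogram_counts` are assumptions for illustration only, and scores are assumed to be whole numbers.

```python
from collections import Counter

def histogram_counts(scores):
    """Return (value, frequency) for every whole number from lowest to highest score."""
    counts = Counter(scores)
    # A Counter returns 0 for values that never occurred, so no value is skipped.
    return [(value, counts[value]) for value in range(min(scores), max(scores) + 1)]

# Hypothetical burnout scores (NOT the real dataset)
sample_scores = [28, 31, 28, 30, 31, 28, 33]

for value, freq in histogram_counts(sample_scores):
    print(f"{value:>3} | {'#' * freq}")
```

Unlike the categories in a bar graph, these values cannot be rearranged: 29 has to sit between 28 and 30, which is why a histogram leaves no space between adjacent bars.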
Similar to histograms, frequency polygons are used for ordinal and scale data. Rather than bars, a polygon uses a line graph to display the frequency of scores or categories of scores. We again have scores or categories of scores on