Applied Univariate, Bivariate, and Multivariate Statistics. Daniel J. Denis
As we can see, alpha did increase to 0.824, as the previous output indicated it would. Hence, according to coefficient alpha, dropping item 5 may be worthwhile in the hope of improving the instrument and making its items a bit more interrelated.
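To make the idea of "alpha if item deleted" concrete, the following R sketch computes coefficient alpha from the usual formula, alpha = (k / (k - 1)) × (1 - sum of item variances / variance of the total score), and then recomputes it with each item dropped in turn. The function names `cronbach_alpha` and `alpha_if_deleted` and the simulated item scores are assumptions made for illustration only; they are not the data or the software output discussed above.

```r
# Coefficient alpha from a respondents-by-items matrix (hypothetical helper).
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                     # number of items
  item_vars <- apply(items, 2, var)    # variance of each item
  total_var <- var(rowSums(items))     # variance of the total score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Alpha recomputed after dropping each item in turn (hypothetical helper).
alpha_if_deleted <- function(items) {
  items <- as.matrix(items)
  sapply(seq_len(ncol(items)),
         function(j) cronbach_alpha(items[, -j, drop = FALSE]))
}

# Simulated data (an assumption for illustration): items 1 to 4 share a
# common factor, while item 5 is unrelated noise.
set.seed(1)
n <- 200
common <- rnorm(n)
items <- cbind(sapply(1:4, function(j) common + rnorm(n, sd = 0.8)),
               rnorm(n))
colnames(items) <- paste0("item", 1:5)

alpha_all <- cronbach_alpha(items)
alpha_drop <- alpha_if_deleted(items)

print(round(alpha_all, 3))    # alpha with all five items
print(round(alpha_drop, 3))   # alpha with item 1, 2, ..., 5 removed
```

In this simulated setting, removing the unrelated fifth item raises alpha, which mirrors the pattern seen in the output above. The cautions that follow apply equally to any such computation.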
Though we have provided an easy demonstration of Cronbach's alpha, it would be negligent not to issue a few cautions and caveats regarding its everyday use. According to Green and Yang (2009), the routine use of coefficient alpha for assessing reliability should be discouraged because the assumptions of the statistic are rarely met, and hence the statistic can exhibit a high degree of bias. What is more, according to a now classic paper by Schmitt (1996), alpha should not be used to conclude anything about the unidimensionality of a test, and thus should not be interpreted as such. Confirmatory factor analysis models (Chapter 15) are typically better suited for assessing and establishing the dimensionality of a set of items. In addition, cut‐offs for what counts as low versus high internal consistency can be very difficult to define, and as argued by Schmitt, low levels of alpha may still be useful. Hence, though alpha is easily computed in SPSS and other software, the reader should be cautious about its unrestricted use. For more details on how it should be used, in addition to the aforementioned sources, Cortina (1993) and Miller (1995) are very informative and should be read before you adopt alpha as a regular part of your statistical toolkit.
2.18 COVARIANCE AND CORRELATION MATRICES
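Having reviewed the concept of covariance, we need a way to account for the covariances among many variables. For this, we write the sample covariances for p variables in matrix form:

$$
\mathbf{S} = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{bmatrix}
$$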
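where $s_{jk}$ is the covariance between variables $j$ and $k$. The population covariance matrix $\boldsymbol{\Sigma}$ can be analogously defined:

$$
\boldsymbol{\Sigma} = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p} \\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp}
\end{bmatrix}
$$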
where along the main diagonal of the covariance matrix are the variances $\sigma_{11}$, $\sigma_{22}$, etc., for variables 1, 2, etc., up to $\sigma_{pp}$, the variance of the $p$th variable.
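As a brief illustration, the sample covariance matrix can be built by hand from mean-centered data and compared with R's built-in `cov()`. The small data set below is hypothetical, chosen only to keep the arithmetic visible; the sketch also checks that the main diagonal of the matrix holds the variances.

```r
# Hypothetical data: n = 6 observations on p = 3 variables.
X <- cbind(x1 = c(2, 4, 4, 5, 7, 8),
           x2 = c(1, 3, 2, 5, 6, 9),
           x3 = c(9, 7, 8, 4, 3, 1))

n <- nrow(X)
Xc <- scale(X, center = TRUE, scale = FALSE)  # subtract each column mean
S_manual <- crossprod(Xc) / (n - 1)           # t(Xc) %*% Xc / (n - 1)

S <- cov(X)                                   # built-in sample covariance matrix
print(S)

# The hand computation agrees with cov(), and diag(S) holds the variances.
print(all.equal(S_manual, S, check.attributes = FALSE))
print(all.equal(unname(diag(S)), unname(apply(X, 2, var))))
```

Because the covariance of variable j with variable k equals that of k with j, the matrix is symmetric, so only the diagonal and one triangle carry distinct information.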
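When we standardize the covariance matrix, dividing each element $s_{jk}$ by the product of the corresponding standard deviations $s_j s_k$, we obtain the correlation matrix:

$$
\mathbf{R} = \begin{bmatrix}
1 & r_{12} & \cdots & r_{1p} \\
r_{21} & 1 & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
$$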
where $r_{12}$ is the correlation between variables 1 and 2, etc., and $r_{1p}$ is the correlation between variable 1 and the $p$th variable.
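The standardization just described can be sketched in R as follows. The data are hypothetical; the sketch divides each covariance by the product of the two standard deviations and confirms that the result matches both `cov2cor()` and `cor()`.

```r
# Hypothetical data: n = 6 observations on p = 3 variables.
X <- cbind(x1 = c(2, 4, 4, 5, 7, 8),
           x2 = c(1, 3, 2, 5, 6, 9),
           x3 = c(9, 7, 8, 4, 3, 1))

S <- cov(X)                  # sample covariance matrix
s <- sqrt(diag(S))           # standard deviations of the variables

R_manual <- S / outer(s, s)  # r_jk = s_jk / (s_j * s_k), element by element
R_builtin <- cov2cor(S)      # same standardization, built into R
R_direct <- cor(X)           # correlation matrix computed from the raw data

print(round(R_manual, 3))
print(all.equal(R_manual, R_builtin))
print(all.equal(R_manual, R_direct))
```

Each diagonal element becomes a variance divided by itself, which is why the main diagonal of a correlation matrix consists of ones.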
An example of a correlation matrix (Heston, 1948) is that between different tests on the GRE (Graduate Record Examination):
Intercorrelations Among The G.R.E. Tests Of General Education
Test | Math | P.S. | B.S. | Soc. | Lit. | Arts | Exp. | Voc. |
---|---|---|---|---|---|---|---|---|
Mathematics | | .55 | .44 | .51 | .36 | .35 | .52 | .38 |
Physical Science | .55 | | .49 | .43 | .20 | .40 | .32 | .29 |
Biological Science | .44 | .49 | | .57 | .42 | .42 | .46 | .50 |
Social Studies | .51 | .43 | .57 | | .54 | .40 | .61 | .59 |
Literature | .36 | .20 | .42 | .54 | | .39 | .53 | .54 |
Arts | .35 | .40 | .42 | .40 | .39 | | .42 | .52 |
Effective Expression | .52 | .32 | .46 | .61 | .53 | .42 | | .66 |
Vocabulary | .38 | .29 | .50 | .59 | .54 | .52 | .66 | |
From the matrix, we can see that most correlations are low to moderate, with the correlation between Effective Expression and Vocabulary relatively large at a value of 0.66. The correlation between Physical Science and Vocabulary is relatively small, equaling 0.29.
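For readers who wish to explore the matrix themselves, the following R sketch enters the Heston (1948) correlations and extracts the largest and smallest off‐diagonal values. The diagonal entries of 1 are our assumption, since the published table leaves the diagonal blank, and the helper `pair_label` is a hypothetical convenience function rather than part of R.

```r
tests <- c("Math", "P.S.", "B.S.", "Soc.", "Lit.", "Arts", "Exp.", "Voc.")

# Correlations as tabled; diagonal entries of 1 are assumed, not tabled.
R <- matrix(c(
  1.00, 0.55, 0.44, 0.51, 0.36, 0.35, 0.52, 0.38,
  0.55, 1.00, 0.49, 0.43, 0.20, 0.40, 0.32, 0.29,
  0.44, 0.49, 1.00, 0.57, 0.42, 0.42, 0.46, 0.50,
  0.51, 0.43, 0.57, 1.00, 0.54, 0.40, 0.61, 0.59,
  0.36, 0.20, 0.42, 0.54, 1.00, 0.39, 0.53, 0.54,
  0.35, 0.40, 0.42, 0.40, 0.39, 1.00, 0.42, 0.52,
  0.52, 0.32, 0.46, 0.61, 0.53, 0.42, 1.00, 0.66,
  0.38, 0.29, 0.50, 0.59, 0.54, 0.52, 0.66, 1.00
), nrow = 8, byrow = TRUE, dimnames = list(tests, tests))

stopifnot(isSymmetric(R))     # a correlation matrix must be symmetric

upper <- upper.tri(R)         # TRUE above the main diagonal
off_diag <- R[upper]          # the 28 distinct correlations

# Name the pair(s) of tests having a given correlation (hypothetical helper).
pair_label <- function(value) {
  idx <- which(R == value & upper, arr.ind = TRUE)
  paste(rownames(R)[idx[, "row"]], colnames(R)[idx[, "col"]], sep = " / ")
}

largest <- max(off_diag)
smallest <- min(off_diag)
cat("Largest: ", largest, "for", pair_label(largest), "\n")
cat("Smallest:", smallest, "for", pair_label(smallest), "\n")
```

Working from the upper triangle avoids counting each pair of tests twice and excludes the diagonal.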
2.19 OTHER CORRELATION COEFFICIENTS
It often happens that once we hear of Pearson's r, it becomes the only correlation coefficient in our vocabulary, and too often the very concept of a correlation, rather than one way of calculating it, is automatically linked to Pearson's r. Pearson's r is but one of many correlation coefficients available in applied research. Recall that Pearson's r captures linear relationships between (typically) continuous variables. If the relationship is not linear, if one or more variables are not continuous, or if the data are in the form of ranks, then other correlation coefficients are generally more suitable. We briefly review Spearman's rho, although a host of other correlation coefficients exist that are well‐suited for a variety of particular types of data.8
Spearman's $r_s$ (“rho”), named after Charles Spearman who developed the coefficient in 1904,9 is a correlation coefficient suitable for data on two variables that are expressed in terms of ranks rather than actual measurements on a continuous scale. Mathematically, the Spearman correlation coefficient is equivalent to a Pearson r computed on the ranked data. There are nonetheless important differences between these two coefficients. For $n$ pairs of observations with no tied ranks, Spearman's $r_s$ can be defined as:

$$
r_s = 1 - \frac{6 \sum_{i=1}^{n} \left( R_{x_i} - R_{y_i} \right)^2}{n \left( n^2 - 1 \right)}
$$
where Rx and Ry are the ranks on xi and yi for the ith individual in the data,
    > cor.test(parent, child, method = "spearman")

        Spearman's rank correlation rho

    data:  parent and child
    S = 76569964, p-value < 2.2e-16
    alternative hypothesis: true rho is not equal to 0
    sample estimates:
          rho
    0.4251345
We see that the rs of 0.425 is slightly lower than the Pearson r of 0.459.
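The claim that Spearman's coefficient is a Pearson r computed on ranks can be checked directly. The R sketch below uses a small hypothetical data set with no tied values (not the parent and child heights analyzed above) and compares three routes to rs: the built‐in Spearman option, Pearson r applied to the ranks, and the familiar squared rank‐difference formula, which holds only when there are no ties.

```r
# Hypothetical paired measurements with no tied values.
x <- c(61.2, 64.5, 66.1, 67.3, 68.0, 69.4, 70.2, 72.8)
y <- c(63.0, 62.1, 66.9, 65.5, 69.8, 68.2, 71.5, 70.9)

rx <- rank(x)                 # ranks of x
ry <- rank(y)                 # ranks of y
n <- length(x)

rs_builtin <- cor(x, y, method = "spearman")
rs_ranks <- cor(rx, ry)       # Pearson r applied to the ranks

d <- rx - ry                  # rank differences
rs_formula <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))  # valid without ties

r_pearson <- cor(x, y)        # Pearson r on the raw scores, for comparison

print(c(builtin = rs_builtin, on_ranks = rs_ranks, formula = rs_formula))
print(r_pearson)
```

The three values of rs coincide, whereas Pearson r on the raw scores generally differs because it uses the distances between scores and not only their order.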
To understand why Spearman's rank correlation and the Pearson coefficient differ, consider data (Table 2.5) on the rankings of favorite movies for two individuals. In parentheses are subjective scores of “favorability” of these movies, scaled 1–10, where 1 = least favorable and 10 = most favorable.
From the table, we can see that Bill very much favors Star Wars (rating of 10), while his least favorite movie is Batman (rating of 2.1). Mary's favorite movie is Scarface (rating of 9.7), while her least favorite movie is Batman (rating of 7.6). We will refer to these subjective scores in a moment. For now, we focus only on the ranks. For instance, Bill ranks Scarface third, while Mary ranks Star Wars third.
Table 2.5 Favorability of Movies for Two Individuals in Terms of Ranks
Movie | Bill | Mary |
---|---|---|
Batman | 5 (2.1) | 5 (7.6) |
Star Wars | 1 (10.0) | 3 (9.0) |
Scarface | 3 (8.4) | 1 (9.7) |
Back to the Future |