$D_{fi}$ = Value of discriminant function $f$ for individual $i$
$w_{fp}$ = Discriminant weight relating function $f$ and variable $p$
$x_{pi}$ = Value of variable $p$ for individual $i$.
For each of these discriminant functions ($D_f$), there is a set of weights that are akin to regression coefficients, as well as correlations between the observed variables and the functions. Interpretation of the DA results usually involves an examination of these correlations. An observed variable having a large correlation with a discriminant function is said to be associated with that function, in much the same way that indicator variables with large loadings are said to be associated with a particular factor. Quite frequently, DA is used as a follow-up procedure to a statistically significant multivariate analysis of variance (MANOVA). When a discriminant function exhibits statistically significant mean differences among the groups, the variables most strongly associated with that function can be concluded to contribute to the group mean difference that it captures. In this way, the functions can be characterized just as factors are, by considering the variables that are most strongly associated with them.
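Although computer code is generally reserved for the book website, a minimal sketch may help fix ideas here. The following R code uses the lda() function from the MASS package with R's built-in iris data; both choices are illustrative assumptions rather than examples drawn from the text. It estimates the discriminant functions and then computes the variable-function correlations discussed above:

```r
# A minimal sketch of discriminant analysis in R. The iris data
# (four continuous variables, one grouping variable) stand in for
# any grouped multivariate data set.
library(MASS)

fit <- lda(Species ~ ., data = iris)  # estimate the discriminant functions
fit$scaling                           # discriminant weights, the w's in Equation 1.1
scores <- predict(fit)$x              # Df values for each individual

# Correlations between the observed variables and the functions,
# the quantities typically examined when interpreting DA results:
cor(iris[, 1:4], scores)
```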
Canonical correlation (CC) works in much the same fashion as DA, except that rather than having a set of continuous observed variables and a categorical grouping variable, CC is used when there are two sets of continuous variables whose relationship we want to characterize. As an example, consider a researcher who has collected intelligence test data that yield five subtest scores. In addition, she has measured executive functioning for each subject in the sample, using an instrument that yields four subtests. The research question to be addressed in this study is, how strongly related are the measures of intelligence and executive functioning? Certainly, individual correlation coefficients could be used to examine how pairs of these variables are related to one another. However, the research question in this case is really about the extent and nature of the relationship between the two sets of variables. CC is designed to answer just this question by combining each set into what are known as canonical variates. As with DA, these canonical variates are orthogonal to one another, so that they extract all of the shared variance between the two sets. However, whereas DA creates the discriminant functions by finding the linear combinations of the observed indicators that maximize group mean differences, CC finds the linear combinations for each variable set that maximize the correlation between the resulting canonical variates. Just as with DA, each observed variable is assigned a weight that is used in creating the canonical variates. The canonical variate is expressed as in Equation 1.2.
$$C_{vi} = w_{c1} x_{1i} + w_{c2} x_{2i} + \cdots + w_{cp} x_{pi} \quad \text{(Equation 1.2)}$$
where
$C_{vi}$ = Value of canonical variate $v$ for individual $i$
$w_{cp}$ = Canonical weight relating variate $v$ and variable $p$
$x_{pi}$ = Value of variable $p$ for individual $i$.
Note how similar Equation 1.1 is to Equation 1.2. In both cases, the observed variables are combined to create one or more linear combination scores. The difference between the two approaches lies in the criteria used to obtain the weights. As noted above, for DA the criterion involves maximizing group separation on the means of $D_f$, whereas for CC the criterion is the maximization of the correlation between $C_v$ for the two sets of variables.
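As a brief illustration of this criterion, the following sketch uses R's built-in cancor() function on simulated data; the data, the variable counts (five and four, echoing the intelligence and executive functioning example above), and the induced relationship between the sets are assumptions made purely for demonstration:

```r
# A minimal sketch of canonical correlation in R using stats::cancor().
# The simulated data stand in for two sets of continuous measures,
# e.g., five intelligence subtests and four executive functioning subtests.
set.seed(123)
X <- matrix(rnorm(100 * 5), 100, 5)              # first variable set
Y <- X[, 1:4] + matrix(rnorm(100 * 4), 100, 4)   # second, related set

cc <- cancor(X, Y)
cc$cor   # correlations between successive pairs of canonical variates

# The canonical variates themselves (the Cv of Equation 1.2), formed by
# applying the canonical weights to the centered observed variables:
U <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef
V <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef
cor(U[, 1], V[, 1])   # reproduces the first canonical correlation
```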
The final statistical model that we will contrast with EFA is partial least squares (PLS), which is similar to CC in that it seeks linear combinations of two sets of variables such that the relationship between the sets is maximized. This goal stands in contrast to EFA, in which the criterion for determining factor loadings is the optimization of accuracy in reproducing the observed variable covariance/correlation matrix. PLS differs from CC in that the criterion it uses to obtain weights involves both maximizing the relationship between the two sets of variables and maximizing the variance explained for the variables within each set; CC does not involve this latter goal. Note that PCA, which we discuss in Chapter 3, also involves the maximization of variance explained within a set of observed variables. Thus, PLS combines, in a sense, the criteria of both CC and PCA (maximizing relationships among variable sets and maximizing explained variance within variable sets) in order to obtain linear combinations of each set of variables.
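To see this dual criterion in practice, here is a minimal sketch using the plsr() function from the R package pls; the package choice and the simulated data are assumptions, as the text does not tie PLS to any particular software:

```r
# A minimal sketch of partial least squares with the pls package.
library(pls)

set.seed(456)
X <- matrix(rnorm(100 * 5), 100, 5)              # first variable set
Y <- X[, 1:4] + matrix(rnorm(100 * 4), 100, 4)   # second, related set
dat <- data.frame(X = I(X), Y = I(Y))            # keep each set as a matrix

fit <- plsr(Y ~ X, ncomp = 2, data = dat)
summary(fit)   # % variance explained in both X and Y per component,
               # reflecting PLS's dual criterion described above
```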
A Brief Word About Software
There are a large number of computer software packages that can be used to conduct exploratory factor analysis. Many of these are general statistical software packages, such as SPSS, SAS, and R. Others are specifically designed for latent variable modeling, including Mplus and EQS. For many exploratory factor analysis problems, these various software packages are all equally useful. Therefore, you should select the one with which you are most comfortable, and to which you have access. On the other hand, when faced with a nonstandard factor analysis problem, such as having multilevel data, the use of specialized software designed for these cases might be necessary. In order to make this text as useful as possible, on the book website at study.sagepub.com/researchmethods/qass/finch-exploratory-factor-analysis, I have included example computer code and the annotated output for all of the examples included in the text, as well as additional examples designed to demonstrate the various analyses described here. I have attempted to avoid including computer code and output in the book itself so that we can keep our focus on the theoretical and applied aspects of exploratory factor analysis, without getting too bogged down in computer programming. However, this computer-related information does appear on the book website, and I hope that it will prove helpful to you.
Outline of the Book
The focus of this book is on the various aspects of conducting and interpreting exploratory factor analysis. It is designed to serve as an accessible introduction to this topic for readers who are wholly unfamiliar with factor analysis and as a reference to those who are familiar with it and who need a primer on some aspect of the method. In Chapter 2, we will lay out the mathematical foundations of factor analysis. This discussion will start with the correlation and covariance matrices for the observed variables, which serve as the basis upon which the parameters associated with the factor analysis model are estimated. We will then turn our attention to the common factor model, which expresses mathematically what we see in Figure 1.1. We will conclude Chapter 2 with a discussion of some important statistics that will be used throughout the book to characterize the quality of a particular factor solution, including eigenvalues, communalities, and error variances.
Chapter 3 presents the first major step in conducting a factor analysis: extraction of the factors themselves. Factor extraction involves the initial estimation of the latent variables that underlie a set of observed indicators. We will see that there is a wide range of methods for extracting the initial factor structure, all with the goal of characterizing the latent variables in terms of the observed ones. The relationships between the observed and latent variables are expressed in the form of factor loadings, which can be interpreted as correlations between the observed and latent variables. The chapter describes various approaches for estimating these loadings, with a focus on how they differ from one another. Finally, we conclude Chapter 3 with an example. Chapter 4 picks up with the initially extracted factor loadings, which are rarely directly interpretable. In order to render them more useful in practice, we must transform them using a process known as rotation. We will see that there are two general types of rotation: one that allows the factors to be correlated (oblique) and one that restricts the correlations among the factors to be 0 (orthogonal). We will then describe how several of the more popular of these rotations work, after which we present a full example,