Factor analysis is used to uncover the latent structure (dimensions) of a set of variables. It reduces attribute space from a larger number of variables to a smaller number of factors and as such is a "non-dependent" procedure (that is, it does not assume a dependent variable is specified). Factor analysis can be used for any of the following purposes:
To reduce a large number of variables to a smaller number of factors for modeling purposes, where the large number of variables precludes modeling all the measures individually. As such, factor analysis is integrated in structural equation modeling (SEM), helping confirm the latent variables modeled by SEM. However, factor analysis can be and is often used on a stand-alone basis for similar purposes.
To establish that multiple tests measure the same factor, thereby giving justification for administering fewer tests. Factor analysis originated a century ago with Charles Spearman's attempts to show that a wide variety of mental tests could be explained by a single underlying intelligence factor (a notion now rejected, by the way).
To validate a scale or index by demonstrating that its constituent items load on the same factor, and to drop proposed scale items which cross-load on more than one factor.
To select a subset of variables from a larger set, based on which original variables have the highest correlations with the principal component factors.
To create a set of factors to be treated as uncorrelated variables, as one approach to handling multicollinearity in such procedures as multiple regression.
To identify clusters of cases and/or outliers.
To determine network groups by determining which sets of people cluster together (using Q-mode factor analysis, discussed below).
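As a minimal sketch of the multicollinearity purpose listed above, the following Python (NumPy) example replaces correlated predictors with principal component scores and checks that the scores are uncorrelated. The simulated data, the variable names x1 to x3, and the helper component_scores are hypothetical and exist only for illustration; this is one way to do it under those assumptions, not a definitive implementation.

```python
import numpy as np

def component_scores(X):
    """Return principal component scores for X (rows = cases, columns = variables).

    Hypothetical helper for illustration only.
    """
    # Standardize each variable to mean 0 and unit variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Eigen-decompose the correlation matrix of the variables.
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)
    # eigh returns eigenvalues in ascending order; put the largest first.
    order = np.argsort(eigvals)[::-1]
    # Project the standardized data onto the eigenvectors.
    return Z @ eigvecs[:, order]

# Simulated predictors (assumed data): x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.3 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

scores = component_scores(X)

# Original predictors: large off-diagonal correlation between x1 and x2.
print(np.round(np.corrcoef(X, rowvar=False), 2))
# Component scores: off-diagonal correlations are zero up to rounding error.
print(np.round(np.corrcoef(scores, rowvar=False), 2))
```

The scores could then stand in for the original predictors in a regression, at the cost of coefficients that refer to components rather than to the original variables.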
A non-technical analogy: A mother sees various bumps and shapes under a blanket at the bottom of a bed. When one shape moves toward the top of the bed, all the other bumps and shapes move toward the top also, so the mother concludes that what is under the blanket is a single thing, most likely her child. Similarly, factor analysis takes as input a number of measures and tests, analogous to the bumps and shapes. Those that move together are considered a single thing, which it labels a factor. That is, in factor analysis the researcher is assuming that there is a "child" out there in the form of an underlying factor, and he or she takes simultaneous movement (correlation) as evidence of its existence. If correlation is spurious for some reason, this inference will be mistaken, of course, so it is important when conducting factor analysis that possible variables which might introduce spuriousness, such as anteceding causes, be included in the analysis and taken into account.
Factor analysis is part of the general linear model (GLM) family of procedures and makes many of the same assumptions as multiple regression: linear relationships, interval or near-interval data, untruncated variables, proper specification (relevant variables included, extraneous ones excluded), lack of high multicollinearity, and multivariate normality for purposes of significance testing. Factor analysis generates a table in which the rows are the observed raw indicator variables and the columns are the factors or latent variables which explain as much of the variance in these variables as possible. The cells in this table are factor loadings, and the meaning of the factors must be induced from seeing which variables are most heavily loaded on which factors. This inferential labeling process can be fraught with subjectivity as diverse researchers impute different labels.
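To make the loadings table concrete, here is a small Python (NumPy) sketch that simulates six indicators driven by two latent factors and prints unrotated principal component loadings, with variables as rows and factors as columns. The data, the names v1 to v6, and the choice to extract two components are assumptions made for the example; statistical packages typically also rotate the solution, which this sketch does not attempt.

```python
import numpy as np

# Simulated data (assumed): four indicators follow latent factor f1,
# two indicators follow latent factor f2, each with added noise.
rng = np.random.default_rng(42)
n = 1000
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
latent = np.column_stack([f1, f1, f1, f1, f2, f2])
X = latent + 0.5 * rng.normal(size=(n, 6))

# Correlation matrix of the observed indicators.
R = np.corrcoef(X, rowvar=False)

# Eigen-decomposition, sorted so the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings = eigenvector scaled by the square root of its eigenvalue.
# Each cell is the correlation between a variable and a component.
k = 2  # number of components kept (an assumption for this example)
loadings = eigvecs[:, :k] * np.sqrt(eigvals[:k])

names = ["v1", "v2", "v3", "v4", "v5", "v6"]
print("variable  factor1  factor2")
for name, row in zip(names, loadings):
    print(f"{name:8s}  {row[0]:7.2f}  {row[1]:7.2f}")
```

In the printed table, v1 to v4 should load heavily on one column and v5 and v6 on the other. The sign of each column is arbitrary, and deciding what to call each factor remains the researcher's judgment.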
There are several different types of factor analysis, with the most common being principal components analysis (PCA), which is preferred for purposes of data reduction. However, common factor analysis is preferred for purposes of causal analysis and for confirmatory factor analysis in structural equation modeling, among other settings.
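One way to see the difference between the two approaches is in what goes on the diagonal of the correlation matrix. PCA analyzes the matrix with 1s on the diagonal (all variance), while common factor analysis replaces the diagonal with communality estimates so that only shared variance is analyzed. The Python (NumPy) sketch below illustrates this with a single, non-iterated principal axis step that uses squared multiple correlations as the communality estimates. The helper first_loadings and the example correlation matrix are hypothetical, and real software usually iterates the communality estimates, so treat this as an illustration rather than a reference implementation.

```python
import numpy as np

def first_loadings(R, common=False):
    """Loadings on the first component/factor of correlation matrix R.

    Hypothetical helper: common=False gives PCA loadings; common=True gives
    one non-iterated principal axis step with squared multiple correlations
    (SMCs) on the diagonal.
    """
    R = np.array(R, dtype=float)  # work on a copy
    if common:
        # SMC of each variable with all the others: 1 - 1 / diag(R^-1).
        smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
        np.fill_diagonal(R, smc)
    eigvals, eigvecs = np.linalg.eigh(R)
    # The last column of eigh's output belongs to the largest eigenvalue.
    v = eigvecs[:, -1] * np.sqrt(eigvals[-1])
    # Eigenvector sign is arbitrary; orient the loadings to be positive.
    return v if v.sum() >= 0 else -v

# Assumed example: three variables that all correlate 0.6 with each other.
R = np.array([[1.0, 0.6, 0.6],
              [0.6, 1.0, 0.6],
              [0.6, 0.6, 1.0]])

pca = first_loadings(R, common=False)
paf = first_loadings(R, common=True)
print("PCA loadings:          ", np.round(pca, 3))
print("Common factor loadings:", np.round(paf, 3))
```

The PCA loadings come out larger because each component also absorbs variance unique to the individual variables, whereas the common factor loadings reflect only the variance the variables share.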
Monday, 09 February 2009