Note on Factor Analysis

Published online by Cambridge University Press:  01 January 2025

Edwin B. Wilson
Affiliation:
Harvard School of Public Health, Boston, Mass
Jane Worcester
Affiliation:
Harvard School of Public Health, Boston, Mass

Abstract

Certain assumptions and procedures basic to factor analysis are examined from the point of view of the mathematician. It is demonstrated that the Hotelling method does not yield meaningful traits, and an example from the theory of gas mixtures with convertible components is cited as evidence. The justification of current methods for determining the adequacy of the reproduction of a correlation matrix by a factorial matrix is questioned, and a χ² criterion, practical only for a small matrix, is proposed. By means of a hypothetical example from geometry, it is shown that results of a Hotelling analysis are necessarily relative to the population at hand. The factorial effects of the adjunction of a “total test” to a group of tests are considered. Some of the general considerations and questions raised are pertinent to types of analysis other than the Hotelling.

Type
Original Paper
Copyright
Copyright © 1939 The Psychometric Society

Footnotes

*

The calculations which lie in the background of this Note, but of which only relatively few are in evidence, were made possible by a grant from the Carnegie Corporation of New York to whom grateful acknowledgment is hereby made.

References

* It is to be noted that the measurements x are ordinarily not relative to the means of any group but definite positive numbers like pressure or temperature, and that to introduce the device of referring them to the mean of a group itself leads us away from the concept of a trait as a characteristic of an individual. Of course, if the traits in which we are interested are those of the group, this matter may be quite different.

The assumption of noncorrelation should be noted. It is a mathematical assumption imposed in advance of the definition of what factors or traits really are and, of course, tremendously constrains the definition. Independence of variables as used in the physical sciences is not related to noncorrelation of those variables in groups of physical objects. How powerfully the assumption of noncorrelation impinges upon our notion of what traits are or upon our choice of groups which may be submitted to factor analysis may be conjectured from considering the statement: No two ratable characteristics of individuals which should be found to be correlated in any group which might properly be subjected to a factor analysis could both be considered as traits.

In this statement we see again the encroachment of the ideological element “group” upon the ideological element “individual.” It is doubtful whether there is utility, and there may well be disutility, in introducing standard measures of such a thing as pressure or temperature or, in the psychological field, of I.Q. or any other individual trait. Ordinarily in scientific measurements “zeros” of scales and “units” of scales are not defined relative to groups.

* We shall confine our illustrations almost exclusively to the Hotelling type of analysis which apparently has been adopted by Kelley (Essential Traits of Mental Life), neglecting for the most part both the early Spearman (The Abilities of Man) and Kelley (Crossroads in the Mind of Man) analyses and the later Thurstone configurational analysis (The Vectors of Mind). However, it may be that some of the general considerations and points of view of this Note are applicable with slight modifications to other types of analyses than the Hotelling — for example, §§ 4-5 should apply to any analysis which aims to reproduce a set of correlation coefficients.

An arithmetical word about the factor loadings. By the method of obtaining these coefficients the sum of the squares of the loadings for each factor is 1, it being understood that the factors γ obtained by using these loadings are not unit vectors but have the variance γ. When the analysis is complete, i.e., for as many factors as there are tests, the sum of the squares of the loadings for each test on all the factors is also 1. Moreover, the sum of the products of the corresponding loadings for any two factors is zero, and the sum of the products, if the analysis is complete, of the corresponding loadings of any pair of tests for all factors is also zero. This may be stated briefly by saying that the complete set of factor loadings is an orthogonal set of coefficients, i.e., such a set as arises in rotating one set of orthogonal axes into another. However, if we use these coefficients to express the tests in terms of the factors, the tests being non-orthogonal unit vectors, the factors must be considered to include their standard deviations, being thus orthogonal non-unit vectors of specific magnitude; and if we use the coefficients to express the factors in terms of the tests, the factors again are non-unit orthogonal vectors. The use of the orthogonal coefficients is very convenient because the tests and factors solve immediately either for the other. There seems to be a confusion in this matter on p. 60 of Kelley’s Essential Traits of Mental Life.
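The orthogonality properties described in this footnote can be checked numerically. The following is a minimal sketch, not the authors' computation: the correlation matrix `R` is hypothetical, and power iteration with deflation is used only as a convenient stand-in for whatever procedure extracts the loadings. The names `hotelling_components`, `roots` and `loadings` are assumptions of the sketch.

```python
import math

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    """Product of a square matrix (list of rows) with a vector."""
    return [dot(row, v) for row in m]

def hotelling_components(r, iters=500):
    """Latent roots and unit-length loading vectors of a correlation matrix.

    Uses power iteration with deflation, chosen here only for simplicity.
    loadings[k][i] is the loading of test i on factor k.
    """
    n = len(r)
    work = [row[:] for row in r]  # matrix left after removing found factors
    roots, loadings = [], []
    for _factor in range(n):
        v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary starting vector
        for _step in range(iters):
            for u in loadings:  # keep v orthogonal to earlier factors
                c = dot(v, u)
                v = [a - c * b for a, b in zip(v, u)]
            w = matvec(work, v)
            norm = math.sqrt(dot(w, w))
            v = [a / norm for a in w]
        lam = dot(v, matvec(work, v))  # Rayleigh quotient gives the root
        roots.append(lam)
        loadings.append(v)
        work = [[work[i][j] - lam * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return roots, loadings

# Hypothetical correlations among three tests (not data from the Note).
R = [[1.0, 0.6, 0.5],
     [0.6, 1.0, 0.4],
     [0.5, 0.4, 1.0]]
roots, loadings = hotelling_components(R)

# Sum of squared loadings for each factor, and for each test over all factors.
per_factor = [dot(v, v) for v in loadings]
per_test = [sum(v[i] ** 2 for v in loadings) for i in range(3)]
# Sum of products of corresponding loadings for two factors, and for two tests.
factor_cross = dot(loadings[0], loadings[1])
test_cross = sum(v[0] * v[1] for v in loadings)
# Weighting each factor by its root (its variance) reproduces r_12.
r12_reproduced = sum(lam * v[0] * v[1] for lam, v in zip(roots, loadings))
print(per_factor, per_test, factor_cross, test_cross, r12_reproduced)
```

In this sketch the unit sums and zero cross-products hold for the bare loadings, while the correlations themselves are recovered only when each factor carries its own variance, which is the distinction the footnote draws between unit and non-unit orthogonal vectors.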

* Thurstone (The Vectors of Mind, p. 131) maintains that the Hotelling unit vectors and orthogonal transforms of them are without psychological significance unless it be that some particular transform is shown to have such significance.

Wilson, E. B., Empiricism and rationalism, Science, 1926, 64, 47-57. The detailed study of the simple case of three intercorrelated variables throws enough light on some of these questions connected with the Hotelling factor analysis (as it does on the Spearman analysis for g) that explicit formulas by which the three values of λ may be directly computed are worth while. If θ is that angle less than 180° for which , then . The formulas are readily evaluated by logarithms.
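The explicit formulas themselves did not survive in this copy of the footnote. As a hedged illustration only, the standard trigonometric solution of the cubic gives the three roots of a 3 × 3 correlation matrix in terms of a single angle θ between 0° and 180°; it may differ in form from the authors' own expressions. The function names and the sample correlations below are assumptions of this sketch.

```python
import math

def three_variable_roots(r12, r13, r23):
    """Latent roots of a 3 x 3 correlation matrix via the trigonometric
    solution of its characteristic cubic.

    With s = (r12^2 + r13^2 + r23^2) / 3 and cos(theta) = r12*r13*r23 / s^(3/2),
    the roots are 1 + 2*sqrt(s)*cos((theta + 2*pi*k) / 3) for k = 0, 1, 2.
    """
    s = (r12 ** 2 + r13 ** 2 + r23 ** 2) / 3.0
    if s == 0.0:
        return [1.0, 1.0, 1.0]  # uncorrelated tests: all roots equal 1
    c = r12 * r13 * r23 / s ** 1.5
    c = max(-1.0, min(1.0, c))  # guard against rounding just outside [-1, 1]
    theta = math.acos(c)        # angle between 0 and pi (0 to 180 degrees)
    amp = 2.0 * math.sqrt(s)
    return [1.0 + amp * math.cos((theta + 2.0 * math.pi * k) / 3.0)
            for k in range(3)]

def char_poly(lam, r12, r13, r23):
    """Determinant of (R - lam * I) for the 3 x 3 correlation matrix R."""
    d = 1.0 - lam
    return d ** 3 - d * (r12 ** 2 + r13 ** 2 + r23 ** 2) + 2.0 * r12 * r13 * r23

# Hypothetical intercorrelations (illustrative values, not from the Note).
sample = (0.6, 0.5, 0.4)
lams = three_variable_roots(*sample)
residuals = [char_poly(lam, *sample) for lam in lams]
print(lams, residuals)
```

The residuals of the characteristic polynomial serve as a check that each computed value is indeed a root, and the three roots sum to the number of tests.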

* This point seems often to be omitted in statistical treatments although it is of course implied in some of the advanced discussions of least squares and of correlation. A detailed illustration of what the point may mean for particular cases may be found in Talbot, F. B., Wilson, E. B. and Worcester, Jane, Basal metabolism of girls, Amer. J. Diseases Children, 1937, 53, pt. 2, 275-347 with especial reference to §§14-17.

* Thurstone, L. L., loc. cit., 113-116. The mean value of the 105 correlation coefficients is .316; the largest is .641 and five others are above .5; whether a simple test based on standard deviations without reference to the effects of correlation is valid may be doubted.

* Lorge, I. and Morrison, N., The reliability of principal components. Science, 1938, 87, 491-2. The authors compare two batteries of 5 tests each which we understand them to consider to be psychologically equivalent. From the names of the individual tests in the two series one is tempted to believe that the tests are considered to be equivalent individually in pairs. If this be so one may make up a large number of equivalent batteries and apply a similar analysis to each.

* As used by Thurstone, loc. cit., p. 86.

This value of t has and is in reality that team test of unit variance which correlates most highly with g , part of which is in reality unknown. This t is identical with the unit vector γ above. For g we can only write , where e is unknown. The formula for t is , where , etc.

* This assumption of noncorrelation between the variations x, y, z for the different bricks of a group is of course a gratuitous assumption which might or might not be true of any set of bricks depending on the genetics of their manufacture. A similar analysis could be given for any specified values of r_xy, r_xz, r_yz.

* If there had been correlations between x,y,z, no rotation of the γ's could give the x's because the latter would not be an orthogonal set. It may be remarked that no Spearman analysis (for one general) can be given to these two examples because, in each, one of the partial correlations, rvei is negative.

* See footnote on p. 143.

Carter, H. D., Conrad, H. S. and Jones, M. C. A multiple factor analysis of children's annoyances. Jour. Gen. Psychol., 1935, 47, 282-298.

If the test x has the n ratings x_1, x_2, ..., x_n for the n individuals of a group, the expression xx in Gibbs’ notation is the n × n matrix whose element in row i and column j is x_i x_j. A discussion of dyadics following the Gibbs system may be found for 3 dimensions in Gibbs-Wilson, Vector Analysis, and for any number of dimensions in Wilson, E. B., On the theory of double products and strains in hyperspace, Trans. Conn. Acad. Sci., 1908, 14, 1-57.
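For readers unfamiliar with dyadic notation, the dyad xx can be pictured as an outer product. The sketch below is illustrative only; the ratings are hypothetical and the helper name `dyad` is an assumption, not notation from the Note.

```python
def dyad(x, y):
    """Gibbs dyad xy written as a matrix: element (i, j) is x[i] * y[j]."""
    return [[xi * yj for yj in y] for xi in x]

# Hypothetical ratings of one test for n = 4 individuals (illustrative only).
x = [2.0, -1.0, 0.5, 3.0]
xx = dyad(x, x)

# The dyad xx is symmetric, and its trace equals the sum of squares of x.
is_symmetric = all(xx[i][j] == xx[j][i] for i in range(4) for j in range(4))
trace = sum(xx[i][i] for i in range(4))
print(is_symmetric, trace)
```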

* If one should desire to add a “total test” to the system of three tests on which one had performed a Spearman analysis (§6) one would find that the necessity for the vanishing of the tetrads limited the total tests to being precisely the team test t which had already been found and that the team test t’ for the system of four tests (three original plus t) would be identical with t but that g would be changed in that its known part would be increased and its unknown part diminished, which does not make sense inasmuch as the team test t contains no new information.
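The vanishing of the tetrads invoked here is the familiar Spearman condition for four tests sharing a single general factor. A minimal sketch, assuming hypothetical loadings and a hypothetical helper `tetrad_differences`, shows the three tetrad differences vanishing when every correlation is the product of two loadings, and failing to vanish once one correlation is altered.

```python
def tetrad_differences(r):
    """The three tetrad differences of a 4 x 4 correlation matrix r."""
    p = r[0][1] * r[2][3]
    q = r[0][2] * r[1][3]
    s = r[0][3] * r[1][2]
    return [p - q, p - s, q - s]

# Hypothetical loadings of four tests on one general factor (illustrative).
a = [0.9, 0.8, 0.7, 0.6]
one_factor = [[1.0 if i == j else a[i] * a[j] for j in range(4)]
              for i in range(4)]
vanishing = tetrad_differences(one_factor)

# Changing a single correlation breaks the single-factor pattern.
perturbed = [row[:] for row in one_factor]
perturbed[0][1] = perturbed[1][0] = 0.5
non_vanishing = tetrad_differences(perturbed)
print(vanishing, non_vanishing)
```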

G. H. Thomson has run a long series of penetrating comments on factor analysis from at least as early as 1916 to the present time. No attempt will be made to cite from the series; the student of factor analysis should be familiar with them. We may mention also Tryon, R. C., The theory of psychological components — an alternative to “mathematical factors,” Psychol. Rev., 1935, 42, 425-454.