
Factor Uniqueness of the Structural Parafac Model

Published online by Cambridge University Press:  01 January 2025

Paolo Giordani*
Affiliation:
Sapienza Università di Roma
Roberto Rocci
Affiliation:
Sapienza Università di Roma
Giuseppe Bove
Affiliation:
University of Roma Tre
*
Correspondence should be made to Paolo Giordani, Department of Statistical Sciences, Sapienza Università di Roma, P.le Aldo Moro, 5, 00185 Rome, Italy. Email: paolo.giordani@uniroma1.it

Abstract

Factor analysis is a well-known method for describing the covariance structure among a set of manifest variables through a limited number of unobserved factors. When the observed variables are collected at various occasions on the same statistical units, the data have a three-way structure and standard factor analysis may fail. To overcome these limitations, three-way models, such as the Parafac model, can be adopted. The latter is often seen as an extension of principal component analysis able to discover unique latent components. Its structural version, i.e., a reparameterization of the covariance matrix, has also been formulated but rarely investigated. In this article, such a formulation is studied by discussing under what conditions factor uniqueness is preserved. It is shown that, under mild conditions, this property holds even if the specific factors are assumed to be within-variable, or within-occasion, correlated and the model is modified to become scale invariant.

Type
Theory and Methods
Creative Commons
CC BY
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2020 The Author(s)

1. Introduction

Factor analysis (FA) (Bartholomew, Knott, & Moustaki, 2011) is a well-known method explaining the relationships among a set of manifest variables, observed on a sample of statistical units, in terms of a limited number of latent variables. In FA, data are stored in a matrix, say $\mathbf{X}$, of order ($I \times J$), where $I$ and $J$ are the number of statistical units and variables, respectively. Thus, FA deals with two-way two-mode data, where the modes are the entities of the data matrix, i.e., statistical units and manifest variables, and the ways are the indexes of the elements of $\mathbf{X}$, i.e., $i = 1, \ldots, I$ and $j = 1, \ldots, J$. In many practical situations, it may occur that the scores on the same manifest variables with respect to a sample of statistical units are replicated across $K$ different occasions, e.g., times, locations, conditions, etc. Examples can be found in several domains. In medicine, these are the daily measures of some vital characteristics on a set of patients. In chemistry, samples measured at different emission wavelengths and excitation wavelengths. In the social sciences, think about the yearly evaluation of the main cities of a given area in terms of indicators assessing the quality of life. In marketing, ratings on some goods expressed by a sample of consumers. In psychology, the scores for a group of children on a set of personality variables (traits) rated by different judges (methods). In the previous examples, there are three sets of entities (statistical units, manifest variables and occasions), hence three modes. For instance, in the latter one, these are children, traits and methods. The available information is stored in a so-called array, or tensor, usually denoted by $\underline{\mathbf{X}}$, of order ($I \times J \times K$).
Its generic element is $x_{ijk}$, $i = 1, \ldots, I$, $j = 1, \ldots, J$ and $k = 1, \ldots, K$, expressing the score of statistical unit $i$ (e.g., child) on manifest variable $j$ (e.g., trait) at occasion $k$ (e.g., method). Therefore, the elements have three indexes and the array three ways. For all of these reasons, the data are three-way three-mode. For further details, refer to, e.g., Kiers (2000) and Kroonenberg (2008).

The basic FA model is not adequate to handle three-way three-mode data, because it does not properly take into account the multiway multimode structure of the data. In principle, FA could still be applied. Specifically, it is possible to convert $\underline{\mathbf{X}}$ into a matrix by so-called matricization or unfolding. For instance, one can easily obtain the matrix $\mathbf{X}_{\mathrm{A}}$ of order ($I \times JK$) by juxtaposing next to each other the frontal slabs of $\underline{\mathbf{X}}$ (statistical unit-mode matricization), where the frontal slabs of $\underline{\mathbf{X}}$ are the standard two-way two-mode matrices $\mathbf{X}_{k}$ ($k = 1, \ldots, K$) of order ($I \times J$) collected at the different occasions. Nonetheless, the FA model fitted to $\mathbf{X}_{\mathrm{A}}$ discovers latent variables ignoring that the same manifest variables are replicated across the occasions. For this reason, the standard FA model has been extended in order to take into account and exploit the increased complexity of three-way three-mode data.
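To make the statistical unit-mode matricization concrete, the following minimal NumPy sketch juxtaposes the frontal slabs of a three-way array into $\mathbf{X}_{\mathrm{A}}$; the dimensions and the randomly generated data are toy values, not taken from the paper.

import numpy as np

I, J, K = 5, 3, 4                                   # toy dimensions (hypothetical)
X = np.random.default_rng(0).normal(size=(I, J, K)) # hypothetical three-way data array

# Juxtapose the K frontal slabs (I x J each) next to each other: X_A is I x JK
X_A = np.concatenate([X[:, :, k] for k in range(K)], axis=1)

# Equivalent one-liner: permute the ways so that occasions run slowest, then reshape
assert np.allclose(X_A, X.transpose(0, 2, 1).reshape(I, K * J))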

The most famous three-way three-mode extensions of FA are essentially based on the three-mode factor analysis model (Tucker, 1963; Tucker, 1966), usually named Tucker3, and the parallel factor analysis model (Harshman, 1970), usually named Parafac. The latter can be seen as a particular case of the former with a useful property of parameter uniqueness (Kruskal, 1977). Although such extensions were presented as suitable generalizations of FA, they can be considered as extensions of the principal component analysis (PCA) solution to the FA model. As a matter of fact, Carroll & Chang (1970) proposed a canonical decomposition equivalent to Parafac, named Candecomp, as an extension of PCA in multidimensional scaling; Kroonenberg & De Leeuw (1980) named three-mode principal component analysis a method based on the ordinary least squares (OLS) estimation of Tucker3. For these reasons, hereinafter such models will be said to follow a component-based approach. The reader interested in the relationships between FA and PCA may refer to, for instance, Unkel & Trendafilov (2010) and Adachi & Trendafilov (2019).

Some authors revised Tucker3 and Parafac as structural models for the covariance structure of the manifest variables (see, for example, Bentler, Poon, & Lee, 1988 and references therein). In other terms, the data generation process is explicitly specified. The $I$ statistical units are considered independent observations of the same random variable, and the models are reformulated as a suitable reparameterization of the parameters of its distribution. In particular, each model specifies a certain structure for the variables $\times$ occasions covariance matrix, where the covariance matrix of the specific factors, i.e., the error term, is explicitly considered (see, for example, Kroonenberg & Oort, 2003).

As far as we know, in the literature, the component-based approach has received much more attention than the structural one. In this paper, after recalling the main features of methods and models characterizing the two approaches, a structural extension of Parafac is considered. It will be shown how its fundamental factor uniqueness property is preserved even when some specific factors are correlated across occasions, or variables, and/or its structure is modified to become scale invariant. The effectiveness of the proposal is illustrated by a real-life example.

2. The Component-Based Approach

The Tucker3 model (Tucker, 1966) summarizes the three-way three-mode tensor $\underline{\mathbf{X}}$ by looking for a limited number of components for the modes. The matrix formulation of the Tucker3 model in terms of the statistical unit-mode matricization $\mathbf{X}_{\mathrm{A}}$ is

(1) $$\mathbf{X}_{\mathrm{A}} = \mathbf{A}\mathbf{G}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})' + \mathbf{E}_{\mathrm{A}},$$

where the symbol '$\otimes$' denotes the Kronecker product of matrices and $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$ have order ($I \times P$), ($J \times Q$), ($K \times R$), respectively. The Tucker3 model can be seen as a generalization of PCA where the $JK$ columns of $\mathbf{X}_{\mathrm{A}}$, i.e., the $J$ variables measured at the $K$ different occasions, are modeled through linear combinations of $P$ latent components for the statistical units (the columns of $\mathbf{A}$), with weights $\mathbf{G}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})'$. By exploiting the symmetry of the model with respect to the modes, we derive that $P$, $Q$ and $R$ are the numbers of components for the statistical units, the manifest variables and the occasions, respectively, while the elements of $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ are the scores of the entities of the various modes on the corresponding components.
Finally, $\mathbf{G}_{\mathrm{A}}$ is the statistical unit-mode matricization, of order ($P \times QR$), of the so-called core tensor $\underline{\mathbf{G}}$, of order ($P \times Q \times R$), expressing the strength of the triple interactions among the components of the three modes. As in PCA, the model parameters are estimated in the OLS sense by minimizing $\Vert \mathbf{E}_{\mathrm{A}}\Vert^{2}$, where $\Vert \cdot \Vert$ is the Frobenius norm of matrices. The solution is not obtained by the singular value decomposition (SVD, Eckart & Young, 1936) of a particular matrix, as it is for PCA. For this purpose, alternating least squares (ALS) algorithms can be applied. See, for instance, Kroonenberg & De Leeuw (1980). As a possible extension of SVD to the multiway case, we mention the higher-order singular value decomposition (De Lathauwer, De Moor, & Vandewalle, 2000).
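As an illustration only, the following NumPy sketch builds the structural part $\mathbf{A}\mathbf{G}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})'$ of (1) for toy dimensions and randomly drawn component matrices, and checks it against the element-wise form $x_{ijk} \approx \sum_{p,q,r} a_{ip} b_{jq} c_{kr} g_{pqr}$; all names and values are hypothetical.

import numpy as np

rng = np.random.default_rng(1)
I, J, K, P, Q, R = 6, 4, 3, 2, 2, 2                 # toy dimensions (hypothetical)
A = rng.normal(size=(I, P))
B = rng.normal(size=(J, Q))
C = rng.normal(size=(K, R))
G = rng.normal(size=(P, Q, R))                      # core tensor
G_A = G.transpose(0, 2, 1).reshape(P, R * Q)        # unit-mode matricization of the core

# Structural part of X_A in (1): triple interactions weighted by the core
X_A_model = A @ G_A @ np.kron(C, B).T               # shape (I, J*K)

# Check against the element-wise Tucker3 formula
assert np.allclose(X_A_model.reshape(I, K, J).transpose(0, 2, 1),
                   np.einsum('ip,jq,kr,pqr->ijk', A, B, C, G))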

The applicability of the Tucker3 model may be limited for several reasons. First of all, the interpretability of the solution, in particular of the elements of the core tensor, is usually a very complex issue. Moreover, differently from standard PCA, the solutions are not nested. Finally, the obtained solution is not unique. In fact, it is straightforward to see that we can post-multiply $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ by square non-singular transformation matrices of appropriate order, say $\mathbf{P}$, $\mathbf{Q}$ and $\mathbf{R}$, obtaining the new component matrices $\mathbf{A}_{\mathrm{T}} = \mathbf{A}\mathbf{P}$, $\mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{Q}$ and $\mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{R}$. If we compensate for such transformations in the core by setting $\mathbf{G}_{\mathrm{AT}} = \mathbf{P}^{-1}\mathbf{G}_{\mathrm{A}}(\mathbf{R}^{-1} \otimes \mathbf{Q}^{-1})'$, we get an equally well-fitting solution because

(2) $$\mathbf{A}_{\mathrm{T}}\mathbf{G}_{\mathrm{AT}}(\mathbf{C}_{\mathrm{T}} \otimes \mathbf{B}_{\mathrm{T}})' = \mathbf{A}\mathbf{P}\mathbf{P}^{-1}\mathbf{G}_{\mathrm{A}}(\mathbf{R}^{-1} \otimes \mathbf{Q}^{-1})'(\mathbf{R} \otimes \mathbf{Q})'(\mathbf{C} \otimes \mathbf{B})' = \mathbf{A}\mathbf{G}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})'.$$
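A short numerical check of the non-uniqueness in (2), again with hypothetical toy matrices: transforming $\mathbf{A}$, $\mathbf{B}$ and $\mathbf{C}$ by random non-singular matrices and compensating in the core leaves the fitted part unchanged.

import numpy as np

rng = np.random.default_rng(1)
I, J, K, P, Q, R = 6, 4, 3, 2, 2, 2
A = rng.normal(size=(I, P))
B = rng.normal(size=(J, Q))
C = rng.normal(size=(K, R))
G_A = rng.normal(size=(P, R * Q))                   # unit-mode matricized core (toy values)

# Random transformation matrices (non-singular with probability one)
P_t, Q_t, R_t = (rng.normal(size=(d, d)) for d in (P, Q, R))
A_T, B_T, C_T = A @ P_t, B @ Q_t, C @ R_t
G_AT = np.linalg.inv(P_t) @ G_A @ np.kron(np.linalg.inv(R_t), np.linalg.inv(Q_t)).T

# Same fitted part as the untransformed solution, as in (2)
assert np.allclose(A_T @ G_AT @ np.kron(C_T, B_T).T, A @ G_A @ np.kron(C, B).T)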

To overcome the possible limitations of the Tucker3 model, it may be convenient to consider the Parafac one, independently proposed by Carroll & Chang (1970) and Harshman (1970). The Parafac model can be seen as a constrained version of the Tucker3 one where the same number of components, say $S$, is sought for all the modes ($P = Q = R = S$) and the core tensor is equal to the identity tensor ($\underline{\mathbf{G}} = \underline{\mathbf{I}}$, i.e., $g_{pqr} = 1$ if $p = q = r$, and $g_{pqr} = 0$ otherwise). We have

(3) $$\mathbf{X}_{\mathrm{A}} = \mathbf{A}\mathbf{I}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})' + \mathbf{E}_{\mathrm{A}} = \mathbf{A}(\mathbf{C} \bullet \mathbf{B})' + \mathbf{E}_{\mathrm{A}},$$

where the symbol '$\bullet$' denotes the Khatri–Rao product of matrices, i.e., $\mathbf{C} \bullet \mathbf{B} = [\mathbf{c}_{1} \otimes \mathbf{b}_{1}, \ldots, \mathbf{c}_{S} \otimes \mathbf{b}_{S}]$, where $\mathbf{b}_{s}$ and $\mathbf{c}_{s}$ are the $s$th columns of $\mathbf{B}$ and $\mathbf{C}$, respectively ($s = 1, \ldots, S$). As in the Tucker3 case, the parameter estimates are found in the OLS sense by ALS algorithms.
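The Khatri–Rao product and the structural part of (3) can be illustrated with the following minimal sketch; the khatri_rao helper and all dimensions and values are hypothetical toy choices, not part of the original material.

import numpy as np

rng = np.random.default_rng(2)
I, J, K, S = 6, 4, 3, 2
A = rng.normal(size=(I, S))
B = rng.normal(size=(J, S))
C = rng.normal(size=(K, S))

def khatri_rao(C, B):
    # column-wise Kronecker product: [c_1 kron b_1, ..., c_S kron b_S]
    return np.column_stack([np.kron(C[:, s], B[:, s]) for s in range(C.shape[1])])

X_A_model = A @ khatri_rao(C, B).T                  # structural part of (3), of order I x JK

# Element-wise, x_ijk is approximated by sum_s a_is * b_js * c_ks
assert np.allclose(X_A_model.reshape(I, K, J).transpose(0, 2, 1),
                   np.einsum('is,js,ks->ijk', A, B, C))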

The most interesting feature of Parafac is that, under mild conditions, the solution is unique. This point has been deeply investigated by Kruskal (1977), who has found the following result. Let us denote by k-rank($\mathbf{Z}$) the so-called k-rank of a matrix $\mathbf{Z}$. It is defined as the largest number $k$ such that every subset of $k$ columns of $\mathbf{Z}$ is linearly independent. Moreover, let ($\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$) and ($\mathbf{A}_{\mathrm{T}}$, $\mathbf{B}_{\mathrm{T}}$, $\mathbf{C}_{\mathrm{T}}$) be two Parafac solutions. Kruskal (1977) has shown that if

(4) $$k\text{-rank}(\mathbf{A}) + k\text{-rank}(\mathbf{B}) + k\text{-rank}(\mathbf{C}) \ge 2S + 2$$

then, by considering (3),

(5) $$\mathbf{A}(\mathbf{C} \bullet \mathbf{B})' = \mathbf{A}_{\mathrm{T}}(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})'$$

implies that there exist a permutation matrix $\mathbf{P}$ and three diagonal matrices $\mathbf{D}_{\mathrm{A}}$, $\mathbf{D}_{\mathrm{B}}$ and $\mathbf{D}_{\mathrm{C}}$, for which $\mathbf{D}_{\mathrm{A}}\mathbf{D}_{\mathrm{B}}\mathbf{D}_{\mathrm{C}} = \mathbf{I}$, where $\mathbf{I}$ denotes the identity matrix, such that

(6) $$\mathbf{A}_{\mathrm{T}} = \mathbf{A}\mathbf{P}\mathbf{D}_{\mathrm{A}}, \quad \mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}\mathbf{D}_{\mathrm{B}}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}\mathbf{D}_{\mathrm{C}}.$$

In other words, if (4) holds, then the solution ($\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$) is unique up to scaling and a simultaneous column permutation. Although Kruskal's condition has been extended by some other authors (Jiang & Sidiropoulos, 2004; Stegeman, ten Berge, & De Lathauwer, 2006; Stegeman, 2009; Domanov & De Lathauwer, 2013a; Domanov & De Lathauwer, 2013b), what follows is based on such a condition because practitioners mainly refer to it in their applications. In particular, without loss of generality, we set the scaling of the factor loading matrices by assuming that $\mathbf{B}$ and $\mathbf{C}$ are column-wise normalized, i.e.,

(7) $$(\mathbf{B}'\mathbf{B}) * \mathbf{I} = (\mathbf{C}'\mathbf{C}) * \mathbf{I} = \mathbf{I},$$

where ‘*’ denotes the Hadamard product, i.e., element-wise product of matrices. Therefore, (4) and (5) imply that there exists a permutation matrix P such that

(8) $$\mathbf{A}_{\mathrm{T}} = \mathbf{A}\mathbf{P}, \quad \mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}.$$
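As an illustrative aside, the following sketch computes the k-rank by brute force and checks Kruskal's condition (4) on small randomly generated loading matrices; the k_rank helper is a hypothetical toy implementation, only intended for small matrices.

import numpy as np
from itertools import combinations

def k_rank(Z, tol=1e-10):
    # largest k such that EVERY subset of k columns of Z is linearly independent
    n_cols = Z.shape[1]
    for k in range(n_cols, 0, -1):
        if all(np.linalg.matrix_rank(Z[:, list(cols)], tol=tol) == k
               for cols in combinations(range(n_cols), k)):
            return k
    return 0

rng = np.random.default_rng(3)
S = 3
A, B, C = rng.normal(size=(6, S)), rng.normal(size=(4, S)), rng.normal(size=(3, S))

# With generic (randomly drawn) loadings each k-rank equals min(#rows, S),
# so condition (4) holds here: 3 + 3 + 3 >= 2*3 + 2
print(k_rank(A) + k_rank(B) + k_rank(C) >= 2 * S + 2)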

It is important to note that the parameter estimates are not scale invariant: a rescaling of the data does not merely produce an analogous rescaling of the estimates. It follows that some ways of rescaling the data are better, or more appropriate, than others, which makes the choice crucial. This is witnessed by a remarkably large number of papers devoted to the preprocessing step. See, for instance, Harshman & Lundy (1984), Bro & Smilde (2003), Kiers (2006) and references therein. The goal of preprocessing is similar to that for standard PCA; namely, data are standardized in order to eliminate unwanted differences among the variables. Difficulties arise because it is no longer obvious which strategy is best. For instance, a three-way three-mode array can be normalized within one of the three modes or even within a combination of two modes. Different ways of preprocessing the data lead to different solutions, and a wrong preprocessing may lead to unfeasible results (see, e.g., Kiers, 2006). Obviously, a scale-invariant model would be simpler to apply in practice, especially for non-expert users, because such a crucial decision on the best preprocessing strategy would no longer be necessary.
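Purely as an illustration of one of the preprocessing choices mentioned above (centering across statistical units and normalizing within variables), the sketch below uses hypothetical toy data; it is not presented as the recommended strategy, which, as discussed, depends on the application.

import numpy as np

rng = np.random.default_rng(4)
I, J, K = 10, 4, 3
X = rng.normal(size=(I, J, K))                      # hypothetical raw data array

# center each variable-occasion combination across the statistical units
X_c = X - X.mean(axis=0, keepdims=True)
# normalize within variables: one scale per variable, computed over units and occasions
scale = np.sqrt((X_c ** 2).mean(axis=(0, 2), keepdims=True))
X_pre = X_c / scale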

For a thorough discussion of three-way three-mode analysis according to a component-based approach, see the monographs of Bro (1998), Smilde, Bro, & Geladi (2004) and Kroonenberg (2008) and the reviews of Acar & Yener (2009), Kolda & Bader (2009), Mørup (2011) and Giordani & Kiers (2018).

3. The Structural Approach

3.1. Models

Starting from the original formulations of the models, we can derive the covariance structure corresponding to each model. Let us consider the Tucker3 model in (1), limiting our attention to the $i$th row of $\mathbf{X}_{\mathrm{A}}$, say $\mathbf{x}_{\mathrm{A}i}'$, pertaining to the $i$th statistical unit. $\mathbf{x}_{\mathrm{A}i}'$ is the vector of length $JK$ containing the scores of statistical unit $i$ on the $J$ manifest variables during the $K$ occasions. We get

(9) $$\mathbf{x}_{\mathrm{A}i}' = \mathbf{a}_{i}'\mathbf{G}_{\mathrm{A}}(\mathbf{C} \otimes \mathbf{B})' + \mathbf{e}_{\mathrm{A}i}',$$

where, with obvious notation, $\mathbf{e}_{\mathrm{A}i}'$ represents the $i$th row of $\mathbf{E}_{\mathrm{A}}$ and $\mathbf{a}_{i}'$ is the $i$th row of $\mathbf{A}$ containing the component scores for statistical unit $i$. To simplify the notation, we omit the subscripts 'A' and 'i,' and we rewrite (9) in terms of column vectors, explicitly considering a vector of intercepts

(10) $$\mathbf{x} = \boldsymbol{\mu} + (\mathbf{C} \otimes \mathbf{B})\mathbf{G}'\mathbf{a} + \mathbf{e}.$$

As usual in standard FA, we assume that the common factors $\mathbf{a}$ and the specific factors $\mathbf{e}$ are random with $E(\mathbf{a}) = \mathbf{0}$ and $E(\mathbf{e}) = \mathbf{0}$, without loss of generality because of $\boldsymbol{\mu}$, and $E(\mathbf{a}\mathbf{e}') = \mathbf{0}$. If $E(\mathbf{a}\mathbf{a}') = \boldsymbol{\Phi}$ and $E(\mathbf{e}\mathbf{e}') = \boldsymbol{\Psi}$, then the covariance matrix of $\mathbf{x}$ is given by

(11) $$\boldsymbol{\Sigma} = E[(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})'] = (\mathbf{C} \otimes \mathbf{B})\mathbf{G}'\boldsymbol{\Phi}\mathbf{G}(\mathbf{C} \otimes \mathbf{B})' + \boldsymbol{\Psi}.$$

In what follows, $\boldsymbol{\Phi}$ and $\boldsymbol{\Psi}$ will be assumed to be positive definite. The generic element of the matrix $\boldsymbol{\Sigma}$ (of order $JK \times JK$), $\sigma_{jk,j'k'}$, holds the covariance between manifest variable $j$ at occasion $k$ and manifest variable $j'$ at occasion $k'$ ($j, j' = 1, \ldots, J$; $k, k' = 1, \ldots, K$). Bearing in mind the standard FA model, it should be clear that the Tucker3 model is a constrained version of standard FA. If we set $\boldsymbol{\Lambda} = (\mathbf{C} \otimes \mathbf{B})\mathbf{G}'$, then (11) can be rewritten as

(12) $$\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}' + \boldsymbol{\Psi},$$

which is the oblique FA model, where $\boldsymbol{\Lambda} = (\mathbf{C} \otimes \mathbf{B})\mathbf{G}'$ is the matrix of factor loadings having a particular form that takes into account the three-way three-mode structure of the data.
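To illustrate the constrained loading structure $\boldsymbol{\Lambda} = (\mathbf{C} \otimes \mathbf{B})\mathbf{G}'$ in (11)–(12), here is a minimal sketch with toy dimensions and randomly drawn parameters; all values are hypothetical.

import numpy as np

rng = np.random.default_rng(5)
J, K, P, Q, R = 4, 3, 2, 2, 2
B = rng.normal(size=(J, Q))
C = rng.normal(size=(K, R))
G = rng.normal(size=(P, R * Q))                     # unit-mode matricized core (toy values)
Phi = np.eye(P)                                     # common-factor covariance (positive definite)
Psi = np.diag(rng.uniform(0.5, 1.5, size=J * K))    # specific-factor covariance (diagonal here)

Lmbda = np.kron(C, B) @ G.T                         # constrained loadings of order JK x P
Sigma = Lmbda @ Phi @ Lmbda.T + Psi                 # implied covariance of the JK observed variables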

Several papers available in the literature investigate the model in (11). Interested readers may refer to Bloxom (1968), Bentler & Lee (1978), Bentler & Lee (1979), Lee & Fong (1983), Bloxom (1984), Bentler, Poon, & Lee (1988), Kroonenberg & Oort (2003) and references therein. Other multilinear decompositions with Kronecker-structured covariance matrices are presented and investigated by, for instance, Gerard & Hoff (2015) and Hoff (2016).

The Parafac model can be derived from (10) by setting $\mathbf{G} = \mathbf{G}_{\mathrm{A}} = \mathbf{I}_{\mathrm{A}}$ as in (3). We have

(13) $$\mathbf{x} = \boldsymbol{\mu} + (\mathbf{C} \bullet \mathbf{B})\mathbf{a} + \mathbf{e}.$$

This shows that even the Parafac model is the same as the classical FA model, where the observed variables are expressed as a linear combination of a limited number of common factors ($\mathbf{a}$) and specific factors ($\mathbf{e}$). In order to take into account the three-way three-mode structure of the data, the loadings are constrained to be equal to $\mathbf{C} \bullet \mathbf{B}$. Similarly, the structural version of Parafac can be derived from (11) by imposing the restriction $\mathbf{G} = \mathbf{G}_{\mathrm{A}} = \mathbf{I}_{\mathrm{A}}$, obtaining

(14) $$\boldsymbol{\Sigma} = E[(\mathbf{x} - \boldsymbol{\mu})(\mathbf{x} - \boldsymbol{\mu})'] = (\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}.$$
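A parallel sketch for the structural Parafac covariance in (14), again with hypothetical toy values; the diagonal $\boldsymbol{\Psi}$ used below is just one of the specifications considered in the paper, which also allows correlated specific factors.

import numpy as np

rng = np.random.default_rng(6)
J, K, S = 4, 3, 2
B = rng.normal(size=(J, S))
C = rng.normal(size=(K, S))

def khatri_rao(C, B):
    return np.column_stack([np.kron(C[:, s], B[:, s]) for s in range(C.shape[1])])

Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])                        # common-factor covariance
Psi = np.diag(rng.uniform(0.5, 1.5, size=J * K))    # diagonal specific-factor covariance (one option)

Lmbda = khatri_rao(C, B)                            # loadings constrained to the Khatri-Rao form
Sigma = Lmbda @ Phi @ Lmbda.T + Psi                 # structural Parafac covariance as in (14)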

Comparing (10) with (13), we see that the restriction $\mathbf{G} = \mathbf{G}_{\mathrm{A}} = \mathbf{I}_{\mathrm{A}}$ greatly simplifies model interpretation. This aspect can be highlighted by focusing on a single occasion, say $k$. Let $\mathbf{x}_{k}$ be the vector of the $J$ manifest variables measured at occasion $k$; with obvious notation, the Tucker3 and Parafac models take the form

(15) $$\mathbf{x}_{k} = \boldsymbol{\mu}_{k} + \mathbf{B}\left(\sum_{r} c_{kr}\,\mathbf{G}_{r}\right)'\mathbf{a} + \mathbf{e}_{k},$$
(16) $$\mathbf{x}_{k} = \boldsymbol{\mu}_{k} + \mathbf{B}\,\mathrm{diag}(\mathbf{c}_{k})\,\mathbf{a} + \mathbf{e}_{k},$$

respectively, where diag($\mathbf{z}$) is the diagonal matrix with the elements of $\mathbf{z}$ on the main diagonal and $\mathbf{G}_{r}$ denotes the $r$th frontal slab of the core tensor. Under the Tucker3 model, $\mathbf{B}$ may be interpreted as the matrix of factor loadings for the variables on the common factors $(\sum_{r} c_{kr}\,\mathbf{G}_{r})'\mathbf{a}$, whose covariance structure varies across the occasions. Under the Parafac model too, the factor loadings for the variables are the same ($\mathbf{B}$) at each occasion, but the common factors $\mathrm{diag}(\mathbf{c}_{k})\mathbf{a}$ vary only in variance. By exploiting the symmetry of the models with respect to the modes, we may interpret $\mathbf{C}$ as the factor loading matrix for the occasions.

In the following section, we reconsider the structural Parafac model by analyzing whether the constraint $\boldsymbol{\Lambda} = \mathbf{C} \bullet \mathbf{B}$ affects parameter identifiability under different covariance structures of the specific factors.

3.2. Scale Invariance

The practical applicability of FA is favored by the property of scale invariance. As is well known, the factorial structure of the standard FA model is not destroyed by variable rescaling. For example, if $\mathbf{x}$ is rescaled as $\mathbf{x}_{\mathrm{L}} = \mathbf{L}\mathbf{x}$, where $\mathbf{L}$ is a diagonal matrix, then the covariance matrix of $\mathbf{x}_{\mathrm{L}}$, $\boldsymbol{\Sigma}_{\mathrm{L}}$, is equal to

(17) $$\boldsymbol{\Sigma}_{\mathrm{L}} = V(\mathbf{L}\mathbf{x}) = \mathbf{L}\,V(\mathbf{x})\,\mathbf{L} = \mathbf{L}(\boldsymbol{\Lambda}\boldsymbol{\Phi}\boldsymbol{\Lambda}' + \boldsymbol{\Psi})\mathbf{L} = (\mathbf{L}\boldsymbol{\Lambda})\boldsymbol{\Phi}(\mathbf{L}\boldsymbol{\Lambda})' + \mathbf{L}\boldsymbol{\Psi}\mathbf{L} = \boldsymbol{\Lambda}_{\mathrm{L}}\boldsymbol{\Phi}\boldsymbol{\Lambda}_{\mathrm{L}}' + \boldsymbol{\Psi}_{\mathrm{L}},$$

with $\boldsymbol{\Lambda}_{\mathrm{L}} = \mathbf{L}\boldsymbol{\Lambda}$ and $\boldsymbol{\Psi}_{\mathrm{L}} = \mathbf{L}\boldsymbol{\Psi}\mathbf{L}$. Hence, the factor loadings are the same but expressed in the new units of measurement. This guarantees that when the model is estimated by a scale-invariant method, like maximum likelihood, a rescaling of the data implies an analogous rescaling of the estimates. If $\mathbf{L}$ contains the reciprocals of the standard deviations, $\boldsymbol{\Sigma}_{\mathrm{L}}$ is the correlation matrix of $\mathbf{x}$, and standard FA applied to covariance or correlation matrices leads to the same conclusions. This is not true for PCA: the components extracted by using $\boldsymbol{\Sigma}_{\mathrm{L}}$ or $\boldsymbol{\Sigma}$ differ, and therefore a crucial question to answer before applying PCA is how to preprocess the data.
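A short numerical verification of (17), with all quantities set to hypothetical toy values:

import numpy as np

rng = np.random.default_rng(7)
p, m = 5, 2                                         # numbers of variables and factors (toy values)
Lmbda = rng.normal(size=(p, m))
Phi = np.eye(m)
Psi = np.diag(rng.uniform(0.5, 1.5, size=p))
Sigma = Lmbda @ Phi @ Lmbda.T + Psi

L = np.diag(rng.uniform(0.5, 2.0, size=p))          # diagonal rescaling of the variables
Sigma_L = L @ Sigma @ L
Lmbda_L, Psi_L = L @ Lmbda, L @ Psi @ L

# The rescaled covariance keeps the FA structure with rescaled loadings and uniquenesses
assert np.allclose(Sigma_L, Lmbda_L @ Phi @ Lmbda_L.T + Psi_L)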

In the structural case, the Parafac model introduced in (13), and the resulting covariance matrix given in (14), is generally not scale invariant in the sense given by Browne (1982). In fact, if $\mathbf{x}$ in (13) is rescaled as $\mathbf{x}_{\mathrm{L}} = \mathbf{L}\mathbf{x}$, then

(18) $$\boldsymbol{\Sigma}_{\mathrm{L}} = \mathbf{L}[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}]\mathbf{L} = [\mathbf{L}(\mathbf{C} \bullet \mathbf{B})]\boldsymbol{\Phi}[\mathbf{L}(\mathbf{C} \bullet \mathbf{B})]' + \boldsymbol{\Psi}_{\mathrm{L}};$$

it follows that, in general, the structural Parafac model in (14) is not scale invariant because the equation

(19) $$\mathbf{L}(\mathbf{C} \bullet \mathbf{B}) = \mathbf{C}_{\mathrm{L}} \bullet \mathbf{B}_{\mathrm{L}}$$

does not always have a solution. In other terms, the matrix L usually destroys the Khatri–Rao structure of the factor loadings unless L has a particular structure.

Bearing in mind that the same variables are collected at different occasions, it may be reasonable to perform the same scaling of the variables across the various occasions and the same scaling of the occasions across the different variables. This is equivalent to saying that

(20) $$\mathbf{L} = (\mathbf{L}_{\mathrm{O}} \otimes \mathbf{L}_{\mathrm{V}}),$$

where L V = diag ( [ l V 1 , , l V j , , l V J ] ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{L }_{\mathrm {V}} = \hbox {diag} ([l_{\mathrm {V1}}, \ldots , l_{{\mathrm {V}} j}, \ldots , l_{{\mathrm {V}} J}]^{\prime })$$\end{document} and L O = diag ( [ l O 1 , , l O k , , l O K ] ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{L }_{\mathrm {O}} = \hbox {diag} ([l_{\mathrm {O1}}, \ldots , l_{{\mathrm {O}} k}, \ldots , l_{{\mathrm {O}} K}]^{\prime })$$\end{document} . The generic element l V j \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_{{\mathrm {V}} j}$$\end{document} gives the scaling of variable j ( j = 1 , , J ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$j \, (j = 1, \ldots , J)$$\end{document} . Similarly, l O k \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_{{\mathrm {O}} k}$$\end{document} represents the scaling of occasion k ( k = 1 , , K ) \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k \, (k = 1, \ldots , K)$$\end{document} . It follows that x jk \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$x_{jk}$$\end{document} , i.e., variable j at occasion k, is rescaled as l V j l O k x jk \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_{{\mathrm {V}} j} l_{{\mathrm {O}} k}x_{jk}$$\end{document} . In this particular case, by substituting (20) into (19), we get

(21) L ( C B ) = ( L O L V ) ( C B ) = ( L O C L V B ) = C L B L , \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\begin{aligned} \mathbf{L} (\mathbf{C} \bullet \mathbf{B} ) = (\mathbf{L }_{\mathrm {O}}\otimes \mathbf{L }_{\mathrm {V}})(\mathbf{C} \bullet \mathbf{B} ) = (\mathbf{L }_{\mathrm {O}}{} \mathbf{C} \bullet \mathbf{L }_{\mathrm {V}}{} \mathbf{B} ) = \mathbf{C }_{\mathrm {L}}\bullet \mathbf{B }_{\mathrm {L}}, \end{aligned}$$\end{document}

with B L = L V B \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{B }_{\mathrm {L}} = \mathbf{L }_{\mathrm {V}} \mathbf{B} $$\end{document} and C L = L O C \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mathbf{C }_{\mathrm {L}} = \mathbf{L }_{\mathrm {O}} \mathbf{C} $$\end{document} . Therefore, the scaling is absorbed by the Khatri–Rao product and the particular factor loading structure is not destroyed.
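As a quick numerical check of this absorption property, the following sketch verifies (21) on random matrices of hypothetical size; the `khatri_rao` helper is our own, since the column-wise Khatri–Rao product is not a NumPy built-in.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Khatri-Rao product: column s equals kron(c_s, b_s)."""
    return np.vstack([np.kron(C[:, s], B[:, s]) for s in range(C.shape[1])]).T

rng = np.random.default_rng(0)
J, K, S = 4, 3, 2
B = rng.normal(size=(J, S))                    # variable-mode loadings
C = rng.normal(size=(K, S))                    # occasion-mode loadings
L_V = np.diag(rng.uniform(0.5, 2.0, size=J))   # variable scalings
L_O = np.diag(rng.uniform(0.5, 2.0, size=K))   # occasion scalings

L = np.kron(L_O, L_V)               # Eq. (20): L = L_O kron L_V
lhs = L @ khatri_rao(C, B)          # L (C • B)
rhs = khatri_rao(L_O @ C, L_V @ B)  # (L_O C) • (L_V B)
print(np.allclose(lhs, rhs))        # True: the scaling is absorbed, Eq. (21)
```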

However, in general, the structural Parafac model is not scale invariant, and a researcher may be interested in knowing how to modify its structure to derive a scale-invariant version. Following Lee & Fong (Reference Lee and Fong1983), the Parafac formulation in (13) can be modified by incorporating in the model a positive definite diagonal matrix $\mathbf{D}$ able to absorb scale changes. We get

(22) $\mathbf{x} = \boldsymbol{\mu} + \mathbf{D}[(\mathbf{C} \bullet \mathbf{B})\mathbf{a} + \mathbf{e}].$

From (22), the covariance matrix of $\mathbf{x}$ is

(23) $\boldsymbol{\Sigma} = \mathbf{D}[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}]\mathbf{D}.$

In this way, the Parafac model becomes scale invariant. In fact, when $\mathbf{x}$ is rescaled as $\mathbf{x}_{\mathrm{L}} = \mathbf{L}\mathbf{x}$, (22) becomes

(24) $\mathbf{x}_{\mathrm{L}} = \mathbf{L}\mathbf{x} = \mathbf{L}\{\boldsymbol{\mu} + \mathbf{D}[(\mathbf{C} \bullet \mathbf{B})\mathbf{a} + \mathbf{e}]\} = \boldsymbol{\mu}_{\mathrm{L}} + \mathbf{D}_{\mathrm{L}}[(\mathbf{C} \bullet \mathbf{B})\mathbf{a} + \mathbf{e}],$

with $\boldsymbol{\mu}_{\mathrm{L}} = \mathbf{L}\boldsymbol{\mu}$ and $\mathbf{D}_{\mathrm{L}} = \mathbf{L}\mathbf{D}$. The covariance matrix of $\mathbf{x}_{\mathrm{L}}$ is

(25) $\boldsymbol{\Sigma}_{\mathrm{L}} = \mathbf{D}_{\mathrm{L}}[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}]\mathbf{D}_{\mathrm{L}}.$

Hence, the model is not affected by $\mathbf{L}$ because the scaling is absorbed in the diagonal matrix $\mathbf{D}_{\mathrm{L}}$. It is important to note that, if we imposed a particular structure on $\mathbf{D}$, the same structure should hold for $\mathbf{D}_{\mathrm{L}} = \mathbf{L}\mathbf{D}$ for every possible $\mathbf{L}$; it follows that we cannot impose any particular structure on $\mathbf{D}$. In order to identify the diagonal matrix $\mathbf{D}$, either the constraint $((\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}) * \mathbf{I} = \mathbf{I}$ (Lee & Fong, Reference Lee and Fong1983) or $\boldsymbol{\Psi} = \mathbf{I}$ (Bentler, Poon, & Lee, Reference Bentler, Poon and Lee1988) is imposed. The two formulations are not equivalent: the former allows us to interpret the model as a reparameterization of the correlation matrix, whereas the latter has a less clear interpretation but avoids complex constraints on the estimates of the parameter matrices except for $\boldsymbol{\Psi}$.
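Under the former constraint, the bracketed matrix in (23) has unit diagonal, so $\mathbf{D}$ collects the standard deviations of the manifest variables and the bracket is their correlation matrix. A minimal sketch of this reading, using an arbitrary positive definite matrix as a stand-in for a fitted $\boldsymbol{\Sigma}$:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 6
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)        # a hypothetical fitted covariance matrix

D = np.diag(np.sqrt(np.diag(Sigma)))   # identified D: the standard deviations
M = np.linalg.inv(D) @ Sigma @ np.linalg.inv(D)   # the bracketed matrix in (23)

print(np.allclose(np.diag(M), 1.0))    # M * I = I: M is a correlation matrix
print(np.allclose(D @ M @ D, Sigma))   # Eq. (23) is recovered
```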

4. Factor Uniqueness of the Structural Parafac Model

A fundamental property of the Parafac model is its factor uniqueness. In this section, we start by showing that Parafac maintains this property even in the structural formulation when, as in the standard FA model in (12), the specific factors are assumed to be uncorrelated, i.e., the matrix $\boldsymbol{\Psi}$ is diagonal. However, in three-way three-mode FA, the assumption of uncorrelated specific factors might be too restrictive. For instance, it might be reasonable to assume that $\boldsymbol{\Psi}$ is block diagonal, i.e., that the specific factors of the different variables are correlated within the same occasion:

(26) $\boldsymbol{\Psi} = \operatorname{blockdiag}(\boldsymbol{\Psi}_{11}, \ldots, \boldsymbol{\Psi}_{kk}, \ldots, \boldsymbol{\Psi}_{KK}),$

where $\operatorname{blockdiag}(\mathbf{Z}_1, \ldots, \mathbf{Z}_K)$ denotes the block-diagonal matrix with diagonal blocks equal to the square matrices $\mathbf{Z}_1, \ldots, \mathbf{Z}_K$, and $\boldsymbol{\Psi}_{kk}$ denotes the covariance matrix of order $(J \times J)$ of the specific factors at occasion $k$, $k = 1, \ldots, K$. The Parafac covariance model in (14), with the correlation structure of the specific factors given in (26), is a more realistic model, able to fit reasonably well in many practical situations.
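For concreteness, a minimal sketch of a $\boldsymbol{\Psi}$ with the structure in (26), built from hypothetical random positive definite blocks:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(2)
J, K = 4, 3

def random_cov(dim):
    A = rng.normal(size=(dim, dim))
    return A @ A.T + dim * np.eye(dim)     # a positive definite block

Psi_blocks = [random_cov(J) for _ in range(K)]   # Psi_11, ..., Psi_KK
Psi = block_diag(*Psi_blocks)   # Eq. (26): zero covariance between occasions
print(Psi.shape)                # (12, 12), i.e. (JK x JK)
```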

To interpret the correlations between the specific factors, we note that they are allowed only between specific factors in the same occasion. In other words, the idea is that correlations between variables measured in different occasions are due only to the common factors, while correlations between variables in the same occasion can also be due to other factors (i.e., the specific factors in that occasion) that are not present in the other occasions and are therefore out of our interest. An accurate estimation of the ‘true’, interesting common factors should explicitly take into account the presence of such factors, which are ‘common’ only to the variables measured in the same occasion. From a statistical point of view, such covariances are only nuisance parameters. In the literature, the assumption of correlated specific factors is not new; see, e.g., Browne (Reference Browne1984a) and Kroonenberg & Oort (Reference Kroonenberg and Oort2003).

In the following, we investigate under what conditions factor uniqueness holds even with a block-diagonal structure for $\boldsymbol{\Psi}$. Finally, we address the scale invariance problem by studying the factor uniqueness property of the scale-invariant version of Parafac in (22)–(23).

It is important to note that what follows can be extended to the case where the specific factors of the same variable are correlated across the different occasions. Such an extension can be easily obtained by exploiting the symmetry of the model with respect to variables and occasions.

4.1. $\boldsymbol{\Psi}$ Diagonal

Let us start by showing the factor uniqueness property of the structural Parafac model when $\boldsymbol{\Psi}$ is diagonal. Note that disjoint submatrices are submatrices having no entry in common.

Result 1

Suppose that the following hold:

  • (a)

    (27) $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi} = (\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})' + \boldsymbol{\Psi}_{\mathrm{T}},$
  • (b)

    (28) $k\text{-rank}(\mathbf{B}) + k\text{-rank}(\mathbf{C}) \ge S + 2,$
  • (c) if any row of $\mathbf{C} \bullet \mathbf{B}$ is deleted, there remain two disjoint submatrices of rank $S$,

where $\boldsymbol{\Psi}$ and $\boldsymbol{\Psi}_{\mathrm{T}}$ are diagonal matrices and all the remaining matrices have $S$ columns.

Under the scaling set in (7), there exists a permutation matrix $\mathbf{P}$ such that

(29) $\mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}, \quad \boldsymbol{\Phi}_{\mathrm{T}} = \mathbf{P}'\boldsymbol{\Phi}\mathbf{P}.$

Proof

Let us rewrite equality (a) in terms of standard FA models with uncorrelated common factors:

(30) $\boldsymbol{\Sigma} = \big[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}\big]\big[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}\big]' + \boldsymbol{\Psi} = \big[(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\big]\big[(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\big]' + \boldsymbol{\Psi}_{\mathrm{T}}.$

Condition (c) implies that, if any row of $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}$ is deleted, there remain two disjoint submatrices of rank $S$, since $\operatorname{rank}(\boldsymbol{\Phi}) = \operatorname{rank}(\boldsymbol{\Phi}^{1/2}) = S$. From Theorem 5.1 of Anderson & Rubin (Reference Anderson and Rubin1956), we know that when this is true, there exists a column-wise orthonormal matrix $\mathbf{T}$ such that

(31) $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2} = (\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\mathbf{T}.$

Inequality (b) implies

(32) $k\text{-rank}\big(\boldsymbol{\Phi}^{1/2}\big) + k\text{-rank}(\mathbf{B}) + k\text{-rank}(\mathbf{C}) \ge 2S + 2,$

indicating that Kruskal’s condition is met and that the two decompositions in (31) are equal up to a column permutation under constraints (7). Denoting by $\mathbf{P}$ such a permutation matrix, we have

(33) $\mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}, \quad \boldsymbol{\Phi}_{\mathrm{T}}^{1/2} = \mathbf{T}'\boldsymbol{\Phi}^{1/2}\mathbf{P} \;\Rightarrow\; \boldsymbol{\Phi}_{\mathrm{T}} = \mathbf{P}'\boldsymbol{\Phi}\mathbf{P}.$

$\square$

Finally, we note that, in the previous proposition, Kruskal’s condition can be replaced with any other sufficient condition available in the literature.
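As an illustration of condition (b) in (28), the following sketch computes the Kruskal rank (k-rank) by brute force on random matrices of hypothetical size; the `k_rank` helper is our own and is practical only for a small number of columns.

```python
import numpy as np
from itertools import combinations

def k_rank(M, tol=1e-10):
    """Kruskal rank: the largest k such that every set of k columns of M
    is linearly independent (brute force, fine for few columns)."""
    n_cols = M.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        if all(np.linalg.matrix_rank(M[:, list(cols)], tol=tol) == size
               for cols in combinations(range(n_cols), size)):
            k = size
        else:
            break
    return k

rng = np.random.default_rng(3)
J, K, S = 4, 3, 2
B = rng.normal(size=(J, S))
C = rng.normal(size=(K, S))

print(k_rank(B), k_rank(C))            # 2 and 2 for random matrices of this size
print(k_rank(B) + k_rank(C) >= S + 2)  # condition (b), Eq. (28)
```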

4.2. $\boldsymbol{\Psi}$ Block Diagonal

When $\boldsymbol{\Psi}$ is block diagonal, the conditions of Anderson & Rubin (Reference Anderson and Rubin1956) can no longer be applied, as they assume a diagonal structure for $\boldsymbol{\Psi}$. To prove factor uniqueness, the following proposition is considered.

Proposition

(Browne, Reference Browne1980) Let us consider the decomposition $\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Lambda}' + \boldsymbol{\Psi}$, where $\boldsymbol{\Lambda}$ is partitioned as $\boldsymbol{\Lambda} = [\boldsymbol{\Lambda}_1', \ldots, \boldsymbol{\Lambda}_k', \ldots, \boldsymbol{\Lambda}_K']'$ and $\boldsymbol{\Psi}$ has the corresponding block-diagonal structure $\boldsymbol{\Psi} = \operatorname{blockdiag}(\boldsymbol{\Psi}_{11}, \ldots, \boldsymbol{\Psi}_{kk}, \ldots, \boldsymbol{\Psi}_{KK})$. The identification of $\boldsymbol{\Psi}$ and of $\boldsymbol{\Lambda}$ up to post-multiplication by an orthogonal matrix $\mathbf{T}$ holds if at least three of the submatrices $\boldsymbol{\Lambda}_k$ are of full column rank.

The above proposition has been formulated in the context of the FA model for multiple batteries of tests. The matrices $\boldsymbol{\Lambda}_k$ and $\boldsymbol{\Psi}_{kk}$ refer to the $k$th battery. The factor loading matrix is obtained by juxtaposing the factor loading matrices of the various batteries, and the specific factors are uncorrelated between the batteries, but correlated within the batteries. The proposition can be applied in order to prove the factor uniqueness of the structural Parafac model when the specific factors are correlated within the occasions. A sufficient condition for the factor uniqueness of the Parafac solution is given in Result 2.

Result 2

Suppose that the following hold:

  • (a)

    (34) $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi} = (\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})' + \boldsymbol{\Psi}_{\mathrm{T}},$
  • (b)

    (35) $k\text{-rank}(\mathbf{B}) + k\text{-rank}(\mathbf{C}) \ge S + 2,$
  • (c)

    (36) $\operatorname{rank}(\mathbf{B}\operatorname{diag}(\mathbf{c}_k)) = S \text{ for at least three occasions},$

where $\boldsymbol{\Psi} = \operatorname{blockdiag}(\boldsymbol{\Psi}_{11}, \ldots, \boldsymbol{\Psi}_{kk}, \ldots, \boldsymbol{\Psi}_{KK})$ is block diagonal with blocks defined by the occasions, $\boldsymbol{\Psi}_{\mathrm{T}}$ has the same structure, all the remaining matrices have $S$ columns, and $\mathbf{c}_k$ is the vector containing the elements of the $k$th row of $\mathbf{C}$.

Under the scaling set in (7), there exists a permutation matrix $\mathbf{P}$ such that

(37) $\mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}, \quad \boldsymbol{\Phi}_{\mathrm{T}} = \mathbf{P}'\boldsymbol{\Phi}\mathbf{P}.$

Proof

Let us rewrite equality (a) in terms of standard FA models with uncorrelated common factors:

(38) $\boldsymbol{\Sigma} = \big[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}\big]\big[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}\big]' + \boldsymbol{\Psi} = \big[(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\big]\big[(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\big]' + \boldsymbol{\Psi}_{\mathrm{T}},$

where the matrix of factor loadings $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}$, if partitioned according to the blocks of $\boldsymbol{\Psi}$, takes the form $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2} = [(\mathbf{B}\operatorname{diag}(\mathbf{c}_1)\boldsymbol{\Phi}^{1/2})', \ldots, (\mathbf{B}\operatorname{diag}(\mathbf{c}_k)\boldsymbol{\Phi}^{1/2})', \ldots, (\mathbf{B}\operatorname{diag}(\mathbf{c}_K)\boldsymbol{\Phi}^{1/2})']'$. Condition (c) implies $\operatorname{rank}(\mathbf{B}\operatorname{diag}(\mathbf{c}_k)\boldsymbol{\Phi}^{1/2}) = S$ for at least three occasions, since $\operatorname{rank}(\boldsymbol{\Phi}) = \operatorname{rank}(\boldsymbol{\Phi}^{1/2}) = S$. This allows us to use Browne’s proposition in order to show that (38) implies the existence of a column-wise orthonormal matrix $\mathbf{T}$ such that

(39) $(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2} = (\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}^{1/2}\mathbf{T}.$

Inequality (b) implies

(40) $k\text{-rank}\big(\boldsymbol{\Phi}^{1/2}\big) + k\text{-rank}(\mathbf{B}) + k\text{-rank}(\mathbf{C}) \ge 2S + 2,$

indicating that Kruskal’s condition is met and that the two decompositions in (39) are equal up to a column permutation under constraints (7). Denoting by $\mathbf{P}$ such a permutation matrix, we have

(41) $\mathbf{B}_{\mathrm{T}} = \mathbf{B}\mathbf{P}, \quad \mathbf{C}_{\mathrm{T}} = \mathbf{C}\mathbf{P}, \quad \boldsymbol{\Phi}_{\mathrm{T}}^{1/2} = \mathbf{T}'\boldsymbol{\Phi}^{1/2}\mathbf{P} \;\Rightarrow\; \boldsymbol{\Phi}_{\mathrm{T}} = \mathbf{P}'\boldsymbol{\Phi}\mathbf{P}.$

$\square$
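As a numerical illustration of the quantities appearing in Result 2, the following sketch (random matrices of hypothetical size, with our own `khatri_rao` helper) checks the row-block partition of $\mathbf{C} \bullet \mathbf{B}$ used in the proof and condition (c) in (36):

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Khatri-Rao product: column s equals kron(c_s, b_s)."""
    return np.vstack([np.kron(C[:, s], B[:, s]) for s in range(C.shape[1])]).T

rng = np.random.default_rng(4)
J, K, S = 4, 3, 2
B = rng.normal(size=(J, S))
C = rng.normal(size=(K, S))

# The k-th row block of C • B equals B diag(c_k), the partition used in the proof.
blocks_match = all(np.allclose(khatri_rao(C, B)[k * J:(k + 1) * J, :],
                               B @ np.diag(C[k, :])) for k in range(K))
print(blocks_match)   # True

# Condition (c) of Result 2: full column rank S for at least three occasions.
n_full = sum(np.linalg.matrix_rank(B @ np.diag(C[k, :])) == S for k in range(K))
print(n_full >= 3)
```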

4.3. Scale Invariance

It remains to prove that the scale-invariant versions of the structural Parafac model in (25), with either constraint (a) $((\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}) * \mathbf{I} = \mathbf{I}$ (Lee & Fong, Reference Lee and Fong1983) or constraint (b) $\boldsymbol{\Psi} * \mathbf{I} = \mathbf{I}$ (Bentler, Poon, & Lee, Reference Bentler, Poon and Lee1988), satisfy the factor uniqueness property. Note that the latter constraint is modified with respect to its original version in order to also handle the case where $\boldsymbol{\Psi}$ is block diagonal. The following results hold in both cases, i.e., whether $\boldsymbol{\Psi}$ is diagonal or block diagonal.

Let us consider the equality

(42) $\mathbf{D}[(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \bullet \mathbf{B})' + \boldsymbol{\Psi}]\mathbf{D} = \mathbf{D}_{\mathrm{T}}[(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})\boldsymbol{\Phi}_{\mathrm{T}}(\mathbf{C}_{\mathrm{T}} \bullet \mathbf{B}_{\mathrm{T}})' + \boldsymbol{\Psi}_{\mathrm{T}}]\mathbf{D}_{\mathrm{T}}.$

In order to show that the previous results hold even in this case, it is sufficient to show that the use of either constraint (a) or (b) implies $\mathbf{D} = \mathbf{D}_{\mathrm{T}}$.

In case (a), by focusing the attention on the main diagonal elements of the covariance matrices on the left and right of equality (42), we deduce that

(43) $\mathbf{D}\mathbf{D} = \mathbf{D}_{\mathrm{T}}\mathbf{D}_{\mathrm{T}} \Rightarrow \mathbf{D} = \mathbf{D}_{\mathrm{T}},$

because $\mathbf{D}$ and $\mathbf{D}_{\mathrm{T}}$ are required to be positive definite.

Case (b) is more complex to handle because the constraint now applies only to the covariance matrix of the specific factors, and it becomes effective for identification only if the decomposition of the total covariance is unique. To this end, we rewrite (42) in terms of FA models. Namely, setting $\boldsymbol{\Lambda} = \mathbf{D}(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}$ and $\mathbf{Z} = \mathbf{D}\boldsymbol{\Psi}\mathbf{D}$, with obvious notation, we get

(44) $\boldsymbol{\Sigma} = \boldsymbol{\Lambda}\boldsymbol{\Lambda}' + \mathbf{Z} = \boldsymbol{\Lambda}_{\mathrm{T}}\boldsymbol{\Lambda}_{\mathrm{T}}' + \mathbf{Z}_{\mathrm{T}}.$

If the decomposition of the total covariance into variability due to the common factors and variability due to the specific factors is unique, then it must be that $\mathbf{Z} = \mathbf{Z}_{\mathrm{T}}$, i.e., $\mathbf{D}\boldsymbol{\Psi}\mathbf{D} = \mathbf{D}_{\mathrm{T}}\boldsymbol{\Psi}_{\mathrm{T}}\mathbf{D}_{\mathrm{T}}$; hence, $\mathbf{D} = \mathbf{D}_{\mathrm{T}}$ because $\boldsymbol{\Psi} * \mathbf{I} = \boldsymbol{\Psi}_{\mathrm{T}} * \mathbf{I} = \mathbf{I}$. It remains to discuss under what conditions the total covariance decomposition is unique. In particular, we are going to show that this is obtained when condition (c) of either Result 1 or Result 2 is satisfied.

In the case of a diagonal $\boldsymbol{\Psi}$, when condition (c) of Result 1 is met, if any row of $\boldsymbol{\Lambda} = \mathbf{D}(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2}$ is deleted, there remain two disjoint submatrices of rank $S$. This allows us to apply Theorem 5.1 of Anderson & Rubin (Reference Anderson and Rubin1956), obtaining the required factor uniqueness.

In the case of a block-diagonal $\boldsymbol{\Psi}$, when condition (c) of Result 2 is met, at least three submatrices of $\mathbf{D}(\mathbf{C} \bullet \mathbf{B})\boldsymbol{\Phi}^{1/2} = [(\mathbf{D}_1\mathbf{B}\operatorname{diag}(\mathbf{c}_1)\boldsymbol{\Phi}^{1/2})', \ldots, (\mathbf{D}_k\mathbf{B}\operatorname{diag}(\mathbf{c}_k)\boldsymbol{\Phi}^{1/2})', \ldots, (\mathbf{D}_K\mathbf{B}\operatorname{diag}(\mathbf{c}_K)\boldsymbol{\Phi}^{1/2})']'$, where $\mathbf{D} = \operatorname{blockdiag}(\mathbf{D}_1, \ldots, \mathbf{D}_k, \ldots, \mathbf{D}_K)$, are of full column rank. This allows us to apply Browne’s proposition, obtaining the required factor uniqueness.
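To see why constraint (b) identifies $\mathbf{D}$ once $\mathbf{Z} = \mathbf{D}\boldsymbol{\Psi}\mathbf{D}$ is identified, note that a unit-diagonal $\boldsymbol{\Psi}$ makes $\mathbf{D}$ the square root of the diagonal of $\mathbf{Z}$. A minimal sketch with hypothetical matrices:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
A = rng.normal(size=(n, n))
Psi = A @ A.T + n * np.eye(n)
s = 1.0 / np.sqrt(np.diag(Psi))
Psi = np.diag(s) @ Psi @ np.diag(s)          # enforce Psi * I = I (unit diagonal)
D = np.diag(rng.uniform(0.5, 2.0, size=n))   # a hypothetical scaling matrix

Z = D @ Psi @ D                              # Z = D Psi D, as in Section 4.3
D_recovered = np.diag(np.sqrt(np.diag(Z)))
print(np.allclose(D_recovered, D))           # True: D is identified once Z is
```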

5. Estimation of the Structural Parafac Model

The parameters of the structural Parafac model can be estimated in several ways. Ordinary least squares (OLS), generalized least squares (GLS) and maximum likelihood (ML) approaches can be applied. They differ in the discrepancy function, between the sample covariance matrix and the one implied by the Parafac model, that is minimized. If the observations are independent and identically distributed according to a multivariate normal distribution, the estimate is obtained by minimizing

(45) $f(\boldsymbol{\theta}) = \operatorname{tr}\{[\mathbf{S} - \boldsymbol{\Sigma}(\boldsymbol{\theta})]\mathbf{W}^{-1}\}^2/2,$

where $\boldsymbol{\theta} = \{\mathbf{B}, \mathbf{C}, \boldsymbol{\Phi}, \boldsymbol{\Psi}\}$ denotes the set of parameters, $\mathbf{S}$ is the sample covariance matrix of order $(JK \times JK)$, $\boldsymbol{\Sigma}(\boldsymbol{\theta})$ is the covariance matrix implied by the Parafac model, and $\mathbf{W}$ is a weight matrix. If $\mathbf{W} = \mathbf{I}$ in (45), then the OLS estimate is found. The OLS approach is, however, not recommended because it is not scale invariant regardless of the model properties. If we set $\mathbf{W} = \mathbf{S}$, then the so-called best GLS estimate is found, while the ML estimate is obtained by setting $\mathbf{W} = \boldsymbol{\Sigma}(\boldsymbol{\theta})$, at least when $I$ is large enough (Browne, Reference Browne1974). The ‘best’ GLS and ML estimators are asymptotically equivalent and therefore share the same asymptotic properties. The most relevant difference is that the GLS approach does not exactly reproduce the sample variances of the manifest variables (see, e.g., Bentler & Lee, Reference Bentler and Lee1979). If the multivariate normality assumption is violated, the asymptotically distribution-free (ADF) estimation method (Browne, Reference Browne1984b) can be adopted. The ADF method minimizes the discrepancy function in (45) with a particular choice of $\mathbf{W}$ involving the sample fourth-order moments about the mean (see, e.g., Lee, Reference Lee2007, p. 45). A Bayesian strategy can also be adopted (see, e.g., Hoff, Reference Hoff2016).
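A minimal sketch of the discrepancy function in (45) and of the three weightings discussed above, evaluated on hypothetical covariance matrices (this is only an evaluation of the loss, not an estimation routine):

```python
import numpy as np

def discrepancy(S, Sigma_theta, W=None):
    """Discrepancy of Eq. (45): tr{[S - Sigma(theta)] W^{-1}}^2 / 2.
    W = I gives OLS, W = S the 'best' GLS, W = Sigma(theta) an ML-type weighting."""
    if W is None:
        W = np.eye(S.shape[0])
    R = (S - Sigma_theta) @ np.linalg.inv(W)
    return 0.5 * np.trace(R @ R)

rng = np.random.default_rng(6)
A = rng.normal(size=(6, 6))
S = A @ A.T + 6 * np.eye(6)           # a hypothetical sample covariance matrix
Sigma_theta = S + 0.1 * np.eye(6)     # a hypothetical model-implied covariance

print(discrepancy(S, Sigma_theta))                  # OLS weighting (W = I)
print(discrepancy(S, Sigma_theta, W=S))             # 'best' GLS weighting (W = S)
print(discrepancy(S, Sigma_theta, W=Sigma_theta))   # ML-type weighting
```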

A closed-form solution to the minimization of (45) does not exist. However, Bentler, Poon, & Lee (Reference Bentler, Poon and Lee1988) show that well-known software packages designed for structural equation models (for a review, see Narayanan, Reference Narayanan2012) can be adapted for solving three-way three-mode FA estimation problems such as the Parafac one in (45).

6. Related Models

The structural Parafac model, estimated by a scale-invariant method such as GLS or ML, is not the only attempt to go beyond the component Parafac model and its OLS estimation. Within the component approach, Bro, Sidiropoulos, & Smilde (Reference Bro, Sidiropoulos and Smilde2002) address the estimation problem of various models for which OLS fitting procedures exist, such as the Parafac one, when the errors are no longer homoscedastic and uncorrelated. They minimize a discrepancy function like

(46) $g(\boldsymbol{\theta}) = \operatorname{tr}\{[\mathbf{X}_{\mathrm{A}} - \mathbf{Y}_{\mathrm{A}}(\boldsymbol{\theta})]\mathbf{W}^{-1}\}^2,$

where $\boldsymbol{\theta} = \{\mathbf{A}, \mathbf{B}, \mathbf{C}\}$ and $\mathbf{Y}_{\mathrm{A}}(\boldsymbol{\theta}) = \mathbf{A}(\mathbf{C} \bullet \mathbf{B})'$. Following the same framework, Vega-Montoto & Wentzell (Reference Vega-Montoto and Wentzell2003) propose some ALS algorithms for estimating the parameters of the component Parafac model assuming different error structures. Further theoretical and practical results can also be found in two companion papers (Vega-Montoto, Gu, & Wentzell, Reference Vega-Montoto, Gu and Wentzell2005; Vega-Montoto & Wentzell, Reference Vega-Montoto and Wentzell2006).

In between the component and structural approaches, we find the proposal of Harshman (Reference Harshman1972), named Parafac2, which models the within-occasion covariance matrices as, with obvious notation,

(47) $\mathbf{S}_k = \boldsymbol{\Sigma}_k(\boldsymbol{\theta}) + \mathbf{E}_k,$
(48) $\boldsymbol{\Sigma}_k(\boldsymbol{\theta}) = \mathbf{B}\operatorname{diag}(\mathbf{c}_k)\boldsymbol{\Phi}\operatorname{diag}(\mathbf{c}_k)\mathbf{B}',$

where $\mathbf{S}_k$ is the sample covariance matrix of the manifest variables at occasion $k$ and $\mathbf{E}_k$ is an error term. The model is estimated by OLS, i.e., by minimizing the loss

(49) $h(\boldsymbol{\theta}) = \sum_k \operatorname{tr}\{[\mathbf{S}_k - \boldsymbol{\Sigma}_k(\boldsymbol{\theta})]\mathbf{W}^{-1}\}^2,$

with $\mathbf{W} = \mathbf{I}$. It is important to note that (48) differs from (14) because, in the former, covariances among different occasions and among specific factors are not considered. Mayekawa (Reference Mayekawa1987) extends the Parafac2 model by introducing the variances of the specific factors and providing the ML estimation.
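A minimal sketch of the Parafac2-type structure in (47)–(49), with random matrices of hypothetical size standing in for real data:

```python
import numpy as np

rng = np.random.default_rng(7)
J, K, S = 4, 3, 2
B = rng.normal(size=(J, S))
C = rng.normal(size=(K, S))
Phi = np.eye(S)

def sigma_k(B, c_k, Phi):
    """Model-implied within-occasion covariance, Eq. (48)."""
    Dk = np.diag(c_k)
    return B @ Dk @ Phi @ Dk @ B.T

# Hypothetical "sample" covariances: the model plus a small perturbation.
S_list = [sigma_k(B, C[k, :], Phi) + 0.05 * np.eye(J) for k in range(K)]

def ols_loss(S_list, B, C, Phi):
    """OLS loss of Eq. (49) with W = I."""
    total = 0.0
    for k, Sk in enumerate(S_list):
        R = Sk - sigma_k(B, C[k, :], Phi)
        total += np.trace(R @ R)
    return total

print(ols_loss(S_list, B, C, Phi))   # small, since the data were built from the model
```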

Regarding the structural approach, it is worth mentioning the work of Stegeman & Lam (Reference Stegeman and Lam2014), where an estimation procedure for the structural Parafac model in (14), based on minimum rank FA, is proposed under the assumption of uncorrelated specific factors. In that proposal, the properties of the estimators are not studied and their standard errors are not derived. Moreover, the adequacy of the model is not evaluated by goodness-of-fit tests.

7. Application

The type (b) scale-invariant structural Parafac model is applied to the data introduced by Bentler & McClain (Reference Bentler and McClain1976). The data refer to the correlation matrix observed on a sample of $I = 68$ fifth-grade children with respect to $J = 4$ personality variables (E: Extraversion; A: Test anxiety; I: Impulsivity; M: Academic achievement motivation) measured by Peer report (P), Teacher rating (T) and Self-rating (S), hence $K = 3$. The correlations are reported in Table 1 and represent an example of a multitrait-multimethod matrix: the traits are the personality variables, and the methods refer to the judge ratings.

Table 1. Bentler & McClain (1976) correlation data.

Several scale-invariant structural Parafac models are estimated from the correlation data in Table 1. They are obtained by varying the number of factors (either $S = 2$ or $S = 3$), the covariance structure of the common factors (either orthogonal or oblique) and the covariance matrix of the specific factors: (i) diagonal, i.e., the specific factors are uncorrelated; (ii) block diagonal, i.e., the specific factors are correlated only within the methods; (iii) banded, i.e., the specific factors are correlated only within the traits. Scale invariance is obtained by adding the matrix $\mathbf{D}$ as in (23) together with the constraint $\boldsymbol{\Psi} * \mathbf{I} = \mathbf{I}$ for identification purposes. Estimation is carried out by ML, assuming that the vectors $\mathbf{x}_{i}$, $i = 1, \ldots, I$, are independent and identically distributed as a multivariate normal. The analysis is performed with the procedure CALIS for structural equation models available in the SAS software. After some experimentation, we found that the best strategy to limit the risk of non-convergence, or of convergence to a local optimum, is to adopt the Newton–Raphson algorithm and to operate as follows. For each model with $\boldsymbol{\Psi}$ diagonal, the best-fitting solution is determined by considering 100 runs of the Newton–Raphson algorithm. For every run, the starting point is the OLS solution, i.e., the minimizer of (45) with $\mathbf{W} = \mathbf{I}$, which is itself randomly initialized. This choice limits the risk of attaining local optima.
When $S = 2$, the global optimum is found 82 times out of 100 with orthogonal common factors and 83 times with oblique common factors, whereas when $S = 3$ these counts decrease to 60 and 46, respectively. The comparison is based on the first four decimal digits of the loss function value, and the solution corresponding to the minimum value is taken as the best-fitting one. These best-fitting solutions are then used as starting points for the corresponding models with $\boldsymbol{\Psi}$ block diagonal or banded; alternative starting points were observed to lead to local optima.
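The multi-start strategy just described can be summarized in a short Python sketch. PROC CALIS is used for the actual analysis, so the optimizer below (scipy's BFGS) is only a generic stand-in for the Newton–Raphson algorithm, and neg_log_lik and random_ols_start are hypothetical placeholders for the ML discrepancy and for the randomly initialized OLS warm start.

import numpy as np
from scipy.optimize import minimize

def multi_start_ml(neg_log_lik, random_ols_start, n_runs=100, tol_digits=4, seed=0):
    # Each run starts from a randomly initialized OLS solution and is refined by a
    # Newton-type step; candidate optima are compared on the first tol_digits
    # decimals of the loss, as done in the text.
    rng = np.random.default_rng(seed)
    losses, thetas = [], []
    for _ in range(n_runs):
        theta0 = random_ols_start(rng)                 # OLS minimizer used as warm start
        fit = minimize(neg_log_lik, theta0, method="BFGS")
        losses.append(round(fit.fun, tol_digits))
        thetas.append(fit.x)
    best = int(np.argmin(losses))
    n_at_global = sum(l == losses[best] for l in losses)   # e.g., 82/100 for S = 2, orthogonal
    return thetas[best], losses[best], n_at_global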

In the scientific community, there is no agreement on the best way to select a model; a sensible suggestion is to use several different criteria and choose the model on which they agree most. Following this, we consider the $\chi^{2}$ test of overall fit (Browne, 1982), the AIC (Akaike, 1974) and the BIC (Schwarz, 1978), which are widely used for model selection in this domain. See, for instance, Akaike (1987), Bartholomew, Knott, & Moustaki (2011) and, in the three-way three-mode context, Kroonenberg & Oort (2003). The $\chi^{2}$ test refers to the null hypothesis

$$H_{0}: \boldsymbol{\Sigma} = \boldsymbol{\Sigma}(\theta),$$

against the general alternative

$H_{1}$: $\boldsymbol{\Sigma}$ is any positive definite matrix different from $\boldsymbol{\Sigma}(\theta)$.
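For completeness, the three criteria can be computed from a fitted model as in the sketch below. The conventions adopted here ($\chi^{2} = (n-1)F_{\mathrm{ML}}$, AIC $= \chi^{2} + 2q$, BIC $= \chi^{2} + q\log n$, with $q$ the number of free parameters) are assumptions; software implementations may differ by additive constants that do not affect the ranking of the models.

import numpy as np
from scipy.stats import chi2 as chi2_dist

def fit_indices(S, Sigma_hat, n, q):
    # S         : (JK x JK) sample correlation/covariance matrix
    # Sigma_hat : model-implied matrix Sigma(theta_hat)
    # n         : number of statistical units (here I = 68)
    # q         : number of free parameters of the model
    p = S.shape[0]
    _, logdet_Sigma = np.linalg.slogdet(Sigma_hat)
    _, logdet_S = np.linalg.slogdet(S)
    F_ml = logdet_Sigma + np.trace(S @ np.linalg.inv(Sigma_hat)) - logdet_S - p
    chi_sq = (n - 1) * F_ml
    df = p * (p + 1) // 2 - q
    p_value = chi2_dist.sf(chi_sq, df)       # test of H0: Sigma = Sigma(theta)
    aic = chi_sq + 2 * q
    bic = chi_sq + q * np.log(n)
    return chi_sq, df, p_value, aic, bic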

Table 2. Fit, AIC, BIC and number of parameters for different Parafac models applied to the Bentler & McClain (1976) correlation data.

The twelve models considered are summarized in Table 2 in terms of the p-value of the $\chi^{2}$ test, AIC, BIC and number of parameters. Inspection of Table 2 shows agreement toward the use of $S = 3$ factors. In particular, at level $\alpha = 0.05$, the only four models for which the null hypothesis of the $\chi^{2}$ test is not rejected are those where $\boldsymbol{\Psi}$ is non-diagonal and the common factors are either orthogonal or oblique. The first of these models is also the best one according to the AIC criterion. We thus investigate the solution of such a model.

Table 3. Diagonal elements of the scaling matrix D. (Standard errors are within parentheses.)

Table 4. Estimated factor loading matrix for the traits. (Standard errors are within parentheses.)

Table 5. Estimated factor loading matrix for the methods. (Standard errors are within parentheses.)

First of all, we report the estimates of the diagonal scaling matrix $\mathbf{D}$ in Table 3. All these estimates appear to be significant. The estimated factor loadings for the traits and the methods are given in Tables 4 and 5. Note that, without loss of fit, the elements of the first rows of the factor loading matrices for the traits and the methods are constrained to 1 in the estimation process. Factor 1 captures the contrast between Impulsivity (with positive sign) and Academic achievement motivation (with negative sign), mainly measured by Peer report. Factor 2 depends on Test anxiety and Academic achievement motivation (both with negative sign) with respect to the traits and on Teacher rating and Peer report (both with positive sign) with respect to the methods. Finally, Factor 3 is essentially related to the measurement of Extraversion by Teacher rating (all such factor loadings are positive). The square root of the estimated covariance matrix for the common factors, $\boldsymbol{\Phi}^{1/2}$, and of the banded covariance matrix for the unique factors, $\boldsymbol{\Psi}^{1/2}$, are given in Tables 6 and 7, respectively. Table 7 shows that several off-diagonal elements are rather different from zero. The banded structure of $\boldsymbol{\Psi}$ implies differences in the estimated common factors in comparison with the model where $\boldsymbol{\Psi}$ is assumed to be diagonal. To clarify this point, we briefly compare the previously described solution with the corresponding one where $\boldsymbol{\Psi}$ is diagonal. Although the null hypothesis of the $\chi^{2}$ test is rejected, the diagonal-$\boldsymbol{\Psi}$ model is characterized by the lowest BIC. Its interpretation is slightly modified because the loading of Impulsivity markedly increases (to 3.66).
The most relevant differences for Factor 2 involve Academic achievement motivation (loading from $-2.92$ to $-1.35$) and Teacher rating (from 1.70 to 1.26). With respect to Factor 3, we observe differences in the methods: in detail, the loadings of Teacher rating and Self-rating reduce to 2.13 and 1.03, respectively. These differences arise because assuming $\boldsymbol{\Psi}$ to be diagonal leads to a model in which the common factors fit some amount of error, namely the correlations between the specific factors within the traits. Finally, the standard errors reported in Tables 3, 4, 5, 6 and 7 show that, in some cases, especially for the variable factor loadings, the estimates are not significantly different from zero. This is hardly surprising when 42 parameters are estimated from only 68 observations.
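Given the estimates in Tables 3, 4, 5, 6 and 7, the model-implied correlation matrix can be reconstructed and compared with Table 1. The sketch below assumes the scale-invariant structural Parafac form $\boldsymbol{\Sigma} = \mathbf{D}[(\mathbf{C} \odot \mathbf{B})\boldsymbol{\Phi}(\mathbf{C} \odot \mathbf{B})' + \boldsymbol{\Psi}]\mathbf{D}$, with $\odot$ the column-wise Khatri–Rao product and traits nested within methods; since the exact parameterization is the one given in (14) and (23), both this layout and the function name are illustrative assumptions.

import numpy as np

def implied_covariance(d, B, C, Phi, Psi):
    # Minimal sketch under the assumed form Sigma = D[(C kr B) Phi (C kr B)' + Psi]D,
    # where "kr" is the column-wise Khatri-Rao product, B (J x S) holds the trait
    # loadings, C (K x S) the method loadings, Phi (S x S) the common-factor
    # covariance, Psi (JK x JK) the unique-factor covariance with unit diagonal,
    # and D = diag(d) the scaling matrix; traits are assumed nested within methods.
    K, S = C.shape
    J = B.shape[0]
    L = np.einsum('ks,js->kjs', C, B).reshape(K * J, S)   # Khatri-Rao product C kr B
    D = np.diag(d)
    return D @ (L @ Phi @ L.T + Psi) @ D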

Table 6. Diagonal elements of the square root of the estimated covariance matrix for the common factors $\boldsymbol{\Phi}^{1/2}$. (Standard errors are within parentheses.)

Table 7. Square root of the estimated covariance matrix for the unique factors $\boldsymbol{\Psi}^{1/2}$. (Standard errors are within parentheses.)

8. Conclusion and Final Remarks

The structural Parafac model has been discussed in this paper. The focus has been on its major property: the uniqueness of the factor loading matrices. We have shown that, under mild conditions, this property, previously proved only for the component version of the model, still holds. In particular, it holds even if the specific factors are assumed to be within-variable or within-occasion correlated and the model is modified to become scale invariant. In this respect, a consideration is in order. Our uniqueness proofs are based on Kruskal's (1977) sufficient condition, formulated for the component version of the model. However, this could be replaced by any other sufficient condition that exploits the particular features of the structural model. For example, in this context Kruskal's condition is always applied with one of the three matrices square and of full rank, namely $\boldsymbol{\Phi}^{1/2}$. Conditions taking such a feature into account (e.g., Jiang & Sidiropoulos, 2004) would probably lead to more powerful results. Furthermore, we have focused our attention on the structural Parafac model with unconstrained matrices $\mathbf{B}$ and $\mathbf{C}$. However, especially in the component-based version, these matrices and $\mathbf{A}$ can be constrained (e.g., column-wise orthonormality, linear dependency, nonnegativity, unimodality, symmetry). In general, imposing constraints allows Kruskal's condition to be replaced with more relaxed uniqueness conditions; see, for instance, Stegeman & Lam (2012) for the case of linear dependency. Our results might be improved by using such relaxed conditions.
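On the computational side, Kruskal's condition can be checked numerically from the three matrices involved, recalling that in the structural setting one of them is the square, full-rank matrix $\boldsymbol{\Phi}^{1/2}$, whose k-rank therefore equals $S$. The following brute-force sketch is only an illustration of the condition, not the procedure used in our proofs.

import numpy as np
from itertools import combinations

def k_rank(M, tol=1e-10):
    # Kruskal rank: the largest k such that EVERY set of k columns of M is
    # linearly independent (brute-force check, adequate for small S).
    n_cols = M.shape[1]
    for k in range(n_cols, 0, -1):
        if all(np.linalg.matrix_rank(M[:, list(cols)], tol=tol) == k
               for cols in combinations(range(n_cols), k)):
            return k
    return 0

def kruskal_condition(A, B, C):
    # Sufficient condition for uniqueness of a Parafac decomposition with S >= 2
    # components (Kruskal, 1977): k_A + k_B + k_C >= 2S + 2.
    S = A.shape[1]
    return k_rank(A) + k_rank(B) + k_rank(C) >= 2 * S + 2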

Acknowledgements

Open access funding provided by Università degli Studi di Roma La Sapienza within the CRUI-CARE Agreement.


References

Acar, E., & Yener, B. (2009). Unsupervised multiway data analysis: A literature survey. IEEE Transactions on Knowledge and Data Engineering, 21, 6–20.
Adachi, K., & Trendafilov, N. T. (2019). Some inequalities contrasting principal component and factor analyses solutions. Japanese Journal of Statistics and Data Science, 2, 31–47.
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.
Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 5: Contributions to econometrics, industrial research, and psychometry (pp. 111–150). Berkeley: University of California Press.
Bartholomew, D. J., Knott, M., & Moustaki, I. (2011). Latent variable models and factor analysis: A unified approach (3rd ed.). Chichester: Wiley.
Bentler, P. M., & Lee, S.-Y. (1978). Statistical aspects of a three-mode factor analysis model. Psychometrika, 43, 343–352.
Bentler, P. M., & Lee, S.-Y. (1979). A statistical development of three-mode factor analysis. British Journal of Mathematical and Statistical Psychology, 32, 87–104.
Bentler, P. M., & McClain, J. (1976). A multitrait-multimethod analysis of reflection-impulsivity. Child Development, 47, 218–226.
Bentler, P. M., Poon, W.-Y., & Lee, S.-Y. (1988). Generalized multimode latent variable models: Implementation by standard programs. Computational Statistics and Data Analysis, 6, 107–118.
Bloxom, B. (1968). A note on invariance in three-mode factor analysis. Psychometrika, 33, 347–350.
Bloxom, B. (1984). Tucker's three-mode factor analysis. In H. G. Law, C. W. Snyder, J. A. Hattie, & R. P. McDonald (Eds.), Research methods for multimode data analysis (pp. 104–121). New York: Praeger.
Bro, R. (1998). Multi-way analysis in the food industry: Models, algorithms and applications. Amsterdam: University of Amsterdam.
Bro, R., Sidiropoulos, N. D., & Smilde, A. K. (2002). Maximum likelihood fitting using ordinary least squares algorithms. Journal of Chemometrics, 16, 387–400.
Bro, R., & Smilde, A. K. (2003). Centering and scaling in component analysis. Journal of Chemometrics, 17, 16–33.
Browne, M. W. (1974). Generalized least squares estimators in the analysis of covariance structures. South African Statistical Journal, 8, 1–24.
Browne, M. W. (1980). Factor analysis of multiple batteries by maximum likelihood. British Journal of Mathematical and Statistical Psychology, 33, 184–199.
Browne, M. W. (1982). Covariance structures. In D. M. Hawkins (Ed.), Topics in applied multivariate analysis (pp. 72–141). Cambridge: Cambridge University Press.
Browne, M. W. (1984a). The decomposition of multitrait-multimethod matrices. British Journal of Mathematical and Statistical Psychology, 37, 1–21.
Browne, M. W. (1984b). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 37, 62–83.
Carroll, J. D., & Chang, J. J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of "Eckart–Young" decomposition. Psychometrika, 35, 283–319.
De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21, 1253–1278.
Domanov, I., & De Lathauwer, L. (2013a). On the uniqueness of the canonical polyadic decomposition of third-order tensors—Part I: Basic results and uniqueness of one factor matrix. SIAM Journal on Matrix Analysis and Applications, 34, 855–875.
Domanov, I., & De Lathauwer, L. (2013b). On the uniqueness of the canonical polyadic decomposition of third-order tensors—Part II: Uniqueness of the overall decomposition. SIAM Journal on Matrix Analysis and Applications, 34, 876–903.
Eckart, C., & Young, G. (1936). The approximation of one matrix by another of lower rank. Psychometrika, 1, 211–218.
Gerard, D., & Hoff, P. D. (2015). Equivariant minimax dominators of the MLE in the array normal model. Journal of Multivariate Analysis, 137, 32–49.
Giordani, P., & Kiers, H. A. L. (2018). A review of tensor-based methods and their application to hospital care data. Statistics in Medicine, 37, 137–156.
Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an 'explanatory' multi-modal factor analysis. UCLA Working Papers in Phonetics, 16, 1–84.
Harshman, R. A. (1972). PARAFAC2: Mathematical and technical notes. UCLA Working Papers in Phonetics, 22, 33–44.
Harshman, R. A., & Lundy, M. E. (1984). Data preprocessing and the extended PARAFAC model. In H. G. Law, C. W. Snyder, J. A. Hattie, & R. P. McDonald (Eds.), Research methods for multimode data analysis (pp. 216–284). New York: Praeger.
Hoff, P. D. (2016). Equivariant and scale-free Tucker decomposition models. Bayesian Analysis, 11, 627–648.
Jiang, T., & Sidiropoulos, N. D. (2004). Kruskal's permutation lemma and the identification of CANDECOMP/PARAFAC and bilinear models with constant modulus constraints. IEEE Transactions on Signal Processing, 52, 2625–2636.
Kiers, H. A. L. (2000). Towards a standardized notation and terminology in multiway analysis. Journal of Chemometrics, 14, 105–122.
Kiers, H. A. L. (2006). Properties of and algorithms for fitting three-way component models with offset terms. Psychometrika, 71, 231–256.
Kolda, T. G., & Bader, B. W. (2009). Tensor decompositions and applications. SIAM Review, 51, 455–500.
Kroonenberg, P. M., & De Leeuw, J. (1980). Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45, 69–97.
Kroonenberg, P. M., & Oort, F. J. (2003). Three-mode analysis of multi-mode covariance matrices. British Journal of Mathematical and Statistical Psychology, 56, 305–336.
Kroonenberg, P. M. (2008). Applied multiway data analysis. Hoboken, NJ: Wiley.
Kruskal, J. B. (1977). Three-way arrays: Rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra and Its Applications, 18, 95–138.
Lee, S.-Y. (2007). Structural equation modeling: A Bayesian approach. Chichester: Wiley.
Lee, S.-Y., & Fong, W. (1983). A scale invariant model for three-mode factor analysis. British Journal of Mathematical and Statistical Psychology, 36, 217–223.
Mayekawa, S. (1987). Maximum likelihood solution to the PARAFAC model. Behaviormetrika, 21, 45–63.
Mørup, M. (2011). Applications of tensor (multi-way array) factorizations and decompositions in data mining. WIREs Data Mining and Knowledge Discovery, 1, 24–40.
Narayanan, A. (2012). A review of eight software packages for structural equation modeling. The American Statistician, 66, 129–138.
Schwarz, G. E. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Smilde, A. K., Bro, R., & Geladi, P. (2004). Multi-way analysis: Applications in the chemical sciences. Chichester: Wiley.
Stegeman, A. (2009). On uniqueness conditions for Candecomp/Parafac and Indscal with full column rank in one mode. Linear Algebra and Its Applications, 431, 211–227.
Stegeman, A., & Lam, T. T. T. (2012). Improved uniqueness conditions for canonical tensor decompositions with linearly dependent loadings. SIAM Journal on Matrix Analysis and Applications, 33, 1250–1271.
Stegeman, A., & Lam, T. T. T. (2014). Three-mode factor analysis by means of Candecomp/Parafac. Psychometrika, 79, 426–443.
Stegeman, A., ten Berge, J. M. F., & De Lathauwer, L. (2006). Sufficient conditions for uniqueness in Candecomp/Parafac and Indscal with random component matrices. Psychometrika, 71, 219–229.
Tucker, L. R. (1963). Implications of factor analysis of three-way matrices for measurement of change. In C. W. Harris (Ed.), Problems in measuring change (pp. 122–137). Madison: University of Wisconsin Press.
Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31, 279–311.
Unkel, S., & Trendafilov, N. T. (2010). Simultaneous parameter estimation in exploratory factor analysis: An expository review. International Statistical Review, 78, 363–382.
Vega-Montoto, L., Gu, H., & Wentzell, P. D. (2005). Mathematical improvements to maximum likelihood parallel factor analysis: Theory and simulations. Journal of Chemometrics, 19, 216–235.
Vega-Montoto, L., & Wentzell, P. D. (2003). Maximum likelihood parallel factor analysis (MLPARAFAC). Journal of Chemometrics, 17, 237–253.
Vega-Montoto, L., & Wentzell, P. D. (2006). Approaching the direct exponential curve resolution algorithm from a maximum likelihood perspective. Analytica Chimica Acta, 556, 383–399.