
Non-parametric Regression Among Factor Scores: Motivation and Diagnostics for Nonlinear Structural Equation Models

Published online by Cambridge University Press:  01 January 2025

Steffen Grønneberg*
Affiliation:
BI Norwegian Business School
Julien Patrick Irmer
Affiliation:
Goethe University Frankfurt
* Correspondence should be made to Steffen Grønneberg, Department of Economics, BI Norwegian Business School, Oslo 0484, Norway. Email: steffeng@gmail.com

Abstract

We provide a framework for motivating and diagnosing the functional form in the structural part of nonlinear or linear structural equation models when the measurement model is a correctly specified linear confirmatory factor model. A mathematical population-based analysis provides asymptotic identification results for conditional expectations of a coordinate of an endogenous latent variable given exogenous and possibly other endogenous latent variables, and theoretically well-founded estimates of this conditional expectation are suggested. Simulation studies show that these estimators behave well compared to presently available alternatives. Practically, we recommend the estimator using Bartlett factor scores as input to classical non-parametric regression methods.

Type
Theory & Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2024 The Author(s)

1. Introduction

Structural equation models (SEMs) describe how an endogenous latent random vector $\eta$ is influenced by an exogenous random vector $\xi$ as well as coordinates of $\eta$, where $(\xi', \eta')'$ belongs to a randomly chosen person in a population. Usually, both vectors are latent and continuous. The added complexity of this latency may explain the current sparsity of tools for motivating and diagnosing the functional form of this influence.
This paper provides a population-based theoretical foundation for non-parametrically estimating the functional forms of the relationships between the coordinates of $(\xi', \eta')'$ that is based on Bartlett (1937) factor scores computed from the observables measuring $\eta$ and $\xi$. The population-based perspective of the paper means that we ignore sampling error for mathematical convenience, which corresponds roughly to assuming that the sample size is large.

Even from a population perspective, the factor scores, say, $\ddot{\xi}$ and $\ddot{\eta}$, approximate the latent variables $\xi$ and $\eta$, respectively, with high precision only when the number of observable variables that measure them is sufficiently high (Krijnen, 2004, 2006a, 2006b). For a low number of measurement variables, each individual factor score may still be a low-precision approximation to the corresponding true latent variable. This is sometimes called factor indeterminacy (see, e.g., Grice, 2001).
Still, this paper shows that trend estimates for the effect of $\xi$ onto $\eta$ based on factor scores can work well in realistic conditions, and that what matters most for the quality of the trend estimate is the number of measurement variables $d_x$ of $\xi$. Loosely speaking, the reason for this is as follows: The trend estimate is based on averaging observations of $\ddot{\eta}$ for a given local range of observations of $\ddot{\xi}$.
This approximates the true trend defined as averages of observations of $\eta$ for a given local range of observations of $\xi$. The averaging of $\ddot{\eta}$ completely cancels out the mean-zero approximation error $\ddot{\eta} - \eta$, but the same effect is not present for the local range of observations of $\ddot{\xi}$ as an approximation to the local range of observations of $\xi$, which improves only as $d_x$ increases.
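The averaging argument can be illustrated numerically. The following is a minimal sketch, not the estimator studied in the paper: it assumes a hypothetical trend $H(x) = x^2$ with standard normal $\xi$, and it stands in for the factor-score errors with independent additive Gaussian noise. The names `local_average` and `simulate` are illustrative only.

```python
import numpy as np

def local_average(x, y, grid, half_width):
    """Average y over observations whose x lies within half_width of each grid point."""
    return np.array([y[np.abs(x - g) <= half_width].mean() for g in grid])

def simulate(n=200_000, sd_eta_err=1.0, sd_xi_err=0.0, seed=0):
    """Local averages of noisy eta-scores over local ranges of (possibly noisy) xi-scores."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(size=n)
    eta = xi**2 + rng.normal(scale=0.5, size=n)               # assumed true trend H(x) = x^2
    eta_score = eta + rng.normal(scale=sd_eta_err, size=n)    # mean-zero error in eta-score
    xi_score = xi + rng.normal(scale=sd_xi_err, size=n)       # mean-zero error in xi-score
    grid = np.array([-1.0, 0.0, 1.0])
    return grid, local_average(xi_score, eta_score, grid, half_width=0.1)

# Noise only in the eta-score: averaging removes it, the trend is recovered.
grid, trend_clean_xi = simulate(sd_xi_err=0.0)
# Noise also in the xi-score: the local range is wrong, the trend is distorted.
_, trend_noisy_xi = simulate(sd_xi_err=1.0)

print("true trend:          ", grid**2)
print("error-free xi-score: ", trend_clean_xi.round(2))
print("noisy xi-score:      ", trend_noisy_xi.round(2))
```

In this toy setup, shrinking `sd_xi_err` plays the role of increasing $d_x$: the local averages move toward the true trend only as the error in the conditioning score becomes small, regardless of how noisy `eta_score` is.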

With the caveat that individual factor scores may be rough approximations to the latent variables, scatter plots of factor scores with trend estimates can still motivate and diagnose functional forms in SEMs in much the same way as scatter plots and superimposed trends are commonly used in applied regression analysis (see, e.g., Fox & Weisberg, 2011; Weisberg, 2005). While some specification tests or tests for quadratic and interaction terms for SEM exist (Nestler, 2015; Büchner and Klein, 2020), trend estimates of the functional form in SEM are useful also for linear SEM, as traditional covariance-based tools such as the Chi-square goodness-of-fit test and its robustified variants may have zero power toward non-linear alternatives (Mooijaart and Satorra, 2009).

In this paper, $\xi$ and $\eta$ are assumed to be latent and measured via a correctly specified linear factor model, as specified shortly. This means that we consider diagnostics or motivation of the measurement model as outside the scope of the present paper.

We will later assume that the error terms of the factor model and the factors are independent and that the factors are continuous variables. This can only happen if the observed variables are continuous (see Appendix G in the online supplementary material). While treating ordinal data as continuous is sometimes justified under additional assumptions (Foldnes and Grønneberg, 2022; Grønneberg and Foldnes, 2024), this paper only deals with continuous observations. Ordinal data models, such as item response theory or threshold models, are outside the scope of the present paper.

The trend estimates we consider are non-parametric regressions for the structural connections between the coordinates of $(\xi', \eta')'$. If $\mathbb{E}\eta$ exists, then the conditional expectation $\mathbb{E}[\eta \mid \xi]$ exists (see Appendix K in the online supplementary material for a review of conditional expectations), which implies that

$$\eta = H(\xi) + \zeta, \quad \zeta := \eta - \mathbb{E}[\eta \mid \xi], \quad H(x) = \mathbb{E}[\eta \mid \xi = x], \quad \mathbb{E}[\zeta \mid \xi] = 0. \tag{1}$$

Recall that $\mathbb{E}[\zeta \mid \xi] = 0$ implies $\text{Cov}(\varphi(\xi), \zeta) = 0$ for all integrable functions $\varphi$ (see Appendix K). This is stronger than merely assuming $\text{Cov}(\xi, \zeta) = 0$, but weaker than independence between $\zeta$ and $\xi$, as independence is equivalent to $\text{Cov}(\varphi(\xi), \varrho(\zeta)) = 0$ for any integrable functions $\varphi, \varrho$.
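The first implication follows from the tower property of conditional expectations. A short derivation, read coordinate-wise and under the added assumption that $\varphi(\xi)$, $\zeta$, and the product $\varphi(\xi)\zeta$ are all integrable:

$$\begin{aligned}
\mathbb{E}[\zeta] &= \mathbb{E}\big[\mathbb{E}[\zeta \mid \xi]\big] = 0, \\
\mathbb{E}[\varphi(\xi)\,\zeta] &= \mathbb{E}\big[\varphi(\xi)\,\mathbb{E}[\zeta \mid \xi]\big] = 0, \\
\text{Cov}(\varphi(\xi), \zeta) &= \mathbb{E}[\varphi(\xi)\,\zeta] - \mathbb{E}[\varphi(\xi)]\,\mathbb{E}[\zeta] = 0.
\end{aligned}$$

Taking $\varphi$ to be the identity recovers $\text{Cov}(\xi, \zeta) = 0$ as a special case.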

In Eq. (1), we considered the total effect of $\xi$ onto $\eta$ (for an overview of linear mediation analysis see MacKinnon et al., 2007). By the same reasoning, we can consider each coordinate $\eta_j$ of $\eta$ separately, conditioning $\eta_j$ not just on $\xi$, but on both $\xi$ and those coordinates of $\eta$ that substantive knowledge dictates influence $\eta_j$. If the substantive knowledge is correct, a proposition usually not fully identified from data alone (Bollen, 1989; Jöreskog et al., 2016), this non-parametrically estimates the trend of a full SEM. This approach, which we call the component-wise approach, is more fully described and exemplified in Appendix B in the online supplementary material.

Algorithmically, the only difference between the component-wise approach and the reduced form approach considered in Eq. (1) is the names of the variables involved. To reduce the notational burden of the paper, we will therefore focus the main text on estimating H in the reduced form representation of Eq. (1). While the component-wise approach is of higher practical interest in most cases, its mathematics is exactly the same as the reduced form approach if we re-label the variables.

To illustrate the difference between the component-wise and reduced form approaches, consider the simple system

$$\eta_1 = \xi_1^2 + \mathfrak{z}_1, \quad \eta_2 = \eta_1 + \xi_1 + \mathfrak{z}_2 = \xi_1 + \xi_1^2 + \mathfrak{z}_1 + \mathfrak{z}_2.$$

For this illustration, assume that the error terms $\mathfrak{z}_1, \mathfrak{z}_2$ and the exogenous variable $\xi_1$ are zero mean and independent. We first consider $\eta_1$. In both the component-wise and the reduced form approaches, we consider $\mathbb{E}[\eta_1 \mid \xi_1] = \xi_1^2$, showing that $\mathfrak{z}_1$ is also the error term induced by the conditional expectation representation, i.e., $\mathfrak{z}_1 = \zeta_1 := \eta_1 - \mathbb{E}[\eta_1 \mid \xi_1]$. We then consider $\eta_2$. In the component-wise approach, we calculate $\mathbb{E}[\eta_2 \mid \eta_1, \xi_1] = \eta_1 + \xi_1$, which is linear. The error term $\mathfrak{z}_2$ is then the error term induced by this conditional expectation calculation, i.e., $\mathfrak{z}_2 = \eta_2 - \mathbb{E}[\eta_2 \mid \eta_1, \xi_1]$.
From the expanded system shown at the end of the above display, we also deduce the reduced form trend $\mathbb{E}[\eta_2 \mid \xi_1] = \xi_1 + \xi_1^2$, which is quadratic, with an induced error term $\zeta_2 := \mathfrak{z}_1 + \mathfrak{z}_2 = \eta_2 - \mathbb{E}[\eta_2 \mid \xi_1]$. We see that in both cases, we detect a non-linear trend in the system. With structural knowledge, we are able to further detect that the non-linear trend affects only $\eta_1$ directly. More comprehensive examples and analytical examples are provided in Appendix B in the online supplementary material.
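The two views of this system can be checked with a small simulation. This is a sketch under stated assumptions: the latent variables are treated as directly observed (no measurement model), all random terms are Gaussian with arbitrarily chosen scales, and ordinary least squares on the known functional forms replaces non-parametric regression. The helper names `simulate_system` and `ols` are hypothetical.

```python
import numpy as np

def simulate_system(n=100_000, seed=1):
    """Draw (xi1, eta1, eta2) from the illustrative system with independent zero-mean terms."""
    rng = np.random.default_rng(seed)
    xi1 = rng.normal(size=n)
    z1 = rng.normal(scale=0.5, size=n)
    z2 = rng.normal(scale=0.5, size=n)
    eta1 = xi1**2 + z1
    eta2 = eta1 + xi1 + z2
    return xi1, eta1, eta2

def ols(columns, y):
    """Least-squares coefficients (intercept first) of y on the given regressor columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

xi1, eta1, eta2 = simulate_system()

# Component-wise view: E[eta2 | eta1, xi1] = eta1 + xi1, linear in both arguments.
component_wise = ols([eta1, xi1], eta2)
# Reduced form view: E[eta2 | xi1] = xi1 + xi1^2, quadratic in xi1.
reduced_form = ols([xi1, xi1**2], eta2)

print("component-wise (intercept, eta1, xi1):", component_wise.round(3))
print("reduced form (intercept, xi1, xi1^2): ", reduced_form.round(3))
```

Both fits should return coefficients close to $(0, 1, 1)$, but on different regressors: the quadratic term appears only in the reduced form, which matches the observation that the non-linearity enters $\eta_2$ solely through $\eta_1$.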

Our suggested empirical approach is based on plotting factor scores together with a non-parametric estimate of H to motivate or diagnose the functional form of a SEM. The non-parametric estimate will be rough and is in most cases best suited as a guide to model formulation and diagnostics, not as a standalone estimation technique. See Appendix A in the online supplementary material for a simple numerical illustration. Once an appropriate parametric model is identified, it is then estimated via standard techniques such as the classical linear approach, the latent moderated structural equations approach (LMS, Klein & Moosbrugger, 2000), or the unconstrained product indicator approach (UPI, Marsh et al., 2004; Kelava & Brandt, 2009). A literature review of available estimation methods is found in Appendix C in the online supplementary material. This approach follows common practice in the applied regression literature (see, e.g., Fox & Weisberg, 2011; Weisberg, 2005), where non-parametric estimates are used to guide parametric modeling.

Plotting factor scores for model motivation and diagnostics has roots going back to McDonald (1967), who worked with nonlinear factor models. In the context of SEM, Bauer et al. (2012) appear to be the first to suggest adding trend estimates to this plot, and they also showed through simulation that this gives reasonable results. Our paper provides the theoretical underpinnings of the method, as well as substantial simulation work to further assess the performance of the method.

Another approach to model diagnostics in SEM is residual analysis. Bollen and Arminger (1991) define residuals for linear SEM via factor score-based estimators of the error terms of the measurement model and the structural model. Raykov and Penev (2014) show via simulation that plotting coordinates of residuals from a structural model against each other can be used to detect unaccounted for structural trends. While a formal analysis of these procedures would be intimately connected to the contributions in the present paper, residual analysis is a complex topic, and we consider it outside the scope of the present paper.

For mathematical convenience, our analysis is limited to population quantities, and we deal only with the consistency of estimates of H. Inference for H is not considered in the paper, though standard bootstrap approaches may be applicable. Our paper also provides insights into what types of non-parametric regression methods should be used through a theoretical analysis and a comprehensive simulation study.

Our focus is on non-parametric estimators of H that make no parametric assumptions on H and no parametric assumptions on the distributions of $\eta$ and $\xi$. As reviewed in Appendix C in the online supplementary material, there are many ways to estimate H, but to the best of our knowledge, the only presently available non-parametric estimators for H are in the presently understudied papers of Kelava et al. (2017) and Kohler et al. (2015). Since no implementation of the estimator of Kohler et al. (2015) is available, we do not consider it in our paper.

We do, however, compare our suggested methods with the computationally demanding method of Kelava et al. (2017). Out of the methods we compare, our simulations indicate that inputting Bartlett scores into simple LOESS or spline methods works best, on average. This is computationally practically instantaneous.

As mentioned above, we assume $\xi$ and $\eta$ are measured through correctly specified linear factor models: Let $f = (\xi', \eta')'$ where $'$ denotes vector transposition. We let the dimension of a random vector, say $V$, be denoted as $d_V$. We observe a sample of size $n$ from the random vector $z = (x', y')'$ which follows the factor model

$$\tilde{z} := z - \mu = (\tilde{x}',\tilde{y}')' = (x'-\mu_x', y'-\mu_y')' = \Lambda f + \varepsilon, \quad f = (\xi', \eta')', \quad \varepsilon = (\varepsilon_x', \varepsilon_y')', \tag{2}$$

where $\mu = (\mu_x', \mu_y')'$ is the expectation of $z$, where $\Lambda$ is a non-random $(d_x+d_y)\times(d_\xi+d_\eta)$ matrix, and where $\varepsilon$ consists of measurement errors. Precise assumptions on the factor model will be given later.
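As a purely illustrative sketch of the measurement model in Eq. (2), the following Python snippet draws one observation of $z$ from given factor values. The loading values, error variances, Gaussian errors and the quadratic relation between $\xi$ and $\eta$ are hypothetical choices made here for illustration and are not taken from the paper. The loading matrix is stored with one row per observed variable and one column per factor, so that the product $\Lambda f$ has the dimension of $z$.

```python
import random


def simulate_z(f, Lambda, mu, psi, rng):
    """Draw one observation z = mu + Lambda f + eps, with independent Gaussian errors.

    f      : factor values (xi and eta stacked), length d_f
    Lambda : loading matrix as a list of d_z rows, each of length d_f
    mu     : mean vector of z, length d_z
    psi    : error variances (diagonal of Cov(eps)), length d_z
    """
    z = []
    for row, mean, var in zip(Lambda, mu, psi):
        signal = sum(load * value for load, value in zip(row, f))
        z.append(mean + signal + rng.gauss(0.0, var ** 0.5))
    return z


rng = random.Random(1)

# Hypothetical design: three items for xi, three for eta, no cross-loadings.
Lambda = [[1.0, 0.0], [0.8, 0.0], [0.6, 0.0],
          [0.0, 1.0], [0.0, 0.7], [0.0, 0.9]]
mu = [0.0] * 6
psi = [0.5] * 6

xi = rng.gauss(0.0, 1.0)
# Hypothetical nonlinear trend H(xi) = 0.5 * (xi^2 - 1), centered so that E[eta] = 0.
eta = 0.5 * (xi ** 2 - 1.0) + rng.gauss(0.0, 0.3)
z = simulate_z([xi, eta], Lambda, mu, psi, rng)
print(z)
```

Repeating the last three statements n times would give a sample of size n from this hypothetical model.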

Identifying a correct measurement model is a difficult, though standard, problem. The assumption of a correctly specified and linear measurement model is made by all standard non-linear as well as linear structural equation models (see, e.g., Jöreskog, 1969; Kenny & Judd, 1984; Wall & Amemiya, 2000; Klein & Moosbrugger, 2000; Skrondal & Laake, 2001; Wall & Amemiya, 2001, 2003; Marsh et al., 2004; Lee et al., 2007; Kelava & Brandt, 2009; Mooijaart & Bentler, 2010; Mooijaart & Satorra, 2012; Croon, 2002; Kohler et al., 2015; Kelava et al., 2017; Devlieger & Rosseel, 2017; Brandt et al., 2018; Holst & Budtz-Jørgensen, 2020; Rosseel & Loh, 2022). In Appendix F in the online supplementary material, we show that the techniques presented in the present paper are also compatible with certain non-linear measurement models that can be rewritten as linear measurement models. We can then derive analytically how measurement model misspecification influences estimates of H, and numerical experiments show that the proposed methodology is not overly sensitive to minor measurement model misspecification.

In this paper, we only consider additive measurement error, both in the structural and measurement part of the model. Our approach centers around approximating the conditional expectation function H, which enters in an additive relationship to $\zeta$. For distributions of $(\xi', \eta')'$ with errors entering non-additively, this need not be the right perspective for studying trends. See Appendix J in the online supplementary material for a simple example.

1.1. Inputting Factor Scores to Non-parametric Regression Methods, a Literature Review and an Overview of our Theoretical Contributions

Traditional parametric regression methods among factor scores have been studied in several papers, among them Skrondal and Laake (2001); Devlieger et al. (2016); Devlieger and Rosseel (2017); Croon (2002); Hoshino and Bentler (2011), as well as the more recent SAM (structural after measurement) approach of Rosseel and Loh (2022). PLS-SEM and some of its variants (Sarstedt et al., 2021; Dijkstra and Henseler, 2015) are also based on regression methods among factor scores (Yuan and Deng, 2021). In contrast, inputting factor scores into non-parametric regression methods is a far less well-studied problem. The first paper we have found on this is Bauer et al. (2012). Bauer et al. (2012) have two proposals for diagnostics and model formulation in NLSEM: The first proposal is to input factor scores to non-parametric regression estimators, which is the research area this paper continues. The second proposal is to consider structural equation mixture models, which we consider outside the scope of the present paper. While structural equation mixture models have their own literature, see, e.g., the references within Bauer et al. (2012), inputting classical scores, such as the Bartlett (1937) or Thurstone (1935; Thomson, 1934) factor scores, into non-parametric regression methods has, as far as we know, not previously been analyzed theoretically in the literature.

In Kelava et al. (2017) and Kohler et al. (2015), the authors propose to estimate H non-parametrically by a procedure similar to that of Bauer et al. (2012), except that instead of classical factor scores, they generate mathematically complex non-linear factor scores which are inputted into non-parametric regression procedures. Their papers include theoretical results proving that these methods are consistent as the sample size n increases.

A foundational result for linear factor scores is that their convergence in probability (and in mean square) towards the true latent variables requires, in addition to $n \rightarrow \infty$, that the number of measurements per latent variable increases indefinitely (Guttman, 1955; Williams, 1978; Schneeweiss and Mathes, 1995; Krijnen, 2004, 2006a, b). This has the important implication that, in general, non-parametric regression methods based on linear, in contrast to nonlinear, factor scores will not be consistent in estimating the true trend as $n \rightarrow \infty$ alone, but will also require a sufficient number of measurements of the latent variables.
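The role of the number of measurements can be illustrated numerically in a deliberately simple special case, which is an assumption made here and not part of the general argument: a single factor measured by $d$ items with uncorrelated errors, loadings $\lambda_j$ and error variances $\psi_j$, all treated as known. In this case, the standard Bartlett score has error variance $1 / \sum_j \lambda_j^2/\psi_j$, which does not depend on n and shrinks only as items are added. The loading and variance values below are hypothetical.

```python
def bartlett_error_variance(loadings, error_variances):
    """Variance of (Bartlett score - factor) for one factor with uncorrelated item errors."""
    information = sum(l * l / p for l, p in zip(loadings, error_variances))
    return 1.0 / information


# Hypothetical standardized items: loading 0.7 and error variance 0.51 for every item.
for d in (3, 6, 12, 24, 48):
    variance = bartlett_error_variance([0.7] * d, [0.51] * d)
    print(f"d = {d:2d}: error variance = {variance:.4f}")
```

With equal loadings and error variances, the error variance is proportional to $1/d$, so doubling the number of items halves it.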

In the present paper, we show that under weak conditions, only $d_x$, the number of measurements of $\xi$, has to be sufficiently high to approximate H. We propose two kinds of theoretical approaches to the problem, both justified only for $d_x$ sufficiently high, though both are shown to work well in simulations also for small $d_x$, such as $d_x = 3$: First, we provide conditions so that the population versions of a class of factor scores fit into the problem of non-parametric regression with normal measurement error in the covariate. The normality of the measurement error is not based on parametric distributional assumptions on the variables in the model but is derived from a central limit theorem.
Second, we provide conditions for when population versions of factor scores can be used to approximate H through a direct application of non-parametric regression estimates, such as the LOESS estimate (Cleveland, 1979, 1981) or smoothed splines (Chambers and Hastie, 1992). In our simulations, this second alternative, which is the computationally and mathematically simplest method of all considered, usually has the best performance, also when taking into account the computationally complex method of Kelava et al. (2017).
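The direct approach can be sketched in a few lines of Python. This is a minimal illustration under assumptions that are not from the paper: the trend $H(\xi) = \xi^2 - 1$, the loadings and error variances are hypothetical, the true measurement parameters are treated as known (in practice they would be estimated from a fitted factor model), there are no cross-loadings and the errors are uncorrelated, and a bare-bones tricube-weighted local linear smoother is used as a stand-in for a full LOESS implementation.

```python
import random


def bartlett_score(items, loadings, error_variances):
    """Bartlett score of one factor from centered items with uncorrelated errors."""
    information = sum(l * l / p for l, p in zip(loadings, error_variances))
    weighted = sum(l / p * x for l, p, x in zip(loadings, error_variances, items))
    return weighted / information


def local_linear(x0, xs, ys, span=0.3):
    """Tricube-weighted local linear fit at x0 (LOESS-like, no robustness iterations)."""
    k = max(2, int(span * len(xs)))
    h = sorted(abs(x - x0) for x in xs)[k - 1]  # distance to the k-th nearest point
    sw = swx = swy = swxx = swxy = 0.0
    for x, y in zip(xs, ys):
        dx = x - x0
        if abs(dx) >= h:
            continue  # points at or beyond the bandwidth get zero weight
        w = (1.0 - (abs(dx) / h) ** 3) ** 3
        sw += w
        swx += w * dx
        swy += w * y
        swxx += w * dx * dx
        swxy += w * dx * y
    denom = sw * swxx - swx * swx
    if denom <= 1e-12:
        raise ValueError("too few distinct points near x0")
    return (swxx * swy - swx * swxy) / denom  # intercept of the local line


def H(xi):
    return xi ** 2 - 1.0  # hypothetical trend, centered so that E[eta] = 0


rng = random.Random(0)
n, d = 500, 9
lam, psi = [1.0] * d, [0.25] * d  # hypothetical loadings and error variances

xi_hat, eta_hat = [], []
for _ in range(n):
    xi = rng.gauss(0.0, 1.0)
    eta = H(xi) + rng.gauss(0.0, 0.5)
    x = [l * xi + rng.gauss(0.0, p ** 0.5) for l, p in zip(lam, psi)]
    y = [l * eta + rng.gauss(0.0, p ** 0.5) for l, p in zip(lam, psi)]
    xi_hat.append(bartlett_score(x, lam, psi))
    eta_hat.append(bartlett_score(y, lam, psi))

grid = [-1.5, 0.0, 1.5]
fit = [local_linear(g, xi_hat, eta_hat) for g in grid]
for g, value in zip(grid, fit):
    print(f"xi = {g:+.1f}: estimate = {value:+.2f}, H(xi) = {H(g):+.2f}")
```

The smoother regresses the Bartlett scores of $\eta$ on those of $\xi$. The remaining error in the covariate and the smoothing bandwidth both distort the estimate somewhat, so the printed values should be read as approximations of H rather than exact recoveries.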

In this paper, we do not consider the method proposed in Kohler et al. (2015), as no implementation of this method appears to be available. Kelava et al. (2017) provide a MATLAB (The MathWorks Inc., 2023) implementation of their algorithm, which we use in our simulations. Their non-linear factor scores minimize a loss function defined in terms of unspecified constants called probability weights. The performance of their method depends on the choice of these probability weights as well as on which non-parametric regression method is used in the second stage, where the provided implementation uses B-splines (De Boor, 1978). We take the choice of probability weights as given in the implementation of Kelava et al. (2017). For the second-stage non-parametric method, we consider both the B-splines method analyzed in Kelava et al. (2017) and LOESS or smoothed splines as implemented in R (R Core Team, 2023). The latter appears to give better performance than the B-spline option.

The asymptotic approach we consider is to let the number of items go to infinity, where we for simplicity consider an infinite sample size. A joint asymptotic analysis where the number of observations and items increases jointly is considered outside the scope of the present paper. Such an analysis would be mathematically considerably more complex than the analysis undertaken in the present paper.

An asymptotic approach with a growing number of items is standard in the related research field of factor panel data models. There, a common asymptotic approach is to let both the panel width and length increase. In this large literature, with contributions from econometrics, statistics and related fields, factor scores or their analogues are considered; see, e.g., Fan et al. (2023) and the references therein. As far as we know, non-parametric regression among factor scores has not been considered in that literature.

1.2. The Structure of the Paper

This paper has four main contributions. First, we establish the conditions under which the conditional expectation function of $\eta$ given $\xi$, denoted as H, can be identified using population factor scores. Second, we prove some basic results on affine factor scores that are suitable for such an analysis. Third, we show new asymptotic results, which include the consistency of Bartlett scores in the mean square, the normality of the measurement error of factor scores as estimates of the factors, and conditions under which conditional expectations based on factor scores with decreasing measurement error converge to the conditional expectation based on factors. These first three contributions are found in Sect. 2. Fourth, we suggest non-parametric methods based on Bartlett factor scores in Sect. 3, and in Sect. 4 we evaluate them together with the Kelava et al. (2017) procedure through a simulation study. Finally, we discuss the findings from the simulation study and give concluding remarks. More technical details and a fuller discussion of some conclusions from the simulation study are deferred to an online appendix. All proofs and source code for our numerical analysis are also found in the online supplementary material.

2. Identification of H Based on Population Factor Scores

We here investigate when H is identified in the population, and we base our analysis on a class of factor scores.

Identification in this context means that under the stated assumptions, we are able to pinpoint H based on the distribution of population versions of factor scores. The measurement part of the model in Eq. (2) is a confirmatory factor model, whose parameters are identified only up to unit of measurement transformations of the factors (for an overview and historical references, see Chapter 14.2 in Anderson, 2003). We will shortly assume that the parameters of the factor model in Eq. (2) are identified, which means that the unit of measurement is chosen, either by standardizing the factors, or fixing appropriate elements of $\Lambda$ to 1, or some combination thereof. As shown in Appendix H in the online supplementary material, conditional expectations are well-behaved with regard to changes of the units of measurement, and therefore, standard practice for setting the units of measurement can be followed. The choice of unit of measurement will have some consequences for interpretation, and formulas for converting between choices are found in Appendix H in the online supplementary material.
Without substantive knowledge leading to a preferred scaling method, we recommend standardizing the factors in an empirical investigation because this yields an easily interpreted object, i.e., the conditional expectation of a standardized version of $\eta$ given a standardized version of $\xi$.

Conditions for the non-parametric identification of the parameters and distributions involved in an SEM using nonlinear factor scores are given in Lemma 1 in Kelava et al. (2017). These conditions are quite strong and include that all coordinates of $\varepsilon$ are independent, that $\varepsilon$ and $f$ are independent, and that no cross loadings are present, meaning no observed variable measures two latent variables simultaneously. Lemma 1 in Kelava et al. (2017) then shows that the joint distribution of $\varepsilon$ and $f$ is identified from the distribution of the observable variables.
From this, we can compute the marginal distribution of $\xi$, the function $H(x) = \mathbb{E}[\eta \mid \xi = x]$, and the distribution of the error term $\zeta = \eta - H(\xi)$. Therefore, Lemma 1 in Kelava et al. (2017) identifies H.

In this section, we provide an alternative set of assumptions that asymptotically identifies H. More precisely, we will identify H under assumptions that hold only as $d_x$, the number of measurements of $\xi$, increases indefinitely, which can be called asymptotic identification. Showing asymptotic identification and not exact identification allows our results to be formulated under much weaker conditions compared to those of Kelava et al. (2017), whose first lemma shows exact identification for any $d_x, d_y \ge 3$. Our analyses focus on population versions of a class of factor scores, which we now introduce.

Affine factor scores are of the form $A z + a$, where $A$ is a $d_f \times d_z$ matrix, and $a$ is a $d_f$-dimensional vector. Usually, $a = -A\mu$ and $A$ is chosen so that $A\tilde{z}$ is in an appropriate sense as close to $f$ as possible. We will only consider such factor scores, and all references to factor scores mean affine factor scores.

Let $\Phi = \text{Cov}\, f$ and $\Sigma = \text{Cov}\, z$. The 'regression' factor scores (Thurstone, 1935; Thomson, 1934), also known as Thurstone factor scores, are derived using $A = \Phi \Lambda' \Sigma^{-1}$ and $a = -A\mu$. These factor scores are optimal in the mean square sense (Neudecker and Satorra, 2003), yet we will instead focus on the Bartlett factor score, for a theoretical reason we now explain. As sketched in the upcoming Remark 1, Thurstone factor scores can likely be included in an extension of the theoretical framework considered in the present paper.

As our focus is on using factor scores as input to non-parametric regression methods, we will only consider factor scores with the property that $A\tilde{z}$ equals $f$ distorted by some uncorrelated, or more strongly, independent noise, as such factor scores will fit in with the general theory on non-parametric regression with measurement error. By addition and subtraction of $f$, we may define the $d_f$-dimensional error term $r_A = A\tilde{z} - f$ so that

$$A \tilde{z} = f + r_A. \tag{3}$$

For this equation to be related to regression, we require at least $\mathbb{E} r_A = 0$ and $\text{Cov}\,(f, r_A) = 0$. In the upcoming technical conditions, we will require the additional assumption that $f$ and $r_A$ are independent. Since independence cannot hold if the covariance is non-zero, we investigate this property more fully here.
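To see informally why the covariance requirement restricts $A$, one can substitute the factor model of Eq. (2) into the definition of $r_A$. The following sketch uses only Eq. (2) together with $\mathbb{E}\varepsilon = 0$, $\text{Cov}\,(f,\varepsilon) = 0$ and positive definiteness of $\Phi$, conditions that are stated formally in Assumption 1. It is meant as a heuristic check, not as a replacement for the proofs in the online supplementary material.

```latex
\begin{aligned}
r_A &= A\tilde{z} - f = A(\Lambda f + \varepsilon) - f = (A\Lambda - I_{d_f})\, f + A\varepsilon, \\
\operatorname{Cov}(f, r_A) &= \mathbb{E}\big[(f - \mathbb{E} f)\, r_A'\big]
  = \Phi\, (A\Lambda - I_{d_f})' + \operatorname{Cov}(f, \varepsilon)\, A'
  = \Phi\, (A\Lambda - I_{d_f})'.
\end{aligned}
```

Since $\Phi$ is invertible, the last expression vanishes exactly when $A\Lambda = I_{d_f}$, and in that case $r_A = A\varepsilon$, so that $\mathbb{E} r_A = 0$ as well.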

The following lemma, which gathers several technical results that we need, shows that the requirement $\text{Cov}\,(f, r_A) = 0$ is equivalent to $A\Lambda = I_{d_f}$, i.e., that $A$ is a left inverse of $\Lambda$. Interestingly, this is also a central requirement in the recently developed “structural after measurement” approach of Rosseel and Loh (2022). The lemma shows that Thurstone factor scores do not have an uncorrelated measurement error term, but Bartlett factor scores do. Since the Bartlett (1937) score is a generalized least squares (GLS) estimate, it shares the standard optimality properties of GLS. The optimality of Bartlett scores in the least squares sense in the class of conditionally unbiased factor scores is well known. The following lemma shows that the class of conditionally unbiased factor scores is the same class as factor scores with uncorrelated measurement errors, and both are characterized by the previously mentioned left inverse property. We make the following standard assumptions, whose motivation is recalled in Appendix E.1 in the online supplementary material.

Assumption 1

Suppose Eq. (2) holds and that $\eta$ and $\xi$ have at least two finite moments. Further suppose

  1. $\mathbb{E}\varepsilon = 0$, and the cross-covariance matrix $\text{Cov}\,(f,\varepsilon) = \mathbb{E}[(f - \mathbb{E} f)\,\varepsilon']$ is zero.

  2. $\Lambda$ has full column rank.

  3. $\Phi = \text{Cov}\,(f)$ is positive definite.

  4. $\Psi = \text{Cov}\,(\varepsilon)$ is positive definite.

Let $\mathcal{G}(\Lambda)$ be the set of all left-inverses of $\Lambda$. That is, $A \in \mathcal{G}(\Lambda)$ means $A\Lambda = I_{d_f}$.
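The left-inverse condition can be checked numerically in a simple special case. The sketch below assumes a single factor with variance $\phi$, uncorrelated item errors, and hypothetical loadings and error variances. Under these assumptions, the standard Bartlett matrix $(\Lambda'\Psi^{-1}\Lambda)^{-1}\Lambda'\Psi^{-1}$ and the Thurstone matrix $\Phi\Lambda'\Sigma^{-1}$, with $\Sigma = \Lambda\Phi\Lambda' + \Psi$, reduce to weight vectors with closed forms, the latter obtained via the Sherman-Morrison formula.

```python
def bartlett_weights(loadings, error_variances):
    """Row vector A = (L' Psi^-1 L)^-1 L' Psi^-1 for a single factor."""
    information = sum(l * l / p for l, p in zip(loadings, error_variances))
    return [l / p / information for l, p in zip(loadings, error_variances)]


def thurstone_weights(loadings, error_variances, factor_variance=1.0):
    """Row vector A = Phi L' Sigma^-1 for a single factor, Sigma = Phi L L' + Psi."""
    information = sum(l * l / p for l, p in zip(loadings, error_variances))
    scale = factor_variance / (1.0 + factor_variance * information)
    return [scale * l / p for l, p in zip(loadings, error_variances)]


def times_lambda(weights, loadings):
    """The scalar A * Lambda; it equals 1 exactly when A is a left inverse of Lambda."""
    return sum(w * l for w, l in zip(weights, loadings))


loadings = [1.0, 0.8, 0.6]         # hypothetical
error_variances = [0.5, 0.4, 0.6]  # hypothetical

print("Bartlett  A * Lambda =",
      times_lambda(bartlett_weights(loadings, error_variances), loadings))
print("Thurstone A * Lambda =",
      times_lambda(thurstone_weights(loadings, error_variances), loadings))
```

In this special case, the Thurstone product equals $\phi s/(1+\phi s)$ with $s = \sum_j \lambda_j^2/\psi_j$, which is strictly below one, so the Thurstone weights are not a left inverse, whereas the Bartlett product equals one up to rounding.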

Lemma 1

Suppose Assumption 1 holds.

  1. (1) Let $A$ be a deterministic matrix, and let $r_A = A \tilde{z} - f$. Then $\text{Cov}(f, r_A) = 0$ if and only if $A \in \mathcal{G}(\Lambda)$. This also holds if $\Psi$ is singular.

  2. (2) Let $A$ be a deterministic matrix. If $\mathbb{E}[\varepsilon | f] = 0$, then $\mathbb{E}[A \tilde{z} | f] = f$ if and only if $A \in \mathcal{G}(\Lambda)$.

  3. (3) The transformation matrix $T = \Phi \Lambda' \Sigma^{-1}$ used in the Thurstone factor score exists, but is not in $\mathcal{G}(\Lambda)$.

  4. (4) The Bartlett matrix $\Delta = (\Lambda' \Psi^{-1} \Lambda)^{-1} \Lambda' \Psi^{-1}$ exists and is in $\mathcal{G}(\Lambda)$, and is such that for all $A \in \mathcal{G}(\Lambda)$ the difference $\text{Cov}(r_\Delta) - \text{Cov}(r_A)$ is non-positive definite (that is, negative semidefinite).

Proof

See Section E.4.1. $\square$
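As a numerical illustration of Lemma 1 (3) and (4), the following NumPy sketch builds the Thurstone and Bartlett matrices for a small model and checks the left-inverse property and the covariance ordering. The values of `Lam`, `Phi` and `Psi` are hypothetical and chosen only for illustration, and `A_ols` (the Moore-Penrose inverse of $\Lambda$) is simply one convenient competing left inverse; none of these choices come from the paper.

```python
import numpy as np

# Hypothetical measurement model: 6 indicators, 2 factors (illustrative values only).
Lam = np.array([[1.0, 0.0],
                [0.8, 0.0],
                [0.6, 0.0],
                [0.0, 1.0],
                [0.0, 0.7],
                [0.0, 0.9]])
Phi = np.array([[1.0, 0.3],
                [0.3, 1.0]])
Psi = np.diag([0.5, 0.4, 0.6, 0.3, 0.5, 0.4])
Sigma = Lam @ Phi @ Lam.T + Psi            # implied covariance matrix of z

Psi_inv = np.linalg.inv(Psi)
Delta = np.linalg.solve(Lam.T @ Psi_inv @ Lam, Lam.T @ Psi_inv)  # Bartlett matrix
T = Phi @ Lam.T @ np.linalg.inv(Sigma)                            # Thurstone matrix
A_ols = np.linalg.pinv(Lam)                                       # another left inverse

I2 = np.eye(2)
bartlett_is_left_inverse = np.allclose(Delta @ Lam, I2)
thurstone_is_left_inverse = np.allclose(T @ Lam, I2)
ols_is_left_inverse = np.allclose(A_ols @ Lam, I2)

# For a left inverse A we have r_A = A eps, hence Cov(r_A) = A Psi A'.
cov_r_bartlett = Delta @ Psi @ Delta.T
cov_r_ols = A_ols @ Psi @ A_ols.T

# Lemma 1 (4): the difference should have no positive eigenvalue.
max_eig = np.linalg.eigvalsh(cov_r_bartlett - cov_r_ols).max()
```

In this sketch the Bartlett and Moore-Penrose matrices are left inverses while the Thurstone matrix is not, and the largest eigenvalue of $\text{Cov}(r_\Delta) - \text{Cov}(r_A)$ is not positive, in line with the lemma.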

The set of left inverses of $\Lambda$ is non-empty if and only if $\Lambda$ has full column rank (Harville, 1997, Lemma 8.1.1). Therefore, Assumption 1 (2) is foundational. Assumption 1 (4) can be avoided, see, e.g., Eq. (7) in Wall and Amemiya (2000) and Fuller (1987) for a Bartlett formula that avoids inverting $\Psi$. We will not consider singular $\Psi$ matrices in this paper. Further, since $\Lambda$ is assumed to have full column rank, the set of left inverses $\mathcal{G}(\Lambda)$ equals the set of generalized inverses of $\Lambda$ (Harville, 1997, Lemma 9.2.8). This set can be described constructively, see Theorem 9.2.7 in Harville (1997). Due to Lemma 1 (4), we single out the Bartlett factor score among the elements of $\mathcal{G}(\Lambda)$ in most of our study.
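One standard way to parametrize the left inverses of a full-column-rank matrix is $A = \Lambda^{+} + Z (I - \Lambda \Lambda^{+})$, where $\Lambda^{+}$ is the Moore-Penrose inverse and $Z$ is an arbitrary $d_f \times d_z$ matrix. Every such $A$ satisfies $A \Lambda = I_{d_f}$, and any left inverse $A$ is recovered by taking $Z = A$. The sketch below checks this numerically. The loading matrix and the helper name `left_inverse` are hypothetical, and the parametrization is stated here as an illustration rather than as a verbatim restatement of the cited theorem.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical loading matrix with full column rank (illustrative values only).
Lam = np.array([[1.0, 0.0],
                [0.8, 0.0],
                [0.6, 0.2],
                [0.0, 1.0],
                [0.0, 0.7]])
d_z, d_f = Lam.shape

Lam_pinv = np.linalg.pinv(Lam)           # Moore-Penrose inverse, itself a left inverse
P_perp = np.eye(d_z) - Lam @ Lam_pinv    # projector onto the orthogonal complement of col(Lam)

def left_inverse(Z):
    """Return Lam^+ + Z (I - Lam Lam^+), a left inverse of Lam for any d_f x d_z matrix Z."""
    return Lam_pinv + Z @ P_perp

# Draw a few arbitrary Z matrices and confirm that each one yields A Lam = I.
checks = [
    np.allclose(left_inverse(rng.normal(size=(d_f, d_z))) @ Lam, np.eye(d_f))
    for _ in range(5)
]
```

The design choice is that $Z$ only acts on the part of the indicator space orthogonal to the columns of $\Lambda$, which is why it cannot disturb the identity $A \Lambda = I_{d_f}$.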

In applications, the transformation matrix A has to be estimated. This introduces estimation error, as discussed in the upcoming Sect. 3. Taking this estimation error into account is outside the scope of this paper.

Let us now consider the regression representation in Eq. (3). For a given $A \in \mathcal{G}(\Lambda)$, such as the Bartlett score $A = \Delta$, we write $r = r_A$ and

$$\ddot{f} = (\ddot{\xi}', \ddot{\eta}')' = A \tilde{z} = A (\Lambda f + \varepsilon) = f + r = (\xi', \eta')' + (r_\xi', r_\eta')' \tag{4}$$

where $r_\xi, r_\eta$ are, respectively, the first $d_\xi$ and last $d_\eta$ coordinates of $r$, and $\ddot{\xi}$, $\ddot{\eta}$ are, respectively, the first $d_\xi$ and last $d_\eta$ coordinates of $\ddot{f}$. From Eq. (4), we reach

$$\ddot{\xi} = \xi + r_\xi, \qquad \ddot{\eta} = \eta + r_\eta. \tag{5}$$

Since $A \in \mathcal{G}(\Lambda)$, we have that $r = A \varepsilon$. Now, since $r_\eta$ is a linear transformation of $r$, we get $\mathbb{E}[\ddot{\eta} | \xi] = \mathbb{E}[\eta | \xi]$ as long as $\varepsilon$ is independent of $f$. We therefore make the following assumption.

Assumption 2

Suppose $\varepsilon$ is independent of $f$.

In the classical literature on covariance models (see, e.g., the survey paper Shapiro, 2007), the strong Assumption 2 is not made. We need this assumption, and not merely the covariance Assumption 1 (1), to identify $H$. With only covariance restrictions, the distribution of $f, \varepsilon$ is not identified (see Mardia et al., 1979, Exercise 9.2.2). Kelava et al. (2017) also made this assumption to identify $H$.

Lemma 2

Suppose Assumptions 1 and 2 hold. For a given $A \in \mathcal{G}(\Lambda)$, we have that $H(x) = \mathbb{E}[\eta | \xi = x] = \mathbb{E}[\ddot{\eta} | \xi = x]$ for $\ddot{\eta}$ given by Eq. (4) and Eq. (5).

Proof

See Section E.4.2. $\square$
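Lemma 2 can be illustrated by simulation. The sketch below draws data from a hypothetical model with $H(x) = 0.5 x^2$, computes Bartlett scores with the population $\Lambda$ and $\Psi$ (so estimation error is ignored), and regresses the score $\ddot{\eta}$ on $(1, \xi, \xi^2)$ using the true $\xi$. All numerical values, the quadratic form of $H$ and the Gaussian distributions are assumptions made for this illustration only. Since $\xi$ is latent, this is a population-level check and not an estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical structural model: H(x) = 0.5 * x**2 (illustrative choice).
xi = rng.normal(size=n)
eta = 0.5 * xi**2 + rng.normal(scale=0.5, size=n)
f = np.column_stack([xi, eta])

# Hypothetical measurement model without cross-loadings; eps is drawn independently of f.
Lam = np.array([[1.0, 0.0],
                [0.8, 0.0],
                [0.6, 0.0],
                [0.0, 1.0],
                [0.0, 0.7],
                [0.0, 0.9]])
psi = np.array([0.5, 0.4, 0.6, 0.3, 0.5, 0.4])   # diagonal of Psi
eps = rng.normal(size=(n, 6)) * np.sqrt(psi)
z_tilde = f @ Lam.T + eps

# Bartlett scores based on the population Lam and Psi.
Psi_inv = np.diag(1.0 / psi)
Delta = np.linalg.solve(Lam.T @ Psi_inv @ Lam, Lam.T @ Psi_inv)
scores = z_tilde @ Delta.T
eta_dd = scores[:, 1]                             # Bartlett score for eta

# Least-squares fit of eta_dd on (1, xi, xi^2) using the TRUE latent xi.
X = np.column_stack([np.ones(n), xi, xi**2])
coef, *_ = np.linalg.lstsq(X, eta_dd, rcond=None)
```

The fitted coefficients should be close to $(0, 0, 0.5)$, reflecting that the added term $r_\eta$ has conditional mean zero given $\xi$ when $\varepsilon$ is centered and independent of $f$.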

Under the following assumption, $\ddot{\xi}$ and $\ddot{\eta}$ are computable based on identifiable quantities. Therefore, we may suppose that we observe $\ddot{\xi}$ and $\ddot{\eta}$ directly when analyzing identification of $H$.

Assumption 3

Suppose $\Lambda$, $\mu$, $\Psi$ are identified from the distribution of $z$.

The identification of these matrices is a classical problem, and the measurement model is usually only considered valid when they are identified up to scaling of the latent variables through the covariance matrix of $z$ (Anderson, 2003; Bollen, 1989; Mardia et al., 1979). However, conditional expectations are well-behaved with regard to changes of the units of measurement (see Appendix H in the online supplementary material).

We are interested in identifying $H(x)$ based on $\ddot{\xi}$ and $\ddot{\eta}$. By Lemma 2, Eq. (5) has the same structure as a non-parametric regression problem with measurement noise, see, e.g., Delaigle et al. (2009), Delaigle (2014), Apanasovich and Liang (2021), or Huang and Zhou (2017). This appears to have been noticed also by Wall and Amemiya (2000; see the discussion immediately following their Eq. (9)), but at the time that paper was written, no generally applicable non-parametric regression methods with measurement error were available. These approaches generally require the measurement error $r_\xi$ to have a known distribution and to satisfy independence conditions. Therefore, we make the following additional assumptions.

Assumption 4

Suppose

  1. (1) $r_\xi$ has a known distribution.

  2. (2) $r_\xi$ and $r_\eta$ are independent.

We will later consider approximating the measurement error $r_\xi$ by zero, meaning we ignore the measurement error, and will show that this approximation works as $d_x$ increases. In these arguments, we also use Assumption 4. When ignoring measurement errors, we conjecture that exact independence can be weakened to appropriate dependence bounds. We do not investigate this in the present paper.
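To give some intuition for why ignoring $r_\xi$ may become harmless as $d_x$ grows, consider the Bartlett score for a single exogenous factor without cross-loadings. Then $r_\xi = \Delta \varepsilon_x$ has variance $(\Lambda_x' \Psi_x^{-1} \Lambda_x)^{-1}$, which shrinks as indicators are added. The sketch below evaluates this variance for a hypothetical setting in which every indicator has loading 0.7 and unique variance 0.51; the function name `bartlett_error_variance` and all numbers are illustrative assumptions, and this is not the formal argument referred to above.

```python
import numpy as np

def bartlett_error_variance(loadings, unique_variances):
    """Variance of r_xi = Delta eps_x for a single factor: (Lam' Psi^{-1} Lam)^{-1}."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    psi_inv = np.diag(1.0 / np.asarray(unique_variances, dtype=float))
    return float(np.linalg.inv(lam.T @ psi_inv @ lam)[0, 0])

# Hypothetical setting: equal loadings 0.7 and equal unique variances 0.51.
d_x_values = [3, 6, 12, 24, 48]
variances = [bartlett_error_variance([0.7] * d, [0.51] * d) for d in d_x_values]

for d, v in zip(d_x_values, variances):
    print(f"d_x = {d:2d}: Var(r_xi) = {v:.4f}")
```

With equal loadings $\lambda$ and unique variances $\psi$, the variance equals $\psi / (\lambda^2 d_x)$, so it decays at rate $1/d_x$ in this special case.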

The following result is the starting point of the literature on non-parametric regression with measurement error, some of which is cited above. We state the result in our notation and provide its short proof for completeness.

Proposition 1

Suppose Assumptions 1, 2, 3, and 4 hold. Then $H$ is identified.

Proof

See Section E.4.3. $\square$

We now consider when Assumption 4 (2) can be justified. Assumption 4 (1) will be considered in the next subsection.

Since $(r_\xi', r_\eta')' = r = A \varepsilon$, we have $r_\eta = (\boldsymbol{0}_{d_\eta, d_\xi}, I_{d_\eta}) A \varepsilon$ and $r_\xi = (I_{d_\xi}, \boldsymbol{0}_{d_\xi, d_\eta}) A \varepsilon$, where $\boldsymbol{0}_{a,b}$ is the $a \times b$ zero matrix. Without strong distributional assumptions, $r_\xi$ and $r_\eta$ will be independent only if, firstly, $A$ is partitioned diagonal (thereby avoiding cross terms from $\varepsilon_x$ and $\varepsilon_y$) and, secondly, $\varepsilon_x$ is independent of $\varepsilon_y$. If this is the case, i.e., if

$$A = \begin{pmatrix} A_x & \boldsymbol{0}_{d_\xi, d_y} \\ \boldsymbol{0}_{d_\eta, d_x} & A_y \end{pmatrix},$$

then $r_\xi = A_x \varepsilon_x$ and $r_\eta = A_y \varepsilon_y$, and $r_\xi$ will be independent of $r_\eta$, as long as $\varepsilon_x$ and $\varepsilon_y$ are independent.

In general, we may write $\Lambda$ as a partitioned matrix

$$\Lambda = \begin{pmatrix} \Lambda_x & \Lambda_{x,y} \\ \Lambda_{y,x} & \Lambda_y \end{pmatrix}$$

where $\Lambda_x$ is a $d_x \times d_\xi$ matrix, $\Lambda_y$ is a $d_y \times d_\eta$ matrix, $\Lambda_{x,y}$ is a $d_x \times d_\eta$ matrix, and $\Lambda_{y,x}$ is a $d_y \times d_\xi$ matrix. When $A$ is a partitioned diagonal matrix with diagonal blocks $A_x, A_y$, we have

$$(\ddot{\xi}', \ddot{\eta}')' = A \tilde{z} = A (\Lambda f + \varepsilon) = \begin{pmatrix} A_x \Lambda_x \xi + A_x \Lambda_{x,y} \eta + A_x \varepsilon_x \\ A_y \Lambda_y \eta + A_y \Lambda_{y,x} \xi + A_y \varepsilon_y \end{pmatrix}. \tag{6}$$

The matrix $A_x$ will not in general be a generalized inverse of both $\Lambda_x$ and $\Lambda_{x,y}$. Therefore, the factor scores will contain residual dependency between $\xi$ and $\eta$, which distorts the identification of $H$ if the scores are used directly as input to non-parametric regression methods, since Assumption 4 (2) then no longer holds.
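The effect of a cross-loading in Eq. (6) can be made concrete with a small numerical sketch. Below, a partitioned diagonal $A$ is built from left inverses of the diagonal blocks $\Lambda_x$ and $\Lambda_y$ only, and the product $A \Lambda$ then gives the coefficients of $(\xi, \eta)$ in the factor scores. The loading values, the single cross-loading of 0.4, and the use of the Moore-Penrose inverse for $A_x$ and $A_y$ are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical blocks: 3 indicators per factor, eta cross-loads on the third x-indicator.
Lam_x = np.array([[1.0], [0.8], [0.6]])
Lam_y = np.array([[1.0], [0.7], [0.9]])
Lam_xy = np.array([[0.0], [0.0], [0.4]])   # cross-loading block (d_x x d_eta)
Lam_yx = np.zeros((3, 1))                  # no cross-loading of xi on the y-indicators

Lam = np.block([[Lam_x, Lam_xy],
                [Lam_yx, Lam_y]])

# Partitioned diagonal A built from left inverses of the diagonal blocks only.
A_x = np.linalg.pinv(Lam_x)
A_y = np.linalg.pinv(Lam_y)
A = np.block([[A_x, np.zeros((1, 3))],
              [np.zeros((1, 3)), A_y]])

# A @ Lam collects the coefficients of (xi, eta) appearing in Eq. (6).
coef = A @ Lam
leak = coef[0, 1]    # coefficient of eta in the score for xi, i.e. A_x @ Lam_xy
```

Here `coef` has ones on the diagonal but `leak` equals $0.6 \cdot 0.4 / 2 = 0.12$, so the score $\ddot{\xi}$ contains the term $0.12\,\eta$ in addition to $\xi + A_x \varepsilon_x$. Setting the cross-loading to zero removes this term.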

In order to fulfill Assumption 4 (2), we therefore do not allow cross-loadings or error correlations between the endogenous and exogenous parts of the model, that is, between $\xi$ and $\eta$. If such terms are part of the model, one would have to delete the corresponding observed variables in order to apply our analysis directly. A less wasteful method might be derived as an extension of this work, though such an extension is outside the scope of the present paper. Within the measurement part of the endogenous variables and within that of the exogenous variables, cross-loadings or error correlations are allowed. Hence, we make the following assumptions.

Assumption 5

Suppose $\varepsilon_x$ and $\varepsilon_y$ are independent, and that

$$\begin{aligned} \Psi = \begin{pmatrix} \Psi_x & \boldsymbol{0}_{d_x, d_y} \\ \boldsymbol{0}_{d_y, d_x} & \Psi_y \end{pmatrix}, \quad \Lambda = \begin{pmatrix} \Lambda_x & \boldsymbol{0}_{d_x, d_\eta} \\ \boldsymbol{0}_{d_y, d_\xi} & \Lambda_y \end{pmatrix}, \end{aligned}$$

where $\Lambda_x$ and $\Lambda_y$ have full column ranks.

Under Assumption 1 (4), $\Psi_x, \Psi_y$ are positive definite, as they are principal sub-matrices of a positive-definite matrix $\Psi$ (Horn & Johnson, 2013, Observation 7.1.2). Under Assumption 5, a direct calculation shows that if $A_x \in \mathcal{G}(\Lambda_x), A_y \in \mathcal{G}(\Lambda_y)$, then $A = \begin{pmatrix} A_x & \boldsymbol{0}_{d_\xi, d_y} \\ \boldsymbol{0}_{d_\eta, d_x} & A_y \end{pmatrix} \in \mathcal{G}(\Lambda)$.
While there are also elements in $\mathcal{G}(\Lambda)$ of different forms (Harville, 1997, Exercise 9.7), partitioned diagonal generalized inverses of $\Lambda$ imply that Assumption 4 (2) holds.

Lemma 3

Suppose Assumptions 1 and 5 hold. Suppose that $A = \begin{pmatrix} A_x & \boldsymbol{0}_{d_\xi, d_y} \\ \boldsymbol{0}_{d_\eta, d_x} & A_y \end{pmatrix}$ where $A_x \in \mathcal{G}(\Lambda_x)$ and $A_y \in \mathcal{G}(\Lambda_y)$. Then $r_\xi$ and $r_\eta$ are independent, i.e., Assumption 4 (2) holds.

Proof

See Section E.4.4. $\square$
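The block-diagonal construction used in Lemma 3 can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes that a generalized inverse $A$ of $\Lambda$ satisfies $\Lambda A \Lambda = \Lambda$, which for a full-column-rank $\Lambda$ is equivalent to $A \Lambda = I$. The dimensions, the random loadings, and the helper `left_inverse` are hypothetical choices made only for illustration.

```python
import numpy as np

def left_inverse(L, W):
    """Weighted least-squares left inverse (L' W L)^{-1} L' W of a full-column-rank L."""
    return np.linalg.solve(L.T @ W @ L, L.T @ W)

rng = np.random.default_rng(0)
d_x, d_y, d_xi, d_eta = 6, 4, 2, 1      # hypothetical dimensions

# Hypothetical loadings; Gaussian draws have full column rank almost surely.
Lam_x = rng.normal(size=(d_x, d_xi))
Lam_y = rng.normal(size=(d_y, d_eta))

# Lambda has the block-diagonal form required by Assumption 5.
Lam = np.block([[Lam_x, np.zeros((d_x, d_eta))],
                [np.zeros((d_y, d_xi)), Lam_y]])

# One particular choice of generalized inverses for each block.
A_x = left_inverse(Lam_x, np.eye(d_x))
A_y = left_inverse(Lam_y, np.eye(d_y))

# Partitioned diagonal A as in Lemma 3.
A = np.block([[A_x, np.zeros((d_xi, d_y))],
              [np.zeros((d_eta, d_x)), A_y]])

is_left_inverse = np.allclose(A @ Lam, np.eye(d_xi + d_eta))
is_g_inverse = np.allclose(Lam @ A @ Lam, Lam)
print(is_left_inverse, is_g_inverse)
```

Because the upper-right block of `A` is zero, the $\xi$-part of $A\varepsilon$ depends on $\varepsilon_x$ only, and the $\eta$-part on $\varepsilon_y$ only, which is the mechanism behind the independence statement of the lemma.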

Under Assumption 5, Bartlett (1937) factor scores are partitioned diagonal as shown in the following result, and hence using Bartlett scores under Assumption 5 leads to Assumption 4 (2).

Proposition 2

Suppose Assumptions 1 and 5 hold. Then,

$$\begin{aligned} \Delta = \begin{pmatrix} \Delta_x & \boldsymbol{0}_{d_\xi, d_y} \\ \boldsymbol{0}_{d_\eta, d_x} & \Delta_y \end{pmatrix}, \end{aligned}$$

with $\Delta_x := \left( \Lambda_x' \Psi_x^{-1} \Lambda_x \right)^{-1} \Lambda_x' \Psi_x^{-1}$ and $\Delta_y := \left( \Lambda_y' \Psi_y^{-1} \Lambda_y \right)^{-1} \Lambda_y' \Psi_y^{-1}$ both existing. Additionally,

$$\begin{aligned} \text{Cov} \, r = \begin{pmatrix} \left( \Lambda_x' \Psi_x^{-1} \Lambda_x \right)^{-1} & \boldsymbol{0}_{d_\xi, d_\eta} \\ \boldsymbol{0}_{d_\eta, d_\xi} & \left( \Lambda_y' \Psi_y^{-1} \Lambda_y \right)^{-1} \end{pmatrix}, \end{aligned}$$

which is positive definite, and whose diagonal partitions are positive-definite matrices.

Proof

See Section E.4.5. $\square$
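The two claims of Proposition 2 can be illustrated with a small numerical sketch. It is a hedged illustration rather than the authors' implementation: the dimensions, the random loadings, the helper names `bartlett_weights` and `random_spd`, and the reading of $r$ as $\Delta \varepsilon$ (so that $\text{Cov} \, r = \Delta \Psi \Delta'$, in line with Eq. (6)) are assumptions made here.

```python
import numpy as np

def bartlett_weights(Lam, Psi):
    """Return (Lam' Psi^{-1} Lam)^{-1} Lam' Psi^{-1} for a symmetric positive-definite Psi."""
    Psi_inv_Lam = np.linalg.solve(Psi, Lam)            # Psi^{-1} Lam
    return np.linalg.solve(Lam.T @ Psi_inv_Lam, Psi_inv_Lam.T)

def random_spd(d, rng):
    """Hypothetical helper: a well-conditioned symmetric positive-definite matrix."""
    B = rng.normal(size=(d, d))
    return B @ B.T + d * np.eye(d)

rng = np.random.default_rng(1)
d_x, d_y, d_xi, d_eta = 6, 4, 2, 1                     # hypothetical dimensions
Lam_x, Lam_y = rng.normal(size=(d_x, d_xi)), rng.normal(size=(d_y, d_eta))
Psi_x, Psi_y = random_spd(d_x, rng), random_spd(d_y, rng)

# Block structure of Assumption 5; error correlations within each block are allowed.
Lam = np.block([[Lam_x, np.zeros((d_x, d_eta))],
                [np.zeros((d_y, d_xi)), Lam_y]])
Psi = np.block([[Psi_x, np.zeros((d_x, d_y))],
                [np.zeros((d_y, d_x)), Psi_y]])

# Bartlett weights from the full model versus from each block separately.
Delta = bartlett_weights(Lam, Psi)
Delta_x = bartlett_weights(Lam_x, Psi_x)
Delta_y = bartlett_weights(Lam_y, Psi_y)
Delta_blocks = np.block([[Delta_x, np.zeros((d_xi, d_y))],
                         [np.zeros((d_eta, d_x)), Delta_y]])

# Covariance of r = Delta * epsilon versus the closed form of Proposition 2.
cov_r = Delta @ Psi @ Delta.T
cov_r_formula = np.linalg.inv(Lam.T @ np.linalg.solve(Psi, Lam))

print(np.allclose(Delta, Delta_blocks), np.allclose(cov_r, cov_r_formula))
```

Using `np.linalg.solve` instead of forming $\Psi^{-1}$ explicitly is a numerical design choice and does not change the quantity being computed.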

We now consider Assumption 4 (1). We examine two approximations for sufficiently large $d_x$: first, that $r_\xi$ is approximately zero, and second, that $r_\xi$ is approximately normal. Since $r_\xi$ will go to zero as $d_x$ increases under weak assumptions, asymptotic normality is closely connected to techniques that treat $r_\xi$ as zero, and a normality approximation can potentially improve approximations of $H$. This issue will be further discussed, though not resolved, at the end of Sect. 2.2.

2.1. Distributional Approximations of $r_\xi$ as $d_x$ Increases, Part 1: Approximating $r_\xi$ by a Constant Zero Vector

We here consider Assumption 4 (1). For fixed $d_x$, the distribution of $r_\xi$ is not identified, but under weak conditions, $r_\xi$ will converge to zero in mean square. This motivates approximating $r_\xi$ by a zero vector.

Mean square convergence of factor scores has been investigated by several previous authors, e.g., Guttman (1955), Schneeweiss and Mathes (1995), Krijnen (2004), Krijnen (2006b), Krijnen (2006a), or Williams (1978). To the best of our knowledge, previous papers either assume a particularly simple structure for the factor model, or use what may be termed abstract assumptions, such as conditions on the limiting behavior of a certain eigenvalue of a matrix, which are difficult to interpret. We therefore provide this conclusion based on alternative assumptions that have a more direct asymptotic interpretation as $d_x \rightarrow \infty$. We only consider the Bartlett (1937) factor scores.

Let $(M)_{\cdot, i}$ be the $i$'th column of a matrix $M$, and $(M)_{j,i}$ be the $ji$'th element of a matrix $M$. Also, $\lambda_{\text{max}}(M)$ and $\lambda_{\text{min}}(M)$ are, respectively, the largest and smallest eigenvalues of a matrix $M$.

Assumption 6

Suppose

  1. (1) for all $d_x$, there are numbers $m_{\Psi_x}, M_{\Psi_x} > 0$ such that $m_{\Psi_x} < \lambda_{\text{min}}(\Psi_x) \le \lambda_{\text{max}}(\Psi_x) < M_{\Psi_x}$.

  2. (2) for all $d_x$, there are numbers $m_{\Lambda_x}, M_{\Lambda_x} > 0$ such that for all indices $1 \le i \le d_\xi$ and all $1 \le k$ where $(\Lambda_x)_{k,i} \ne 0$, we have $m_{\Lambda_x} < |(\Lambda_x)_{k,i}| < M_{\Lambda_x}$.

  3. (3) with $N_i$ denoting the number of non-zero elements in $(\Lambda_x)_{\cdot, i}$, we have $\lim_{d_x \rightarrow \infty} N_i = \infty$ for $1 \le i \le d_\xi$.

  4. (4) with $C_{i,j}$, for $1 \le i,j \le d_\xi$ with $i \ne j$, denoting the number of non-zero elements in $(|(\Lambda_x)_{k,i} (\Lambda_x)_{l,j} (\Psi_x^{-1})_{k,l}|)_{1 \le k,l \le d_x}$, we have $\lim_{d_x \rightarrow \infty} \frac{1}{N_i} \sum_{1 \le j \le d_\xi, j \ne i} C_{i,j} = 0$.

Assumption 6 (1) extends Assumption 1 (4) to the asymptotic case and can be interpreted using the classical result that for a vector $x$ with $\Vert x \Vert = 1$, we have $\lambda_{\text{min}}(\Psi_x) \le x' \Psi_x x \le \lambda_{\text{max}}(\Psi_x)$. Assumption 6 (1), therefore, dictates that no linear combination with a unit squared coordinate sum has a variance that diverges or converges to zero. This assumption requires that the variances of $\varepsilon_x$ are within a bounded interval and bounded away from zero. It further places restrictions on the correlations between the elements in $\varepsilon_x$.
For a familiar example, let $x = (1, \ldots, 1)'/\sqrt{d_x}$, which is such that $\sum_{i=1}^{d_x} x_i^2 = 1$, giving $x' \varepsilon_x = \sqrt{d_x} \bar{\varepsilon}_x$ whose variance can neither diverge nor converge to zero if, for example, the effect of the central limit theorem for $\bar{\varepsilon}_x$ is to occur. Assumption 6 (2) says that the loadings of the measurement of each coordinate of $\xi$ (i.e., those that are non-zero) must neither vanish nor explode.
Assumption 6 (3) says that the number of measurements of each coordinate of $\xi$ is continually increasing, thereby giving more and more information on $\xi$. Assumption 6 (4) places restrictions on the increase of the number of cross-loadings and cross correlations in relation to the number of direct loadings.
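To see how Assumption 6 (1) restricts error correlations, consider an error covariance that is not taken from the paper: the equicorrelated matrix $\Psi_x = (1-\rho) I + \rho \boldsymbol{1}\boldsymbol{1}'$ with a fixed, hypothetical $\rho > 0$. Its largest eigenvalue is $1 + (d_x - 1)\rho$, attained at $x = (1, \ldots, 1)'/\sqrt{d_x}$, so no finite upper bound $M_{\Psi_x}$ can work for all $d_x$. The sketch below checks this together with the Rayleigh-quotient bounds quoted above.

```python
import numpy as np

rho = 0.3                                  # hypothetical common error correlation
results = {}
for d_x in (5, 50, 500):
    # Equicorrelated error covariance with unit variances (illustrative assumption).
    Psi_x = (1 - rho) * np.eye(d_x) + rho * np.ones((d_x, d_x))
    x = np.ones(d_x) / np.sqrt(d_x)        # unit vector, so x' eps = sqrt(d_x) * mean(eps)
    eig = np.linalg.eigvalsh(Psi_x)        # eigenvalues in ascending order
    results[d_x] = (eig[0], x @ Psi_x @ x, eig[-1])

for d_x, (lam_min, quad_form, lam_max) in results.items():
    print(d_x, round(lam_min, 6), round(quad_form, 6), round(lam_max, 6))
```

In this example the variance of $\sqrt{d_x} \bar{\varepsilon}_x$ grows linearly in $d_x$, which is the kind of behavior that Assumption 6 (1) excludes.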

Proposition 3

Suppose Assumptions 1, 2, and 5 hold, and let $A = \Delta$.

  1. (1) Suppose Assumption 6 (1) and (2) hold and let $N_i$ and $C_{i,j}$ be defined as in Assumption 6 (3) and (4), respectively, then

    $$\begin{aligned} \max_{1 \le i,j \le d_\xi} |(\text{Cov} \, r_\xi)_{i,j}| \le \left[ \min_{1 \le i \le d_\xi} N_i \left( \frac{m_{\Lambda_x}^2}{M_{\Psi_x}} - \frac{M_{\Lambda_x}^2}{m_{\Psi_x}} \frac{1}{N_i} \sum_{1 \le j \le d_\xi, j \ne i} C_{i,j} \right) \right]^{-1}. \end{aligned}$$
  2. (2) Suppose Assumption 6 holds, then $\lim_{d_x \rightarrow \infty} \max_{1 \le i,j \le d_\xi} (\text{Cov} \, r_\xi)_{i,j} = 0$.

Proof

See Section E.4.6. $\square$
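The bound in Proposition 3 (1) can be evaluated for a concrete design. The sketch below is illustrative only. It assumes a simple-structure model with two factors, no cross-loadings, and uncorrelated errors, so that every $C_{i,j}$ is zero. The loading range, the error-variance range, the constants `m_lam`, `M_lam`, `m_psi`, `M_psi`, and the function names are hypothetical, and $\text{Cov} \, r_\xi$ is taken to be $(\Lambda_x' \Psi_x^{-1} \Lambda_x)^{-1}$ as in Proposition 2.

```python
import numpy as np

def cov_r_xi(Lam_x, Psi_x):
    """Covariance of the Bartlett residual r_xi, using the closed form of Proposition 2."""
    return np.linalg.inv(Lam_x.T @ np.linalg.solve(Psi_x, Lam_x))

def prop3_bound(Lam_x, Psi_x, m_lam, M_lam, m_psi, M_psi, tol=1e-12):
    """Right-hand side of Proposition 3 (1); informative only if the bracket is positive."""
    Psi_inv = np.linalg.inv(Psi_x)
    d_xi = Lam_x.shape[1]
    terms = []
    for i in range(d_xi):
        N_i = np.sum(np.abs(Lam_x[:, i]) > tol)        # non-zero loadings on factor i
        C_i = sum(np.sum(np.abs(np.outer(Lam_x[:, i], Lam_x[:, j]) * Psi_inv) > tol)
                  for j in range(d_xi) if j != i)      # sum over j != i of C_{i,j}
        terms.append(N_i * (m_lam**2 / M_psi - (M_lam**2 / m_psi) * C_i / N_i))
    return 1.0 / min(terms)

def simple_structure(n_per_factor, d_xi, rng):
    """Hypothetical design: each indicator loads on exactly one factor."""
    Lam = np.zeros((n_per_factor * d_xi, d_xi))
    for i in range(d_xi):
        rows = slice(i * n_per_factor, (i + 1) * n_per_factor)
        Lam[rows, i] = rng.uniform(0.5, 1.0, size=n_per_factor)
    return Lam

rng = np.random.default_rng(2)
m_lam, M_lam, m_psi, M_psi = 0.4, 1.1, 0.4, 1.6        # strict bounds for the draws below
summary = []
for n_per_factor in (5, 50, 500):
    Lam_x = simple_structure(n_per_factor, 2, rng)
    Psi_x = np.diag(rng.uniform(0.5, 1.5, size=Lam_x.shape[0]))
    actual = np.max(np.abs(cov_r_xi(Lam_x, Psi_x)))
    bound = prop3_bound(Lam_x, Psi_x, m_lam, M_lam, m_psi, M_psi)
    summary.append((n_per_factor, actual, bound))
    print(n_per_factor, actual, bound)
```

With these constants the bound reduces to $M_{\Psi_x} / (m_{\Lambda_x}^2 \min_i N_i) = 10 / \min_i N_i$, so both the bound and the largest entry of $\text{Cov} \, r_\xi$ shrink at rate $1/N_i$ in this design.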

We now consider convergence of the conditional expectation of the population Bartlett factor score $\ddot{\eta}$ given the population Bartlett factor score $\ddot{\xi}$. When inputting samples of these into non-parametric regression methods, the methods consistently estimate $H_{d_x}(x) = \mathbb{E}[\ddot{\eta} | \ddot{\xi} = x]$, which will not equal $H(x) = \mathbb{E}[\eta | \xi = x]$ for fixed $d_x$.
We here show that $H_{d_x}$ converges to $H$ uniformly over an appropriately chosen subset. The implication of this is that non-parametric estimators based on population Bartlett factor scores will converge to $H$ over the chosen set as the number of measurements $d_x$ increases.
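The gap between $H_{d_x}$ and $H$ for fixed $d_x$ can be made explicit in a special case that is not taken from the paper and is used here only as a hedged illustration. Assume a single exogenous factor $\xi \sim N(0, \phi)$, a linear $H(x) = \beta x$, a normal residual $r_\xi \sim N(0, \sigma_r^2)$ independent of $\xi$, and a mean-zero $r_\eta$ independent of $(\xi, r_\xi)$. Then the standard normal-theory regression formula gives $H_{d_x}(x) = \beta \frac{\phi}{\phi + \sigma_r^2} x$. With equal loadings $\lambda$ and equal, uncorrelated error variances $\psi$, Proposition 2 gives $\sigma_r^2 = \psi / (d_x \lambda^2)$. All parameter values and function names below are hypothetical.

```python
import numpy as np

def attenuation(n_items, loading=0.7, error_var=0.5, phi=1.0):
    """Factor phi / (phi + Var(r_xi)) under the illustrative assumptions stated above."""
    var_r = error_var / (n_items * loading**2)   # (Lam' Psi^{-1} Lam)^{-1} for one factor
    return phi / (phi + var_r)

def sup_gap(n_items, beta=1.0, radius=2.0, **kwargs):
    """sup over |x| <= radius of |H_{d_x}(x) - H(x)| in the linear-normal special case."""
    return abs(beta) * (1.0 - attenuation(n_items, **kwargs)) * radius

item_counts = [4, 16, 64, 256, 1024]
factors = [attenuation(n) for n in item_counts]
gaps = [sup_gap(n) for n in item_counts]
for n, a, g in zip(item_counts, factors, gaps):
    print(n, round(a, 4), round(g, 4))
```

In this special case the uniform gap over a bounded set shrinks at rate $1/d_x$, which is consistent with, but much more restrictive than, the general convergence statement discussed above.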

The conditional expectation of a vector is defined coordinate-wise, so that, e.g., $\mathbb{E}[\eta | \xi] = (\mathbb{E}[\eta_1 | \xi], \ldots, \mathbb{E}[\eta_{d_\eta} | \xi])'$. We may therefore assume without loss of generality that $d_\eta = 1$, since all norms on $\mathbb{R}^{d_\eta}$ are equivalent and $d_\eta$ is fixed.

We have not managed to find a result that implies the appropriate convergence of these conditional expectations and have therefore produced the following result. It seems plausible that relevant, and possibly stronger, results could be available in the technical probabilistic literature. As $d_x$ increases, $\ddot{\xi}$ will under natural conditions be close enough to $\xi$ for $\mathbb{E}[\ddot{\eta} | \ddot{\xi} = x]$ and $H(x) = \mathbb{E}[\eta | \xi = x]$ to be very close, likely also under much weaker conditions than we identify in the upcoming result, which is based on classical approximations.

Our result requires $f$ to have a density and imposes several regularity conditions on the density of $\xi$, as well as some boundedness and smoothness conditions on $H$. Additionally, it requires that $r_\xi$ converges to zero in probability, which is implied by Proposition 3 and Markov's inequality.

Let $\Vert a \Vert_2$ be the Euclidean norm of a vector $a$.

Assumption 7

Suppose

  (1) $d_\eta = 1$, and $f = (\xi', \eta')'$ and $r_\xi$ have densities with respect to Lebesgue measure given by $f_{\xi, \eta}$ and $f_{r_\xi}$, respectively.

  (2) $\sup_{x \in \mathbb{R}^{d_\xi}} f_\xi(x) < \infty$, where $f_\xi$ is the marginal density of $\xi$.

  (3) there is a set $\mathcal{S} \subseteq \mathbb{R}^{d_\xi}$ such that for $\mathcal{S}^\rho = \{ x + \alpha (x - v) : x \in \mathcal{S}, v \in \mathbb{R}^{d_\xi}, \Vert v \Vert_2 < \rho, \alpha \in [0,1] \}$ for some $\rho > 0$ we have that

    (a) $\sup_{x \in \mathcal{S}^\rho} | \mathbb{E}\, \omega(x, r_\xi) | \rightarrow 0$ as $d_x \rightarrow \infty$, where $\omega(x, h) = H(x - h) - H(x)$,

    (b) $\sup_{x \in \mathcal{S}^\rho} |H(x)| < \infty$,

    (c) $\inf_{x \in \mathcal{S}^\rho} f_\xi(x) > 0$,

    (d) $f_\xi$ is continuously differentiable in $\mathcal{S}^\rho$, and $\sup_{x \in \mathcal{S}^\rho} \Vert f_\xi'(x) \Vert_2 < \infty$.

  (4) $r_\xi$ converges in probability to zero as $d_x$ increases.

Assumptions 7 (1) and (2) posit the existence of the required densities. Assumption 7 (3) is the most complex assumption and is given in terms of a kind of modulus of continuity of $H$. Verifying this assumption for a specific class of $H$ functions requires taking the structure of this class into account. The assumption itself can be justified as a kind of smoothness assumption on $H$. To illustrate that the assumption is reasonable, we verify it for the class of univariate polynomials in Appendix E.2 in the online supplementary material. Assumption 7 (4) is implied, for example, by Proposition 3. Assumption 7 allows us to prove the following proposition on the convergence of $H_{d_x}$ to $H$ for increasing $d_x$.
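As a small illustration of Assumption 7 (3)(a), and not a substitute for the general polynomial argument in Appendix E.2, consider the hypothetical choice $d_\xi = 1$ and $H(x) = x^2$, and assume that $r_\xi$ has mean zero and finite variance (as is the case when $r_\xi$ is a linear combination of mean-zero errors with finite variances). Then

$$\omega(x, h) = (x - h)^2 - x^2 = -2xh + h^2, \qquad \mathbb{E}\, \omega(x, r_\xi) = -2x\, \mathbb{E}[r_\xi] + \mathbb{E}[r_\xi^2] = \mathrm{Var}(r_\xi).$$

In this special case $\sup_{x} | \mathbb{E}\, \omega(x, r_\xi) | = \mathrm{Var}(r_\xi)$ does not depend on $x$, so the condition holds on any set whenever $r_\xi$ converges to zero in mean square.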

Proposition 4

Suppose Assumptions 1, 2, 4, and 7 hold. Let $H_{d_x}(x) = \mathbb{E}[\ddot{\eta} | \ddot{\xi} = x]$ and $H(x) = \mathbb{E}[\eta | \xi = x]$. Let $|\cdot|$ be any norm on the relevant Euclidean space. Then $\sup_{x \in \mathcal{S}^\rho} | H_{d_x}(x) - H(x) | \rightarrow 0$ as $d_x \rightarrow \infty$.

Proof

See Section E.4.7. $\square$
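To make the statement of Proposition 4 concrete, the following sketch evaluates the uniform distance in a toy case where everything is available in closed form. All modelling choices are assumptions made for illustration only and are not taken from the paper: $d_\xi = 1$, $\xi \sim N(0,1)$, $H(x) = x^2$, $\ddot{\xi} = \xi + r_\xi$ with $r_\xi \sim N(0, \sigma^2)$ independent of $\xi$, $\sigma^2 = 1/(d_x m)$ for a hypothetical constant `m`, and $\mathbb{E}[\ddot{\eta} | \ddot{\xi}] = \mathbb{E}[H(\xi) | \ddot{\xi}]$. Under these assumptions, standard Gaussian conditioning gives $H_{d_x}(x) = \kappa^2 x^2 + \kappa \sigma^2$ with $\kappa = 1/(1 + \sigma^2)$.

```python
def h_true(x):
    """Hypothetical structural function H(x) = x^2."""
    return x * x


def h_dx(x, sigma2):
    """E[xi^2 | xi + r = x] for xi ~ N(0, 1), r ~ N(0, sigma2), independent."""
    kappa = 1.0 / (1.0 + sigma2)  # regression coefficient of xi on xi + r
    # Conditional mean is kappa * x, conditional variance is kappa * sigma2.
    return (kappa * x) ** 2 + kappa * sigma2


def sup_distance(d_x, m=1.0, half_width=2.0, grid_points=401):
    """Approximate sup over [-half_width, half_width] of |H_dx - H| on a grid."""
    sigma2 = 1.0 / (d_x * m)  # assumed error variance of the factor score
    step = 2.0 * half_width / (grid_points - 1)
    grid = [-half_width + k * step for k in range(grid_points)]
    return max(abs(h_dx(x, sigma2) - h_true(x)) for x in grid)


distances = {d_x: sup_distance(d_x) for d_x in (5, 20, 100, 1000, 10000)}
for d_x, dist in distances.items():
    print(f"d_x = {d_x:6d}   sup |H_dx - H| = {dist:.5f}")
```

In this toy case the printed distances shrink as `d_x` grows. The supremum is taken over a bounded interval, which mirrors the role of $\mathcal{S}^\rho$ in the proposition: the difference $H_{d_x}(x) - H(x)$ grows with $x^2$, so the convergence is not uniform over the whole real line.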

Remark 1

Let us revisit the Thurstone transformation $T$. From Lemma 1, $T \notin \mathcal{G}(\Lambda)$. However, we have, say, $\breve{f} := (\breve{\xi}', \breve{\eta}')' := T \tilde{z} = (T \Lambda) f + T \varepsilon$. From Assumptions 1 and 2 we have that $T \varepsilon$ is still mean zero and independent of $f$. Therefore, $\mathbb{E}[\breve{\eta} | \breve{\xi}] = T_y \Lambda_y \mathbb{E}[\eta | \breve{\xi}]$, where $T_y$ is defined analogously to $\Delta_x$. Since the Thurstone factor scores converge in probability (and mean square) toward the true latent variables under weak assumptions (see, e.g., Krijnen, 2006a, 2006b), we get that $T \Lambda \rightarrow I$ as $d_y$ increases, and $\breve{\xi} \rightarrow \xi$ as $d_x$ increases. We see that the additional term $T_y \Lambda_y$ is due to $T$ not fulfilling the regression equation Eq. (3) with an uncorrelated error term. Therefore, consistency requires $d_y \rightarrow \infty$, in contrast to the present analysis, which requires only $d_x \rightarrow \infty$.

2.1. Distributional Approximations of $r_\xi$ as $d_x$ Increases, Part 2: Approximating $r_\xi$ by a Normal

We now consider approximating $r_\xi$ by a normal distribution, using a central limit theorem. In this section, we make strong assumptions to simplify the normality argument. Under approximate normality, de-convolution methods can be used that take into account the distribution of the noise in $\ddot{\xi}$ as an approximation to $\xi$. The strong assumptions of the present section are not needed for our justification of direct non-parametric estimates of $\mathbb{E}[\ddot{\eta} | \ddot{\xi} = x]$ as an approximation to $\mathbb{E}[\ddot{\eta} | \xi = x]$ as considered in the previous subsection, but they are needed in our justification of non-parametric de-convolution based methods, one of which we will consider in the main simulation study of the paper (the HZCV method).

For a partitioned diagonal $A \in \mathcal{G}(\Lambda)$, we have that $r_\xi = A_x \varepsilon_x$. For sufficiently large $d_x$, we can expect central limit effects to justify the approximation $r_\xi \overset{a}{\sim} N(0, A_x \Psi A_x')$. As $d_x$ increases, $r_\xi$ will typically converge to zero in mean square, so that $A_x \Psi A_x'$ will tend to zero. Therefore, a re-scaling is required to prove a formal limiting result, as is the case for standard averages.

Let $A_{i, \cdot}$ be the $i$'th row of $A$. Let $r_i$ be the $i$'th coordinate of $r_\xi$. We have $r_i = A_{i, \cdot} \varepsilon_x = \sum_{j=1}^{d_x} a_{i,j} \varepsilon_{x,j}$. When $\varepsilon_x$ has independent components, the normality of $r_i$ can be analyzed via the Lindeberg–Feller or Lyapunov central limit theorems (see, e.g., Billingsley, 1995, Section 27). In order to do this, detailed assumptions have to be made on the entries of $A$. To get concrete and simple assumptions, we provide a verification of the details of this argument only for the Bartlett factor score when the measurement model of $\xi$ has the following simplified structure. The following results can be generalized in many directions; the approximate normality of $r_\xi$ also holds well outside these conditions, and will hold in most cases of practical interest.
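For reference, the Lyapunov condition for the weighted sum $r_i = \sum_{j=1}^{d_x} a_{i,j} \varepsilon_{x,j}$ can be written out as follows. This is the textbook condition specialized to the present notation, stated under the assumptions that the $\varepsilon_{x,j}$ are independent with mean zero and variances $\psi_{jj}$; the symbol $s_i^2$ is introduced here only for illustration. With $s_i^2 = \sum_{j=1}^{d_x} a_{i,j}^2 \psi_{jj}$, if for some $\delta > 0$

$$\frac{1}{s_i^{2+\delta}} \sum_{j=1}^{d_x} |a_{i,j}|^{2+\delta}\, \mathbb{E} |\varepsilon_{x,j}|^{2+\delta} \rightarrow 0 \quad \text{as } d_x \rightarrow \infty,$$

then $r_i / s_i$ converges in distribution to $N(0,1)$. The condition can fail when a few weights $a_{i,j}$ dominate the sum, which is why assumptions on the entries of $A$ are needed.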

Assumption 8

Suppose

  (1) $\varepsilon_x$ has independent components and $\Psi_x$ is a diagonal matrix.

  (2) $\Lambda_x$ has only one non-zero element per row. Without loss of generality, we further assume that the coordinates of $\tilde{x}$ are re-arranged in such a way that $\Lambda_x$ is partitioned diagonal.

Let $\mathcal{I}_j$ be the set of coordinates of $x$ that measure the $j$'th coordinate of $\xi$. Under Assumption 8, $|\mathcal{I}_j|$ is the number of non-zero rows in the $j$'th column of $\Lambda_x$, and $(\mathcal{I}_j)_j$ is a sequence of pairwise disjoint sets. In the following result, recall that the upper-left block of $\Lambda$ equals $\Lambda_x$, and likewise the upper-left block of $\Psi$ equals $\Psi_x$.

Lemma 4

Suppose Assumptions 1, 2, and 8 hold. Then

  (1) $\Delta_x = \left( \frac{\lambda_{ji}}{\psi_{jj} \sum_{k=1}^{d_x} \frac{\lambda_{ki}^2}{\psi_{kk}}} \right)_{i = 1, \dots, d_\xi;\ j = 1, \dots, d_x}$.

  (2) The $j$'th coordinate of $r_\xi$ fulfills $r_j = \sum_{i \in \mathcal{I}_j} \frac{\lambda_{ij}}{\psi_{ii} \sum_{k=1}^{d_x} \frac{\lambda_{kj}^2}{\psi_{kk}}} \varepsilon_i$, for $j = 1, \dots, d_\xi$.

  (3) We have that $\text{Cov}\, r_\xi$ is the diagonal matrix with elements $d_{ii} := \left( \frac{1}{\sum_{k=1}^{d_x} \frac{\lambda_{ki}^2}{\psi_{kk}}} \right)$ for $i = 1, \dots, d_\xi$.

Proof

See Section E.4.8. $\square$
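The formulas in Lemma 4 are easy to check numerically for a single factor. The sketch below uses made-up loadings and error variances (the numbers and the helper name `bartlett_weights` are assumptions for illustration), computes the weights from Lemma 4 (1), and verifies that they reproduce the factor exactly in the absence of noise and yield the variance in Lemma 4 (3).

```python
def bartlett_weights(loadings, error_vars):
    """Weights lambda_j / (psi_jj * sum_k lambda_k^2 / psi_kk) for one factor."""
    precision = sum(lam * lam / psi for lam, psi in zip(loadings, error_vars))
    return [lam / (psi * precision) for lam, psi in zip(loadings, error_vars)]


# Hypothetical one-factor measurement model with four indicators.
loadings = [1.0, 0.8, 0.6, 0.9]      # lambda_{j1}
error_vars = [0.5, 0.4, 0.7, 0.3]    # psi_{jj}

weights = bartlett_weights(loadings, error_vars)

# The weights applied to the loadings sum to 1, so the score equals xi + r.
weight_times_loading = sum(w * lam for w, lam in zip(weights, loadings))

# Var(r) = sum_j w_j^2 psi_jj when the errors are uncorrelated.
var_r = sum(w * w * psi for w, psi in zip(weights, error_vars))

# Closed form from Lemma 4 (3): d_11 = 1 / sum_k lambda_k^2 / psi_kk.
d_11 = 1.0 / sum(lam * lam / psi for lam, psi in zip(loadings, error_vars))

print(weight_times_loading, var_r, d_11)
```

Up to floating-point error, the first printed value is 1 and the last two agree, matching parts (1) and (3) of the lemma for this example.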

The following assumptions provide enough regularity to use the Lyapunov central limit theorem.

Assumption 9

Suppose

  (1) for a $\delta > 0$ we have

    $$\sup_{j \ge 1} \mathbb{E} \left| \frac{\varepsilon_{x,j}}{\sqrt{\psi_{jj}}} \right|^{2 + \delta} < \infty.$$

  (2) there are finite numbers $0 < m_{\lambda/\psi} \le M_{\lambda/\psi} < \infty$ such that $\left( \frac{\lambda_{ji}^2}{\psi_{jj}} \right)_{1 \le i \le d_\xi, 1 \le j \le d_x} \subset [m_{\lambda/\psi}, M_{\lambda/\psi}]$.

  (3) as $d_x \rightarrow \infty$, $|\mathcal{I}_j| \rightarrow \infty$, for $j = 1, \dots, d_\xi$.

Assumption 9 (2) places restrictions on the asymptotic behavior of the coefficients in front of $\varepsilon_i$ in the expression for $r_j$ in Lemma 4. Assumption 9 (3) means that we get more and more measurements for all coordinates of $\xi$.

Notice that under the simplified variance expression in Lemma 4 (3), conditions for mean square convergence of $r_\xi$ are implied by Assumption 9 (2), as the variance converges to zero as $d_x$ increases since the sum in the expression is greater than $d_x m_{\lambda/\psi}$.

We now formalize the aforementioned central limit theorem-based approximation.

Proposition 5

Under Assumptions 1, 2, 8, and 9, we have

$$c_{d_x}' r_{\xi} \xrightarrow[d_x \rightarrow \infty]{d} N(0, I),$$

where $c_{d_x}' = (\sqrt{n_{d_x}(1)}, \ldots, \sqrt{n_{d_x}(d_\xi)})$ in which $n_{d_x}(i) = \sum_{j=1}^{d_x} \frac{\lambda_{ji}^2}{\psi_{jj}}$ for $i = 1, 2, \ldots, d_\xi$.

Proof

See Section E.4.9. $\square$

This result does not have implications for identification, as Proposition 5 also implies that $r_\xi$ converges to zero in probability. It may be, however, that using the approximate normality of $r_\xi$ improves approximations of H based on the distribution of the population factor scores. While we have been unable to prove this, the topic is discussed in more technical detail in Appendix E.3 in the online supplementary material.
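The scaling in Proposition 5 can be checked numerically in the simplest case. The following is a minimal sketch, assuming a single factor, equal loadings of 1 and residual variances of .5625 (hypothetical values chosen for illustration), and standardized Gamma(1, 1) errors. It uses the standard single-factor Bartlett residual $r = (\sum_j \lambda_j \varepsilon_j / \psi_{jj}) / n_{d_x}$ and reports the variance and skewness of $\sqrt{n_{d_x}} \, r$; the function names are illustrative and not taken from the paper's code.

```python
import math
import random
import statistics


def bartlett_residual(loadings, psi, rng):
    """Draw one single-factor Bartlett residual and return it with n_dx."""
    # Standardized Gamma(1, 1) errors: Exp(1) has mean 1 and variance 1.
    eps = [math.sqrt(p) * (rng.expovariate(1.0) - 1.0) for p in psi]
    n_dx = sum(l * l / p for l, p in zip(loadings, psi))
    num = sum(l * e / p for l, e, p in zip(loadings, eps, psi))
    return num / n_dx, n_dx


def scaled_residuals(d_x, reps=10000, seed=1):
    """Simulate sqrt(n_dx) * r, which should have variance close to 1."""
    rng = random.Random(seed)
    loadings = [1.0] * d_x   # hypothetical loadings
    psi = [0.5625] * d_x     # hypothetical residual variances
    out = []
    for _ in range(reps):
        r, n_dx = bartlett_residual(loadings, psi, rng)
        out.append(math.sqrt(n_dx) * r)
    return out


def skewness(xs):
    """Sample skewness based on population moments."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / s) ** 3 for x in xs])


for d_x in (3, 9, 36):
    z = scaled_residuals(d_x)
    print(d_x, round(statistics.pvariance(z), 3), round(skewness(z), 3))
```

In this setup the variance of the scaled residual stays near one for every $d_x$, while the skewness shrinks as $d_x$ grows, which is the sense in which the normal approximation improves with more measurements.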

3. Empirical Estimation Strategies

Section 2 treats the foundational topic of identification. We now consider empirical estimates obtained by a plug-in procedure in which $\Delta z$ is replaced by $\hat{\Delta}\hat{z}$, where $\hat{z} = z - \hat{\mu}$ and all unknown parameters are replaced with parameter estimates obtained by treating the measurement model in Eq. (2) as a confirmatory factor analysis (CFA) model. This is a linear transformation of $(z', 1)'$ in which the estimation error of the standard CFA estimators is of order $O_P(n^{-1/2})$, where n is the sample size (see, e.g., Satorra, 1989). Therefore, the empirical Bartlett factor score has the same structure as a residual in standard regression problems.
The mathematics behind a full asymptotic analysis of non-parametric regression methods where the covariates have measurement error with a known distribution is highly technical already with independent and identically distributed data (see Delaigle et al., 2009, Section 1). In our case, we are inputting empirical factor scores, which as mentioned above have the same mathematical structure as regression residuals. Therefore, taking the estimation error of approximating $\Delta$ with $\hat{\Delta}$ properly into account is similar to using residuals in statistical methods, which can be mathematically complex (see, e.g., Grønneberg & Holcblat, 2019, and references therein). We consider an analysis of this problem outside the scope of the present paper.

In addition to the uncertainty in the estimation of $\hat{\Delta}$, the choice of the non-parametric regression method applied to the computed factor scores will influence the overall performance in approximating H in small samples. As there are many possible methods, we restrict attention to the most widely used methods and consider a more detailed examination outside the scope of the current article. In the following, we discuss how the properties of the Bartlett factor scores derived in the previous section bear on the choice of method for estimating H.

The theoretical results of the previous sections imply that the residual r of the Bartlett factor score is close to normal (see Sect. 2.2) for sufficiently large $d_z$ and converges to zero in probability (see Sect. 2.1 and specifically Proposition 3). Further, the conditional expectation of the underlying latent variables is identifiable using the Bartlett factor scores (see Proposition 1, when independence assumptions among $r_\xi$ and $r_\eta$ hold and the distribution of $r_\xi$ is known).
Most of these results rely on convergence in n (the sample size), in the number of measurements of $\xi$ ($d_x$), or in both. In the next section, we use simulation to study the finite sample properties of several non-parametric regression methods where we use the Bartlett factor scores as inputs. We compare the performance of these methods using the Bartlett factor scores with the performance of three methods using the nonlinear factor scores proposed by Kelava et al. (2017) as inputs.

For a finite sample, both n and $d_z$ are finite and $d_x < d_z \ll n$. Therefore, the Bartlett score $\ddot{f} = f + r$ is expected to have non-negligible residual variance $\text{Var}[r] > 0$.
In this scenario, the usage of the Bartlett score is closely related to non-parametric regression estimation with measurement error (Delaigle et al., 2009; Delaigle, 2014; Huang and Zhou, 2017), where the independent variable (here $\ddot{\xi}$) is allowed to have a residual (here $r_\xi$). Such methods require additional assumptions. Similarly to the arguments underlying Proposition 1, the distribution of $r_\xi$ is required to be known. However, from Proposition 5 we have that the distribution of $r_\xi$ is only approximately known as it is asymptotically normal.
Unfortunately, no methods are currently available that enable an examination of the sample distribution of the measurement errors and $r_\xi$ (see Appendix I in the online supplementary material for a discussion). We are therefore interested in the performance of such a method using an approximate distribution for $r_\xi$, and focus on an adaptation of the local polynomial estimator of Delaigle et al. (2009; the DFC-estimator) proposed by Huang and Zhou (2017): the HZ-estimator (HZ for local linear estimators for solving errors-in-variables problems; see Appendix D.3 in the online supplementary material for more details). The HZ-estimator is less biased and more computationally stable than the originally proposed DFC-estimator, as also suggested by some of our preliminary analyses.

Since the variance of $r_\xi$ decreases as the number of measurements in the exogenous part of the model increases, measurement error in the factor scores can be ignored when this number is sufficiently large. There is a variety of methods that could be used to estimate trends within data non-parametrically that do not take measurement error into account. All results in the next sections could be influenced by these choices. We employed two commonly used methods to estimate non-parametric trends, namely the locally estimated scatter plot smoothing (LOESS) originating from its weighted version (LOWESS, Cleveland et al., 1992) proposed by Cleveland (1979, 1981) and a cubic smoothing spline function (Chambers and Hastie, 1992). Both methods were also used to model the nonlinear factor scores of Kelava et al. (2017), complemented by their implementation of a specific BSpline (De Boor, 1978) method, in order to enable a fair comparison between the methods and rule out any performance influences induced by the non-parametric regression method. We do not examine the BSpline method based on Bartlett factor scores since there is no readily available implementation except for the script of Kelava et al. (2017). We did include it for the factor scores of Kelava et al. (2017) since it was their suggested method to estimate the conditional expectation.
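The overall pipeline of computing Bartlett factor scores and passing them to a smoother that ignores measurement error can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the paper: it uses population loadings and residual variances instead of CFA estimates, a single factor on each side, a hypothetical quadratic trend, and a simple Nadaraya-Watson kernel smoother as a stand-in for LOESS or a smoothing spline. All function names and parameter values are illustrative.

```python
import math
import random


def bartlett_score(x, loadings, psi):
    """Single-factor Bartlett score: (L' Psi^-1 L)^-1 L' Psi^-1 x."""
    n_dx = sum(l * l / p for l, p in zip(loadings, psi))
    return sum(l * xi / p for l, xi, p in zip(loadings, x, psi)) / n_dx


def kernel_smooth(x0, xs, ys, h):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel."""
    w = [math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)


def simulate(n=2000, d_x=9, seed=1):
    """Generate indicators for xi and eta and return their Bartlett scores."""
    rng = random.Random(seed)
    # Hypothetical population values; in practice these would be CFA estimates.
    loadings, psi = [1.0] * d_x, [0.5625] * d_x
    xi_scores, eta_scores = [], []
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)
        eta = xi ** 2 + rng.gauss(0.0, 0.5)  # hypothetical trend H(x) = x^2
        x = [l * xi + rng.gauss(0.0, math.sqrt(p)) for l, p in zip(loadings, psi)]
        y = [l * eta + rng.gauss(0.0, math.sqrt(p)) for l, p in zip(loadings, psi)]
        xi_scores.append(bartlett_score(x, loadings, psi))
        eta_scores.append(bartlett_score(y, loadings, psi))
    return xi_scores, eta_scores


xi_scores, eta_scores = simulate()
for x0 in (-1.0, 0.0, 1.0):
    print(x0, round(kernel_smooth(x0, xi_scores, eta_scores, h=0.3), 2))
```

Because the smoother treats the factor scores as if they were the latent variables, the estimated curve is slightly flattened relative to the true trend; this distortion shrinks as the number of indicators grows and the variance of $r_\xi$ decreases.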

Table 1 provides a high-level summary of the most important assumptions for the empirical estimators considered. Using Bartlett factor scores together with either LOESS or spline estimates (BFS in the table) is our recommended approach and the one with the fewest assumptions on the measurement model.

Table 1 Assumptions used

BFS = Bartlett factor scores inputted into a general non-parametric trend estimate, HZCV = cross-validated HZ estimator, NLFS = nonlinear factor scores. The light gray area shows assumptions shared by all methods, the white region pertains to BFS, the dark gray region pertains to assumptions shared by two methods.

4. Numerical Illustrations

All empirical analyses were done in R (R Core Team, 2023), except for the nonlinear factor scores proposed by Kelava et al. (2017), which were estimated with a modified version of their MATLAB (The MathWorks Inc., 2023) scripts called from R, including the BSpline method they used. Appendix D.2 in the online supplementary material gives detailed information on the implementation, including the use of R-packages. A simple and practically minded numerical example is provided in Appendix A in the online supplementary material. All code and data used in the paper are available at the OSF-repository: https://osf.io/2xfh8/.

Within the following sections, we abbreviate the Bartlett factor scores (Bartlett, 1937) by BFS and the nonlinear factor scores by NLFS. None of the simulations are exhaustive due to the high computational cost of the cross-validated HZ-estimator and the estimation of the NLFS. Running the simulations of Sects. 4.3 and 4.4 on a 30-core cluster (Footnote 1) took about 26 days to complete even when limiting the number of replications to 200 per condition.

4.1. The Distribution of $r_\xi$

We here illustrate the quality of the normal approximation of $r_\xi$. The normal approximation follows from the central limit theorem, and numerical illustrations of this effect are, therefore, well known. Hence, we only consider error distributions used in the subsequent simulation studies. We let $\varepsilon_x$ have independent and identically distributed coordinates with marginal distributions given either by a standardized uniform or a standardized Gamma(1, 1) distribution. The exact distribution of $r_\xi$ is then known analytically (Moschopoulos, 1985; Kamgar-Parsi et al., 1995). Using these results, we produced Fig. 1. In it, we see that $r_\xi$ is close to normal in the uniform case for as few as 3 measurements, while more measurements are necessary for skewed gamma distributions, where deviations are easily seen even with 6 or 9 measurements.
This indicates that approximating $r_\xi$ with a normal distribution is expected to work better for uniform errors than for gamma errors. Further, the plot also depicts the decreasing variance of $r_\xi$ with increasing $d_x$, as indicated by narrower distributions.
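A standard cumulant calculation gives a rough sense of why the gamma case converges more slowly. As a simplifying assumption that is not part of the simulation design, suppose $r_\xi$ is a weighted sum $\sum_j a_j e_j$ of $d_x$ independent standardized errors $e_j$ with common skewness $\gamma$, and that the weights $a_j$ are equal. Skewness then decays only at rate $d_x^{-1/2}$:

```latex
\operatorname{skew}\Big( \sum_{j=1}^{d_x} a_j e_j \Big)
  = \gamma \, \frac{\sum_{j=1}^{d_x} a_j^3}{\big( \sum_{j=1}^{d_x} a_j^2 \big)^{3/2}}
  \overset{a_j \equiv a > 0}{=} \frac{\gamma}{\sqrt{d_x}},
\qquad
\gamma_{\text{uniform}} = 0, \quad \gamma_{\text{Gamma}(1,1)} = 2 .
```

With $\gamma = 2$ this gives a skewness of about 1.15, 0.82 and 0.67 for 3, 6 and 9 measurements, whereas the uniform case is symmetric for every $d_x$.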

Figure 1 A comparison of the exact densities of $r_\xi$ resulting from the corresponding distribution of $\varepsilon_x$ with the relevant normal distribution suggested as an approximation.

4.2. A Visual Comparison of Approximations to H

In this section, we present average approximations to the conditional expectation $H(x) = \mathbb{E}[\eta | \xi = x]$ by the use of different methods. In order to examine the small sample and finite measurement properties of the methods, we simulated four different trends to be estimated non-parametrically for $d_\xi = d_\eta = 1$: quadratic, cubic, logit and piecewise linear. We chose $n = 1000$, $d_x$ as 3 and 9, and all model parameters to coincide with the assumptions needed for Lemma 4, that is, there are no cross-loadings or residual covariances and all residuals are independent.
All coefficients are chosen so that $\xi$ and $\eta$ have zero mean and unit variance. Further, we fixed the first factor loading within $\Lambda_x$ and $\Lambda_y$ to 1 (in the population and in the analyses) and the corresponding residual variances in $\Psi$ to .5625 to ensure that the corresponding reliability is .64. The remaining factor loadings per latent variable and the corresponding residual variances were chosen to have reliabilities that are equidistant between .64 and .25. These item-wise reliabilities are rather low, but realistic. We wanted to choose conditions under which there is substantial noise in the data.
As this is a condition also analyzed in the following simulation study, we refer to Appendix D.1 in the online supplementary material for additional information.
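The value .5625 follows from the usual reliability formula for an indicator of a single factor, written out here as a quick check. The symbols $\lambda$ and $\psi$ stand for one loading and its residual variance, and $\text{Var}[\xi] = 1$ as stated above:

```latex
\text{rel} = \frac{\lambda^2 \, \text{Var}[\xi]}{\lambda^2 \, \text{Var}[\xi] + \psi}
           = \frac{1}{1 + 0.5625} = 0.64,
\qquad
\psi = \lambda^2 \, \text{Var}[\xi] \, \frac{1 - \text{rel}}{\text{rel}} .
```

The second expression shows how a target reliability determines the residual variance once a loading is chosen; the specific values used for the remaining items are not reproduced here.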

Figure 2 shows the average nonparametric estimation of H using either BFS or NLFS as inputs compared with the true trend and to a linear SEM estimation averaged across 200 replications for normal $\xi$ and gamma $\varepsilon$.

Figure 2 A comparison of nonparametric estimation for $\mathbb{E}[\eta | \xi]$ averaged across 200 replications with $n = 1000$ for the LOESS and the smoothed spline methods based on BFS and the NLFS, the HZ-estimator, the BSpline estimator based on NLFS compared to the true trend and a linear SEM estimation with different true trends (quadratic, cubic, logit and piecewise linear) and dimensions $d_x$ with normal $\xi$ and gamma distributed errors $\varepsilon$.

Figure 11 in Appendix D.4 in the online supplementary material extends Fig. 2 by coverage intervals. Both figures depict the convergence toward the true trend for increasing $d_x$ and make differences with regard to the trends evident. That is, smoother trends, such as the quadratic trend with a linear first derivative, are approximated with more precision by all methods than the other trends, which vary more strongly in their rate of change, i.e., have nonlinear first derivatives. All methods appear to be less precise at the edges of the support, which is expected as there are fewer data points present. Further, Fig. 2 and Fig. 11 in Appendix D.4 in the online supplementary material suggest that there are differences among the methods, with the methods relying on the BFS slightly outperforming those relying on the NLFS.

In order to compare computational costs, we benchmarked the methods used within Fig. 2 for a cubic trend, see Appendix D.4 in the online supplementary material, Table 7. On a standard laptop, the LOESS and spline methods based on BFS are extremely quick compared to all other methods, taking much less than 1 s. The HZ-estimator using simulation-based cross-validated bandwidth took more than 24 min and the methods based on NLFS took more than 35 min on average.

The nonparametric method of Kelava et al. (2017) sets the first factor loading per latent variable to one. This is done in the following simulations to make estimates comparable. See Appendix H in the online supplementary material for theoretical information on the scaling issue.

4.3. Simulation Study Based on Mean Integrated Square Error for $d_\xi = 1$

We now consider a more systematic simulation-based comparison of the performance of nonparametric estimation methods based on BFS and NLFS. We evaluate the scenarios in the previous section in more detail and aggregate the performance by using the mean integrated squared error:

$$\text{MISE} = \mathbb{E} \Vert H - \hat{\varphi} \Vert_{\mathcal{A}_{\alpha^\star}, 2}^2, \quad \text{where} \quad \Vert H - \varphi \Vert_{\mathcal{A}_{\alpha^\star}, 2}^2 = \int_{\mathcal{A}_{\alpha^\star}} \big[ H(x) - \varphi(x) \big]^2 \, dx,$$

where the expectation is approximated by the empirical expectation over the number of replications. The integration area is limited, as lack of data near edges inflates the mean squared error but is of limited practical interest. We set $\mathcal{A}_{\alpha^\star}$ as level sets $\{x: f_{\xi}(x) > c_\alpha^\star\}$ where $c_\alpha^\star$ is such that $P(\xi \in \mathcal{A}_{\alpha^\star}) = 1 - \alpha^\star$, where $\alpha^\star = 10\%$.
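The MISE computation above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a standard normal $\xi$, for which the level set $\mathcal{A}_{\alpha^\star}$ is a symmetric interval, and it approximates the integral with a trapezoidal rule. The function names (`level_set_normal`, `integrated_squared_error`, `mise`) and the toy trend are hypothetical.

```python
from statistics import NormalDist


def level_set_normal(alpha_star):
    """Interval {x : f_xi(x) > c} with P(xi in interval) = 1 - alpha_star.

    Assumes xi is standard normal, so the density is symmetric and unimodal
    and the level set is a symmetric interval around zero.
    """
    half_width = NormalDist().inv_cdf(1.0 - alpha_star / 2.0)
    return -half_width, half_width


def integrated_squared_error(H, phi_hat, lower, upper, n_grid=2001):
    """Trapezoidal approximation of the integral of [H(x) - phi_hat(x)]^2."""
    step = (upper - lower) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lower + i * step
        weight = 0.5 if i in (0, n_grid - 1) else 1.0
        total += weight * (H(x) - phi_hat(x)) ** 2
    return total * step


def mise(H, estimates, lower, upper):
    """Average the integrated squared error over the replications."""
    errors = [integrated_squared_error(H, est, lower, upper) for est in estimates]
    return sum(errors) / len(errors)


# Toy usage: a hypothetical quadratic trend and two artificial "estimates".
lower, upper = level_set_normal(0.10)
true_trend = lambda x: x ** 2
toy_estimates = [lambda x, b=b: x ** 2 + b for b in (0.1, -0.1)]
print(lower, upper, mise(true_trend, toy_estimates, lower, upper))
```

In an actual study, each element of `estimates` would be the fitted trend from one replication. For a uniform $\xi$ the density is flat on its support, so the level-set construction would have to follow the convention described in the supplementary material.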

For our simulation study using $d_\xi=1$, we extended the conditions of Sect. 4.2 by a crossed design for which we manipulated the number of items $d_x$ (3, 6, 9), the distribution of $\xi$ (normal or uniform with mean zero and unit variance), the distribution of $\varepsilon$ (centered normal, centered uniform, centered gamma, see Sect. 4.1), and the true trends (quadratic, cubic, logistic, piecewise linear). This resulted in a total of 72 conditions. 200 replications were used, with a sample size of $n=1000$. For a more detailed description of the simulation conditions and the data generating process, see Appendix D.1 in the online supplementary material.
All conditions were analyzed using the following methods: linear SEM, LOESS using BFS and NLFS, smoothed splines using BFS and NLFS, the BSpline method using NLFS proposed by Kelava et al. (Reference Kelava, Kohler, Krzyżak and Schaffland2017), and the cross-validated HZ-estimator using the BFS. In order to compare all results with a best case scenario, we also included LOESS and smoothed spline estimation using the true latent variables $f=(\xi, \eta)'$ as inputs.
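The crossed design described above can be enumerated in a few lines, which also confirms the count of 72 conditions. This is only a sketch: the variable names and string labels are hypothetical, and the actual data generating process is described in Appendix D.1.

```python
from itertools import product

# Factor levels as listed in the text (labels are illustrative only).
n_items = (3, 6, 9)
xi_distributions = ("normal", "uniform")
eps_distributions = ("normal", "uniform", "gamma")
trends = ("quadratic", "cubic", "logistic", "piecewise linear")

# Fully crossed design: one dictionary per simulation condition.
conditions = [
    {"d_x": d_x, "xi": xi, "eps": eps, "trend": trend}
    for d_x, xi, eps, trend in product(
        n_items, xi_distributions, eps_distributions, trends
    )
]

print(len(conditions))  # 3 * 2 * 3 * 4 = 72
```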

Figure 3 A comparison of the average MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS $\ddot{f}$, NLFS, the linear SEM, and the true latent variables $f$ for comparison) for different dimensions $d_x$ aggregated across all distributions and trends used in the simulation study.

Figure 3 depicts the performance of the methods aggregated across all distributional conditions and all trends. As can be expected, the MISE for the linear SEM and the true latent variables $f$ are not affected by $d_x$. Here, the linear SEM shows the highest and the methods based on $f$ show the lowest MISE for all $d_x$ averaged across all distributional conditions and trends. All other methods show a decrease in MISE for increasing $d_x$. However, even for $d_x=9$ the MISE of all methods is considerably higher compared with using the true latents $f$ as inputs for the nonparametric methods. This deviation quantifies approximation error arising from measurement and finite sample error.
Concerning the methods of interest, for all $d_x$ the smoothed spline using the BFS as input showed the lowest MISE, followed by the LOESS based on the BFS. Averaged across all distributional conditions and trends, for $d_x=3$ the HZ-estimator outperformed the methods based on NLFS. These differences disappear for $d_x=6$ and reverse for $d_x=9$, indicating that for smaller variance of $r_\xi$ the HZ-estimator is less useful compared to the other methods. Within the methods based on NLFS, Fig. 3 suggests that the spline and BSpline approaches have smaller MISE compared to the LOESS. However, these differences are rather small.
For $d_x=9$ the differences between the methods using BFS and the methods using NLFS appear to be considerably smaller than for $d_x=3$.

Figure 4 shows the MISE across all conditions. Tables 8 and 9 in Appendix D.4 in the online supplementary material display the corresponding numerical values depicted in Fig. 4. These supplement Fig. 3 of the article by visualizing all MISE for all simulated conditions. For instance, it is evident for some conditions with a logit trend for $d_x=3$ that the MISE for methods based on NLFS was in fact higher than that of the linear SEM, indicating that the linear approximation was closer to the true trend than the non-parametric one based on NLFS. With increasing $d_x$, all methods showed lower MISE compared to the linear SEM. The distribution of $\xi$ also influences the performance of the methods. For instance, for normal $\xi$ and logit trend the HZ-estimator resulted in lower MISE compared to all other methods based on factor scores. For a cubic trend, this is reversed and all methods except for the linear SEM show a lower MISE compared to the HZ-estimator.
All in all, for all scenarios and all $d_x$ the methods using the NLFS never had the smallest MISE, with the LOESS, the BSpline, and the smoothed spline method based on NLFS showing comparable MISE in almost all conditions. In most cases, spline and LOESS based on BFS showed the lowest MISE as already suggested by the aggregated results in Fig. 3. We note that the differences between the methods based on BFS and NLFS are small but consistent.

Interestingly, there are conditions for which LOESS showed lower MISE for the methods based on factor scores than the spline method, while the LOESS based on the true latent variables showed higher MISE than using splines in all conditions. Therefore, we cannot draw any conclusions about the ranking of the methods based on factor scores from the performance of the methods using the true latents $f$ as inputs, and the latter should only be used as a best case scenario and serve as an anchor for an expected smallest possible MISE (as $\ddot{f} \rightarrow f$ for $d_z \rightarrow \infty$; hence, the MISE cannot be smaller than that using $f$). For ease of comparison, Figure 12 in Appendix D.4 in the online supplementary material depicts the relative improvement of MISE in comparison with the linear SEM approximation (see also Tables 10 and 11). These relative improvements show that the logit trend was closest to linearity since the improvement was the smallest.
The cubic trend for normal $\xi$ showed the largest improvement, while for uniform $\xi$ for cubic trends the improvement was comparable to that of quadratic or piecewise-linear trends.

Figure 4 A comparison of the averaged MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS, NLFS, linear SEM, and true latent variables $f$ for comparison) for four models with different true trends (quadratic, cubic, logit and piecewise linear) and dimensions $d_x$. See Tables 8 and 9 in Appendix D.4 in the online supplementary material for numerical values.

In order to visualize variation among the trends, Figure 13 in Appendix D.4 in the online supplementary material depicts box plots summarizing the MISE across all distributional conditions for each $d_x$ and each trend. The variation among MISE decreases with increasing $d_x$ and is comparable among the methods based on factor scores. The methods based on the true latent variables on average show the smallest variation. The cubic trend has the largest variation among the MISE across the distributional conditions, but the piecewise linear trend resulted in the largest average MISE.

To summarize, from our limited conditions within the simulation we may conclude that using BFS either with LOESS or smoothed splines will result in the smallest MISE and, therefore, in the best approximation of the true trend, while also being the cheapest with regard to computation time. The HZ-estimator was only beneficial in limited conditions, while the NLFS always showed higher MISE compared to LOESS or splines based on BFS.
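The recommended pipeline, Bartlett factor scores fed into a classical smoother, can be sketched as follows. This is a simplified illustration under loud assumptions, not the code used in the study: a one-factor measurement model per latent variable with a diagonal error covariance, loadings and error variances treated as known (in practice they would be estimated from a confirmatory factor model), centered items, and a plain tricube-weighted local linear fit standing in for LOESS. All function names, parameter values, and the quadratic trend are hypothetical.

```python
import random


def bartlett_scores(items, loadings, uniquenesses):
    """One-factor Bartlett scores for centered items with diagonal error covariance:
    sum_i (lambda_i / psi_i) * x_i / sum_i (lambda_i^2 / psi_i)."""
    denom = sum(l * l / p for l, p in zip(loadings, uniquenesses))
    return [
        sum(l / p * x for l, p, x in zip(loadings, uniquenesses, row)) / denom
        for row in items
    ]


def local_linear(x, y, x0, span=0.5):
    """Tricube-weighted local linear fit at x0 using the nearest span * n points."""
    k = max(3, int(span * len(x)))
    h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12  # local bandwidth
    sw = swx = swy = swxx = swxy = 0.0
    for xi, yi in zip(x, y):
        u = abs(xi - x0) / h
        if u >= 1.0:
            continue  # tricube weight is zero outside the window
        w = (1.0 - u ** 3) ** 3
        d = xi - x0
        sw, swx, swy = sw + w, swx + w * d, swy + w * yi
        swxx, swxy = swxx + w * d * d, swxy + w * d * yi
    det = sw * swxx - swx * swx
    if abs(det) < 1e-12:
        return swy / sw  # fall back to a weighted mean
    return (swxx * swy - swx * swxy) / det  # intercept of the weighted fit


def simulate_items(factor, loadings, uniquenesses, rng):
    """Centered items x_i = lambda_i * f + eps_i with Var(eps_i) = psi_i."""
    return [
        [l * f + rng.gauss(0.0, p ** 0.5) for l, p in zip(loadings, uniquenesses)]
        for f in factor
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    n, d_x = 500, 6
    loadings, uniq = [1.0] * d_x, [0.5] * d_x  # hypothetical values, treated as known
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eta = [v ** 2 - 1.0 + rng.gauss(0.0, 0.5) for v in xi]  # hypothetical quadratic trend
    xi_bfs = bartlett_scores(simulate_items(xi, loadings, uniq, rng), loadings, uniq)
    eta_bfs = bartlett_scores(simulate_items(eta, loadings, uniq, rng), loadings, uniq)
    for x0 in (-1.0, 0.0, 1.0):
        print(x0, round(local_linear(xi_bfs, eta_bfs, x0), 3), "true trend:", x0 ** 2 - 1.0)
```

The fitted values track the true trend only up to measurement error in the factor scores and finite-sample error, which is the gap to the true-latent benchmark discussed in this section. Under these assumptions, increasing `d_x` shrinks the error variance of the scores, `1 / sum(lambda_i^2 / psi_i)`.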

4.4. Simulation Study Based on Mean Integrated Square Error for $d_\xi = 2$

In this section, we extend the previous simulation results to models with $d_\xi=2$. As a multivariate implementation of the HZ-estimator is still missing, we did not include it in the simulation. Further, there are no simple multivariate extensions of the smoothed spline method, and we discarded it within the $d_\xi=2$ simulation study. This simulation study therefore only considers LOESS estimates of the trend as well as the BSpline approach as implemented by Kelava et al. (Reference Kelava, Kohler, Krzyżak and Schaffland2017) for their NLFS.

We used a crossed design for which we manipulated the number of measurements $d_{x_j}$ (3, 6, 9) per latent exogenous variable $\xi_j$, $j=1,2$, where $\xi=(\xi_1,\xi_2)'$, the distribution of $\xi$ (multivariate standard normal or normal copula with uniform marginals with mean zero, variance 1, and correlation .5), the distribution of $\varepsilon$ (centered normal, centered uniform, centered gamma), and the true trends (quadratic, cubic).
Further, we manipulated the model specification, that is, whether cross-relations, i.e., cross-loadings and residual covariances, among the measurements of $\xi$ or within the measurements of $\eta$ are present (uncrossed, crossed). This resulted in a total of 72 conditions. We used 200 replications and a sample size of $n=1000$. For a more detailed description of the simulation conditions and the data generating process, see Appendix D.1 in the online supplementary material. We compared the following methods: linear SEM, LOESS using BFS or NLFS, and the BSpline method using NLFS proposed by Kelava et al. (Reference Kelava, Kohler, Krzyżak and Schaffland2017). In order to compare all results with a best case scenario, we, again, included LOESS estimation based on the true latent variables $f=(\xi', \eta)'$ as inputs.

The NLFS proposed by Kelava et al. (Reference Kelava, Kohler, Krzyżak and Schaffland2017) assumes a linear factor model without cross-loadings or residual covariances among the items (recall Table 1). In order to examine the effect of a misspecified measurement model used to compute factor scores, we added Bartlett scores estimated without cross-relations as an additional condition. This gives a fair comparison between the methods, as both are then misspecified in these conditions. For a discussion and examples on the misspecification of the functional form of the factor models (i.e., nonlinear factor models), see Appendix F in the online supplement, which shows that the approach is not very sensitive to minor deviations from linearity. We call BFS$_\text{uc}$ the Bartlett scores estimated using no cross-relations in the factor model.
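For orientation, the Bartlett factor score in its standard textbook form can be written as follows. The notation is an assumption made for this sketch and may differ from the paper's own definitions: $z$ is the stacked vector of observed measurements with mean $\mu$, $\Lambda$ is the loading matrix, $\Psi$ is the error covariance matrix, and $z = \mu + \Lambda f + \varepsilon$ is the linear measurement model.

$$\begin{aligned} \ddot{f} = (\Lambda' \Psi^{-1} \Lambda)^{-1} \Lambda' \Psi^{-1} (z - \mu) = f + (\Lambda' \Psi^{-1} \Lambda)^{-1} \Lambda' \Psi^{-1} \varepsilon . \end{aligned}$$

The second equality requires that the $\Lambda$ and $\Psi$ used to compute the scores are those of the data generating model. For BFS$_\text{uc}$, cross-loadings and residual covariances are fixed to zero, so in the crossed conditions the weight matrix is built from a misspecified $\Lambda$ and $\Psi$, and the scores are in general no longer equal to $f$ plus an error term that is linear in $\varepsilon$ alone.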

Figure 5 A comparison of the averaged MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS $\ddot{f}$, NLFS, the linear SEM, and the true latent variables $f$ for comparison) for different dimensions $d_x$ aggregated across all distributions and trends used in the simulation study, separated for conditions without (uncrossed) and including cross-loadings and cross-correlations in $\Lambda_x$, $\Psi_x$, and $\Psi_y$.
BFS and BFS$_\text{uc}$ are equivalent for uncrossed data.

Figure 5 shows aggregated MISE results across all distributional conditions, trends, and model specifications. Similarly to the $d_\xi=1$ case, MISE decreases for all methods based on factor scores for increasing $d_{x_j}$, while LOESS based on the true latents $f$ and the linear SEM are not affected by $d_{x_j}$. It stands out that LOESS based on BFS using the correct model is not influenced by cross-relations, while the NLFS as well as the BFS without these cross-relations show largely inflated MISEs for conditions where cross-relations are in fact present. Still, the wrongly specified BFS$_\text{uc}$ resulted in lower MISE compared to the methods based on the NLFS for conditions with cross-relations present. For conditions without cross-relations, the LOESS estimates based on BFS and BFS$_\text{uc}$ are identical and, hence, overlap completely. The LOESS based on BFS outperforms the methods based on NLFS under all presented conditions.

Figure 6 shows all average MISE across the 200 replications for all used conditions (see also Tables 12 and 13 in Appendix D.4 in the online supplementary material for numerical values). From Fig. 6, it is evident that the extent to which the methods perform poorly differs across distributions. The MISE did decrease for all methods based on factor scores with increasing $d_{x_j}$, but there are conditions where the MISE for methods using factor scores was considerably higher than that of the linear SEM. Compared to the linear SEM, the MISE in conditions with cross-relations was larger for BFS$_\text{uc}$ and the methods based on NLFS for quadratic trends and especially for normal $\xi$, with the MISE being larger than that of the linear SEM even for $d_{x_j}=9$ for quadratic trends with normal $\xi$. Interestingly, with regard to measurement errors, gamma $\varepsilon$ resulted in the lowest MISEs. In all conditions without cross-relations, LOESS based on NLFS still outperformed the BSpline method. Further, LOESS based on BFS is considerably lower in MISE in all conditions without cross-relations compared to methods based on NLFS. These differences appear the smallest for cubic trends with marginally uniform $\xi$ with normal copula. The MISE of the methods based on factor scores was considerably higher than that of the LOESS based on $f$. Identically to the simulation with $d_\xi=1$, the MISE for the linear SEM or for the LOESS based on $f$ was not related to $d_{x_j}$.
This, of course, can be expected as for these objects $d_{x_j}$ has minimal influence on the estimated parameters.

Figure 6 A comparison of the averaged MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS, NLFS, linear SEM, and true latent variables $f$ for comparison) for two models with different true trends (quadratic and cubic), dimensions $d_{x_j}$, and inclusion of cross-relations (cross-loadings and cross-correlations in $\Lambda_x$, $\Psi_x$, and $\Psi_y$) and distributions (row and column names refer to marginal distributions) used in the simulation study for $d_\xi = 2$. See Tables 12 and 13 in Appendix D.4 in the online supplementary material for numerical values.

Further, Figure 16 in Appendix D.4 in the online supplementary material emphasizes that the LOESS based on BFS is much more homogeneous in the MISE and, hence, in its performance in approximating the true trend. Additional information on the relative improvement in comparison with the linear SEM approximation is also given in this appendix, stressing the importance of a correctly specified (linear) measurement model (see Figure 15 and Tables 14 and 15).

To summarize, within the limited conditions of this simulation, we confirm the results of the previous simulation study for $d_\xi = 1$, with even stronger evidence in favor of the LOESS method based on the BFS: it showed the smallest MISE in all conditions, opens the opportunity to check whether the confirmatory factor analysis model used to extract the factor scores fits the data, is flexible with regard to cross-relations, and has an extremely short runtime. We note that the differences between the methods based on BFS and NLFS are small but consistent when all assumptions for NLFS are fulfilled. The differences between the methods decline with increasing $d_{x_j}$, but the methods never come close to the performance of the LOESS based on f, nor do they coincide with each other. Measurement model misspecification negatively affects all methods.
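As a rough illustration of the comparison criterion, the following Python sketch averages an integrated squared error over replications. It assumes that the MISE is the replication mean of the squared distance between an estimated and a true trend, integrated over a fixed grid by the trapezoidal rule; the integration range, any weighting, and the function names `integrated_squared_error` and `mise` are assumptions made for this example and need not match the definition used in the simulation study.

```python
import numpy as np


def integrated_squared_error(grid, estimate, truth):
    """Trapezoidal approximation of the integral of (estimate - truth)^2 over grid."""
    grid = np.asarray(grid, dtype=float)
    sq = (np.asarray(estimate, dtype=float) - np.asarray(truth, dtype=float)) ** 2
    return float(np.sum(np.diff(grid) * (sq[:-1] + sq[1:]) / 2.0))


def mise(grid, estimates, truth):
    """Mean of the integrated squared error over replications (rows of estimates)."""
    return float(np.mean([integrated_squared_error(grid, e, truth) for e in estimates]))


# Toy check with a hypothetical quadratic trend and three "replications"
# that each miss the truth by a constant offset c; the ISE is then 4 * c^2
# on the interval [-2, 2].
grid = np.linspace(-2.0, 2.0, 401)
truth = grid ** 2
estimates = np.stack([truth + c for c in (0.1, -0.1, 0.2)])
toy_mise = mise(grid, estimates, truth)
```

In this toy case the three integrated squared errors are 0.04, 0.04, and 0.16, so `toy_mise` equals 0.08 up to floating-point error.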

5. Concluding Remarks

We may combine our foundational equations (1) and (2) to see that the full model is a multivariate non-parametric regression problem where both the dependent variable $\eta$ and independent variable $\xi$ are observed with measurement error. In theory, this non-parametric formulation can be worked with directly. However, the distributions of the error terms would be unknown and would neither be asymptotically normal nor vanishing as the number of measurements increases. With stronger assumptions the distributions of these measurement errors are identified and can be non-parametrically estimated, which would lead to methodology such as that suggested in Kelava et al. (2017) and Kohler et al. (2015).

Our approach avoids such estimation or a-priori specification of the error distribution through our use of factor scores. That is, we use a linearly optimal dimensionality reduction, which has the advantage that the measurement error distribution is asymptotically known, so that it need not be estimated in order to non-parametrically estimate H.

Our simulation study has demonstrated that using Bartlett (1937) factor scores as inputs to non-parametric regression methods works well and is computationally efficient for non-parametric estimation in NLSEM. Specifically, employing LOESS or spline approaches based on Bartlett factor scores outperformed the other methods in nearly all conditions in our simulation study.
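The recommended workflow can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: it uses the standard Bartlett weight matrix $(\Lambda'\Psi^{-1}\Lambda)^{-1}\Lambda'\Psi^{-1}$, plugs in hypothetical true loadings and error variances where an applied analysis would use estimates from a fitted confirmatory factor model, and replaces LOESS by a simple Gaussian-kernel local linear smoother. All names, loadings, the bandwidth, and the quadratic trend are invented for the example and are not taken from the simulation study.

```python
import numpy as np


def bartlett_weights(Lambda, Psi):
    """Bartlett weight matrix A = (Lambda' Psi^{-1} Lambda)^{-1} Lambda' Psi^{-1}."""
    Psi_inv_Lambda = np.linalg.solve(Psi, Lambda)  # Psi^{-1} Lambda (Psi symmetric)
    return np.linalg.solve(Lambda.T @ Psi_inv_Lambda, Psi_inv_Lambda.T)


def bartlett_scores(X, Lambda, Psi):
    """Bartlett factor scores for the rows of X, centered at the sample mean."""
    return (X - X.mean(axis=0)) @ bartlett_weights(Lambda, Psi).T


def local_linear(x, y, grid, bandwidth):
    """Gaussian-kernel local linear regression of y on x (a stand-in for LOESS)."""
    fitted = np.empty(len(grid))
    for i, g in enumerate(grid):
        w = np.exp(-0.5 * ((x - g) / bandwidth) ** 2)  # kernel weights around g
        Z = np.column_stack([np.ones_like(x), x - g])  # local intercept and slope
        ZtW = Z.T * w
        fitted[i] = np.linalg.solve(ZtW @ Z, ZtW @ y)[0]  # intercept = fit at g
    return fitted


# Hypothetical data: one exogenous and one endogenous latent variable,
# a quadratic trend, and linear measurement models with assumed parameters.
rng = np.random.default_rng(0)
n = 1000
xi = rng.normal(size=n)
eta = xi ** 2 - 1.0 + rng.normal(scale=0.5, size=n)

Lambda_x = np.array([[1.0], [0.8], [0.7], [0.9], [0.6], [0.8]])
Lambda_y = np.array([[1.0], [0.8], [0.7]])
Psi_x = np.diag(np.full(6, 0.3))
Psi_y = np.diag(np.full(3, 0.3))

X = xi[:, None] @ Lambda_x.T + rng.normal(scale=np.sqrt(0.3), size=(n, 6))
Y = eta[:, None] @ Lambda_y.T + rng.normal(scale=np.sqrt(0.3), size=(n, 3))

# Step 1: factor scores from each measurement model.
xi_hat = bartlett_scores(X, Lambda_x, Psi_x)[:, 0]
eta_hat = bartlett_scores(Y, Lambda_y, Psi_y)[:, 0]

# Step 2: non-parametric regression among the factor scores.
grid = np.linspace(-1.5, 1.5, 31)
trend = local_linear(xi_hat, eta_hat, grid, bandwidth=0.3)
```

Plotting `trend` against `grid` next to a fitted linear trend gives the kind of diagnostic display discussed in this paper. Because the weight matrix satisfies $A\Lambda = I$, each score equals the latent variable plus a weighted sum of the measurement errors.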

Our analyses have several limitations. In the theoretical contribution, the most striking limitations are that we only study population quantities, and that we only used linear factor scores taken from an assumed correctly specified linear factor model. Also, the assumptions of the asymptotic results can be weakened, and the assumptions we made on the factor scores can likely also be weakened to, for example, allow the use of Thurstone factor scores.

In the simulations, we limited attention to non-parametric estimators of H, and excluded the semi-parametric alternatives reviewed in Appendix C in the online supplementary material. Bauer et al. (2012) compared the performance of estimating trends using the semiparametric latent class approach of Bauer (2005) with the approach of inputting factor scores into non-parametric regression methods as dealt with in this paper, and concluded that the latent class approach performed best in many settings. In further research, one could analyze the scope of the semi-parametric methods, i.e., identify which types of trends and distributional forms are supported in common situations (in the latent class situation this could be a small to moderate number of latent classes, where within each class $(\xi', \eta')'$ follows a linear and normal SEM), and compare the non-parametric approaches considered in the present paper with the semi-parametric methods both within and outside their scope.

Further limitations of our simulation study are that we only use a sample size of $n=1000$ observations with 200 replications. Expanding the sample size conditions could provide further insights into the performance of the methods. Expanding the replication number would sharpen our approximations. Furthermore, our simulation study solely considers symmetric distributions for the latent exogenous variable $\xi$, and we have not varied the number of measurements for the latent endogenous variable $\eta$.

Funding

Open access funding provided by Norwegian Business School

Declarations

Conflict of interest

None.

Data Availability

All source code and data are included in the supplementary material at https://osf.io/2xfh8/

Footnotes

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11336-024-09959-4.

Steffen Grønneberg and Julien Patrick Irmer share first authorship on this work.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

1 For our simulation, we used 30 processes on a shared computer cluster with 2 sockets of AMD EPYC 7452 32-core Processors @2.35 GHz (64 physical cores and 128 logical cores) and 128GB RAM.

References

Anderson, T. W. (2003). Introduction to multivariate statistical analysis. Wiley, 3rd ed.
Apanasovich, T. & Liang, H. (2021). Nonparametric measurement errors models for regression. In Handbook of measurement error models (pp. 293–318). Chapman and Hall/CRC.
Bartlett, M. S. (1937). The statistical conception of mental factors. British Journal of Psychology. General Section, 28, 97–104.
Bauer, D. J. (2005). A semiparametric approach to modeling nonlinear relations among latent variables. Structural Equation Modeling: A Multidisciplinary Journal, 12, 513–535.
Bauer, D. J., Baldasaro, R. E., Gottfredson, N. C. (2012). Diagnostic procedures for detecting nonlinear relationships between latent variables. Structural Equation Modeling: A Multidisciplinary Journal, 19, 157–177.
Billingsley, P. (1995). Probability and measure. New York: Wiley, 3rd ed.
Bollen, K. A. (1989). Structural equations with latent variables, New York: Wiley.
Bollen, K. A. & Arminger, G. (1991). Observational residuals in factor analysis and structural equation models. Sociological Methodology, 235–262.
Brandt, H., Cambria, J., Kelava, A. (2018). An adaptive Bayesian lasso approach with spike-and-slab priors to identify multiple linear and nonlinear effects in structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 25, 946–960.
Büchner, R. D., Klein, A. G. (2020). A quasi-likelihood approach to assess model fit in quadratic and interaction SEM. Multivariate Behavioral Research, 55, 855–872.
Chambers, J. M. & Hastie, T. J. (1992). Statistical models in S. Wadsworth & Brooks/Cole.
Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association, 74, 829–836.
Cleveland, W. S. (1981). LOWESS: A program for smoothing scatterplots by robust locally weighted regression. The American Statistician, 35, 54.
Cleveland, W. S., Grosse, E. & Shyu, W. M. (1992). Local regression models. In J. M. Chambers & T. J. Hastie (eds.) Statistical models in S, chap. 8. Springer, pp. 309–379.
Croon, M. (2002). Using predicted latent scores in general latent structure models. In Marcoulides, G., Moustaki, I. (Eds), Latent variable and latent structure models, Psychology Press, 207–236.
De Boor, C. (1978). A practical guide to splines, New York: Springer.
Delaigle, A. (2014). Nonparametric kernel methods with errors-in-variables: constructing estimators, computing them, and avoiding common mistakes. Australian & New Zealand Journal of Statistics, 56, 105–124.
Delaigle, A., Fan, J., Carroll, R. J. (2009). A design-adaptive local polynomial estimator for the errors-in-variables problem. Journal of the American Statistical Association, 104, 348–359.
Devlieger, I., Mayer, A., Rosseel, Y. (2016). Hypothesis testing using factor score regression: A comparison of four methods. Educational and Psychological Measurement, 76, 741–770.
Devlieger, I., Rosseel, Y. (2017). Factor score path analysis. Methodology, 13, 31–38.
Dijkstra, T. K., Henseler, J. (2015). Consistent and asymptotically normal PLS estimators for linear structural equations. Computational Statistics & Data Analysis, 81, 10–23.
Fan, J., Masini, R. P., & Medeiros, M. C. (2023). Bridging factor and sparse models. The Annals of Statistics, 51(4), 1692–1717.
Foldnes, N., Grønneberg, S. (2022). The sensitivity of structural equation modeling with ordinal data to underlying non-normality and observed distributional forms. Psychological Methods, 27, 541–567.
Fox, J., Weisberg, S. (2011). An R companion to applied regression, London: Sage Publications.
Fuller, W. A. (1987). Measurement error models, New York: Wiley.
Grice, J. W. (2001). Computing and evaluating factor scores. Psychological Methods, 6, 430–450.
Grønneberg, S., Foldnes, N. (2024). Factor analyzing ordinal items requires substantive knowledge of response marginals. Psychological Methods, 29(1), 65–87.
Grønneberg, S., Holcblat, B. (2019). On partial-sum processes of ARMAX residuals. The Annals of Statistics, 47, 3216–3243.
Guttman, L. (1955). The determinacy of factor score matrices with implications for five other basic problems of common-factor theory. British Journal of Statistical Psychology, 8, 65–81.
Harville, D. A. (1997). Matrix algebra from a statistician’s perspective.
Holst, K. K., Budtz-Jørgensen, E. (2020). A two-stage estimation procedure for non-linear structural equation models. Biostatistics, 21, 676–691.
Horn, R. A., Johnson, C. R. (2013). Matrix analysis, Cambridge: Cambridge University Press.
Hoshino, T., & Bentler, P. M. (2011). Bias in factor score regression and a simple solution. In Analysis of mixed data: Methods and applications (pp. 43–61). Chapman and Hall/CRC.
Huang, X., Zhou, H. (2017). An alternative local polynomial estimator for the error-in-variables problem. Journal of Nonparametric Statistics, 29, 301–325.
Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183–202.
Jöreskog, K. G., Olsson, U. H., Wallentin, F. Y. (2016). Multivariate analysis with LISREL, Berlin: Springer.
Kamgar-Parsi, B., Kamgar-Parsi, B., Brosh, M. (1995). Distribution and moments of the weighted sum of uniforms random variables, with applications in reducing Monte Carlo simulations. Journal of Statistical Computation and Simulation, 52, 399–414.
Kelava, A., Brandt, H. (2009). Estimation of nonlinear latent structural equation models using the extended unconstrained approach. Review of Psychology, 16, 123–132.
Kelava, A., Kohler, M., Krzyżak, A., Schaffland, T. F. (2017). Nonparametric estimation of a latent variable model. Journal of Multivariate Analysis, 154, 112–134.
Kenny, D. A., Judd, C. M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201–210.
Klein, A. G., Moosbrugger, H. (2000). Maximum likelihood estimation of latent interaction effects with the LMS method. Psychometrika, 65, 457–474.
Kohler, M., Müller, F., Walk, H. (2015). Estimation of a regression function corresponding to latent variables. Journal of Statistical Planning and Inference, 162, 88–109.
Krijnen, W. P. (2004). Convergence in mean square of factor predictors. British Journal of Mathematical and Statistical Psychology, 57, 311–326.
Krijnen, W. P. (2006). Necessary conditions for mean square convergence of the best linear factor predictor. Psychometrika, 71, 593–599.
Krijnen, W. P. (2006). Some results on mean square error for factor score prediction. Psychometrika, 71, 395–409.
Lee, S.-Y., Song, X.-Y., Tang, N.-S. (2007). Bayesian methods for analyzing structural equation models with covariates, interaction, and quadratic latent variables. Structural Equation Modeling: A Multidisciplinary Journal, 14, 404–434.
MacKinnon, D. P., Fairchild, A. J., Fritz, M. S. (2007). Mediation analysis. Annual Review of Psychology, 58, 593–614.
Mardia, K. V., Kent, J. T., Bibby, J. M. (1979). Multivariate analysis, Academic Press.
Marsh, H. W., Wen, Z., Hau, K.-T. (2004). Structural equation models of latent interactions: Evaluation of alternative estimation strategies and indicator construction. Psychological Methods, 9, 275–300.
McDonald, R. (1967). Nonlinear factor analysis. No. 15 in Psychometric Monograph. William Byrd Press.
Mooijaart, A., Bentler, P. M. (2010). An alternative approach for nonlinear latent variable models. Structural Equation Modeling: A Multidisciplinary Journal, 17, 357–373.
Mooijaart, A., Satorra, A. (2009). On insensitivity of the chi-square model test to nonlinear misspecification in structural equation models. Psychometrika, 74, 443–455.
Mooijaart, A., Satorra, A. (2012). Moment testing for interaction terms in structural equation modeling. Psychometrika, 77, 65–84.
Moschopoulos, P. G. (1985). The distribution of the sum of independent gamma random variables. Annals of the Institute of Statistical Mathematics, 37, 541–544.
Nestler, S. (2015). A specification error test that uses instrumental variables to detect latent quadratic and latent interaction effects. Structural Equation Modeling: A Multidisciplinary Journal, 22, 542–551.
Neudecker, H., Satorra, A. (2003). On best affine prediction. Statistical Papers, 44, 257–266.
R Core Team (2023). R: A Language and Environment for Statistical Computing, Vienna: R Foundation for Statistical Computing.
Raykov, T. & Penev, S. (2014). Exploring structural equation model misspecifications via latent individual residuals. In Latent variable and latent structure models (pp. 133–146). Psychology Press.
Rosseel, Y., & Loh, W. W. (2022). A structural after measurement approach to structural equation modeling. Psychological Methods. Advance online publication.
Sarstedt, M., Ringle, C. M. & Hair, J. F. (2021). Partial least squares structural equation modeling. In Handbook of market research (pp. 587–632). Springer.
Satorra, A. (1989). Alternative test criteria in covariance structure analysis: A unified approach. Psychometrika, 54, 131–151.
Schneeweiss, H., Mathes, H. (1995). Factor analysis and principal components. Journal of Multivariate Analysis, 55, 105–124.
Shapiro, A. (2007). Statistical inference of moment structures. In Handbook of latent variable and related models (pp. 229–260). Elsevier.
Skrondal, A., Laake, P. (2001). Regression among factor scores. Psychometrika, 66, 563–575.
The MathWorks Inc. (2023). MATLAB version: 9.13.0 (R2023a). Natick, Massachusetts.
Thomson, G. H. (1934). The meaning of i in the estimate of g. British Journal of Psychology, 25, 92.
Thurstone, L. L. (1935). The vectors of mind: Multiple-factor analysis for the isolation of primary traits. Chicago: University of Chicago Press.
Wall, M. M., Amemiya, Y. (2000). Estimation for polynomial structural equation models. Journal of the American Statistical Association, 95, 929–940.
Wall, M. M., Amemiya, Y. (2001). Generalized appended product indicator procedure for nonlinear structural equation analysis. Journal of Educational and Behavioral Statistics, 26, 1–29.
Wall, M. M., Amemiya, Y. (2003). A method of moments technique for fitting interaction effects in structural equation models. British Journal of Mathematical and Statistical Psychology, 56, 47–63.
Weisberg, S. (2005). Applied linear regression. Wiley, New York, 4th ed.
Williams, J. S. (1978). A definition for the common-factor analysis model and the elimination of problems of factor score indeterminacy. Psychometrika, 43, 293–306.
Yuan, K.-H., Deng, L. (2021). Equivalence of partial-least-squares SEM and the methods of factor-score regression. Structural Equation Modeling: A Multidisciplinary Journal, 28, 557–571.
Table 1 Assumptions used

Figure 1 A comparison of the exact densities of $r_\xi$ resulting from the corresponding distribution of $\varepsilon_x$ with the relevant normal distribution suggested as an approximation.

Figure 2 A comparison of nonparametric estimation for $\mathbb{E}[\eta \mid \xi]$ averaged across 200 replications with $n=1000$ for the LOESS and the smoothed spline methods based on BFS and the NLFS, the HZ-estimator, the BSpline estimator based on NLFS compared to the true trend and a linear SEM estimation with different true trends (quadratic, cubic, logit and piecewise linear) and dimensions $d_x$ with normal $\xi$ and gamma distributed errors $\varepsilon$.

Figure 3 A comparison of the average MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS $\ddot{f}$, NLFS, the linear SEM, and the true latent variables f for comparison) for different dimensions $d_x$ aggregated across all distributions and trends used in the simulation study.

Figure 4 A comparison of the averaged MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS, NLFS, linear SEM, and true latent variables f for comparison) for four models with different true trends (quadratic, cubic, logit and piecewise linear) and dimensions $d_x$. See Tables 8 and 9 in Appendix D.4 in the online supplementary material for numerical values.

Figure 5 A comparison of the averaged MISE across 200 replications with $n=1000$ for different procedures [(B)Splines vs. LOESS vs. HZ/others] based on different inputs (BFS $\ddot{f}$, NLFS, the linear SEM, and the true latent variables f for comparison) for different dimensions $d_x$ aggregated across all distributions and trends used in the simulation study separated for conditions without (uncrossed) and including cross-loadings and cross correlations in $\Lambda_x$, $\Psi_x$, and $\Psi_y$. BFS and BFS$_\text{uc}$ are equivalent for uncrossed data.
