1 Introduction
The issue of interpretability is one of widespread concern when modeling outcomes of interest across many different disciplines. At the extremes, approaches like large language or deep learning models can be composed of millions (or even billions) of parameters, none of which can be meaningfully interpreted individually. These issues are not isolated to large machine learning approaches, however, and filter down to more common linear models within the social and behavioral sciences. Standard models are often fit using equations that optimize numerical stability and have relatively simple fit functions—a crucial feature that allowed these models to be feasibly estimated before modern computing power—rather than optimizing the interpretability of the parameters obtained. While simple linear effects solve for both ease of fit and interpretability, they are limited in their ability to test more complex and specific substantive hypotheses (e.g., timing, inflection points). Some fields have sought a balance between these extremes. For instance, the field of cognitive computational modeling has developed a wide array of mathematical expressions that seek to describe mental state representations guiding overt action (Farrell & Lewandowsky, 2018; Wilson & Collins, 2019). While these expressions are relatively complex, a strong emphasis of computational modeling is that the parameters of these models are linked with specific cognitive or behavioral processes. As such, individual (or group) differences in these parameters can ideally be linked directly back to cognitive or neural processes of interest (e.g., Mareschal & Thomas, 2007; Patzelt et al., 2018; Pleskac et al., 2019; Wilson & Collins, 2019). Outside of these computational models, however, there has been only slow progress in the adoption of interpretable parameter models.
The bio-behavioral and clinical sciences are dominated by the use of linear models—specifically, models which are linear with respect to the parameters, including popular polynomial models (e.g., quadratic and cubic) which chart out a nonlinear relationship. These linear parameter models have many attractive features for estimating relationships between variables—they are identified across an infinite range of parameter values associated with the predictors (i.e., $\beta$'s), they are purely additive in form, and they are widely implemented in available software. As such, results can be easily and efficiently obtained, often with closed-form solutions (e.g., ordinary least squares regression) or other well-behaved likelihood functions. Unfortunately, the parameters of these models often do not test specific hypotheses that are of substantive interest (Cudeck & du Toit, 2002; Preacher & Hancock, 2015; Ram & Grimm, 2007). To address these issues, prior work has derived alternative nonlinear expressions (Cudeck & du Toit, 2002; McNeish et al., 2021), or worked to reparameterize known nonlinear expressions into linear forms (Blozis, 2004; Feng et al., 2019; Grimm et al., 2013; Johnson & Hancock, 2019; Preacher & Hancock, 2012, 2015; Zhang et al., 2012). Unfortunately, these models do not appear as standard options in major software packages, and many applied researchers remain unaware of their potential utility. Moreover, nonlinear expressions present additional estimation challenges—especially in growth modeling contexts with random effects—and for these reasons they largely remain the province of researchers with training in and access to more advanced statistical methods and software options.
Here, I address several extant issues in formulating polynomial models with interpretable and meaningful parameters, with an eye toward expanding the utility and accessibility of these approaches. First, I review the history of interpretable parameter models and walk through a general approach for deriving new parameters of interest, highlighting the quadratic form outlined by Cudeck and du Toit (2002). I then extend these principles to derive two alternative forms of a cubic polynomial with meaningful parameters and show how this new model is related to the standard linear parameter version. I also discuss a multiphase version of this model which can serve as an approximation of S-shaped nonlinear models (e.g., logistic functions). To address the common estimation issues with nonlinear versions of these alternative models, I lay out an approach of linear estimation with nonlinear inference (LENI), where the standard linear parameter model is estimated, and the results are then transformed post hoc into the parameters of interest from the nonlinear alternative models. I derive transformation equations for the point estimates and standard errors of fixed, random, and conditional effects, allowing inferences to be made on the meaningful parameters as if we had directly estimated the nonlinear equation. Finally, extending prior work (Feng et al., 2019; Preacher & Hancock, 2015), I derive a linearized structural equation model for all of the models discussed, focusing on implementation in freely available software. Throughout, I discuss these models largely in the context of growth models using mixed-effects multilevel or latent curve structural equation models (McCormick et al., 2023; McNeish & Matta, 2018; Meredith & Tisak, 1990), with artificial and real data examples, but the fixed effects derivations apply equally to traditional regression analysis with no additional variance components. I end with a discussion of implementation options for applied researchers and open avenues for future work in this area.
2 Interpretable parameter models
In efforts to address mismatches between theoretical and statistical models, prior work has focused on deriving new expressions which equivalently trace out the same nonlinear curves as standard polynomial models, but using parameters that more closely match theoretical quantities of interest. A prime example of these efforts is work by Cudeck and du Toit (2002), who derived a new quadratic expression with meaningful parameters. The familiar linear parameter version of the quadratic for repeated measures outcome $y_{ti}$ is
$$y_{ti} = \beta_{0} + \beta_{1} x_{ti} + \beta_{2} x_{ti}^{2} + \varepsilon_{ti} \tag{1}$$
for person i at time t, where $\beta_{1}$ and $\beta_{2}$ are the linear and quadratic effects of the covariate $x_{ti}$, respectively, while the alternative expression takes the following form (Cudeck & du Toit, 2002):
$$y_{ti} = \alpha_{y} + \left(\alpha_{0} - \alpha_{y}\right)\left(1 - \frac{x_{ti}}{\alpha_{x}}\right)^{2} + \varepsilon_{ti} \tag{2}$$
Here, $\alpha_{x}$ and $\alpha_{y}$ represent the $(x,y)$ location of the vertex (i.e., peak or trough) of the quadratic parabola, while $\alpha_{0}$ is equivalent to $\beta_{0}$ in Equation 1—that is, the predicted level of y when $x = 0$. This alternative quadratic model is nonlinear with respect to the parameters (e.g., $\alpha_{x}$ is in the denominator) but otherwise describes the exact same parabolic shape as in Equation 1 (note that we can ignore the residual term, $\varepsilon_{ti}$, here when discussing alternative models because, if we have done our job correctly, it will be identical across model versions). The advantage of Equation 2 is that the $\alpha_{x}$ and $\alpha_{y}$ parameters give a direct estimate of the location of the vertex, which in developmental contexts might relate to the timing of changes in sensitivity to the external environment (Braams et al., 2015; McCormick et al., 2021; Nunes et al., 2020; Orben et al., 2022; Shaw et al., 2008; Somerville et al., 2013), or reflect the optimal arousal levels or dosage needed to maximize the desired response (Chaiken, 1994; Cudeck & du Toit, 2002; Preacher & Hancock, 2015). Cudeck and du Toit (2002) also described a second alternative model
$$y_{ti} = \alpha_{0} + \gamma\left[1 - \left(1 - \frac{x_{ti}}{\alpha_{x}}\right)^{2}\right] + \varepsilon_{ti} \qquad \text{or} \qquad y_{ti} = \alpha_{y} - \gamma\left(1 - \frac{x_{ti}}{\alpha_{x}}\right)^{2} + \varepsilon_{ti} \tag{3}$$
where $\gamma$ represents the difference in the level of the outcome y between the intercept and the vertex ($\gamma = \alpha_{y} - \alpha_{0}$) rather than estimating $\alpha_{y}$ (Equation 3, left) or $\alpha_{0}$ (right) directly. To highlight the fact that we can retain any combination of meaningful parameters that we wish, we can also lay out another alternative quadratic form below where we estimate $\alpha_{x}$ and $\alpha_{y}$ but retain $\beta_{2}$ as $\alpha_{c}$ (i.e., half of the acceleration, which controls the degree of curvature of the parabola) rather than $\alpha_{0}$:
$$y_{ti} = \alpha_{y} + \alpha_{c}\left(x_{ti} - \alpha_{x}\right)^{2} + \varepsilon_{ti} \tag{4}$$
Detailed derivations for all of these models are available in the Supplementary Material.
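One way to verify the equivalence of these alternative expressions (and to preview where the transformation rules used later come from) is to expand the vertex form in Equation 4 and match coefficients against Equation 1; this is a brief sketch assuming the vertex-form parameterization written above, and the algebra for the other forms proceeds analogously:
$$\alpha_{y} + \alpha_{c}\left(x_{ti} - \alpha_{x}\right)^{2} = \left(\alpha_{y} + \alpha_{c}\alpha_{x}^{2}\right) - 2\alpha_{c}\alpha_{x}\, x_{ti} + \alpha_{c}\, x_{ti}^{2},$$
so that $\beta_{0} = \alpha_{y} + \alpha_{c}\alpha_{x}^{2}$, $\beta_{1} = -2\alpha_{c}\alpha_{x}$, and $\beta_{2} = \alpha_{c}$. Inverting these relationships gives $\alpha_{x} = -\beta_{1}/(2\beta_{2})$ and $\alpha_{y} = \beta_{0} - \beta_{1}^{2}/(4\beta_{2})$, which is the direction of translation exploited by the LENI approach described below.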
2.1 Deriving new quantities
While these parameters represent mathematically meaningful points on a quadratic parabola (e.g., the location of the vertex), we might also wish to understand something about the curve at a particular value of x. Examples of this kind of question include the number of words acquired by 20 months-of-age (Huttenlocher et al., 1991), or the level of drug and alcohol use upon entry to university (Derefinko et al., 2016). This is an interesting inversion of time-to-criterion models (Johnson & Hancock, 2019), where the focus is on estimating how long it takes for some outcome to reach a pre-defined level, or of nonlinear models for measuring potential (e.g., McNeish & Dumas, 2017), where the rate of approach toward an average or individual-level learning “capacity” is of interest. Here, we might instead be interested in the group- or individual-level status achieved on some outcome by a certain developmental milestone. We can use this example to highlight how to go about translating new quantities of interest into statistical parameters that we can estimate from alternative model expressions. For a quadratic of the form derived by Cudeck and du Toit (2002), we can express the level of the outcome at any pre-defined value of $x = s$ through the parameter $\alpha_{s}$ with the equation
$$y_{ti} = \alpha_{y} + \left(\alpha_{s} - \alpha_{y}\right)\left(\frac{\alpha_{x} - x_{ti}}{\alpha_{x} - s}\right)^{2} + \varepsilon_{ti} \tag{5}$$
To emphasize, s is predefined when estimating the model based on a theoretically interesting value of x, and is not estimated as a unique parameter. This could alternatively be accomplished as part of a data management step, where s is subtracted from x before entering the model. In that approach, the original expression in Equation 2 should be used, as the identity of $\alpha_{0}$ has been changed through centering (Aiken & West, 1991) rather than through the model equation.
It might occur to the reader that this new parameter $\alpha_{s}$ in particular is easily obtained by simply entering the theoretically interesting value of $x = s$ into the linear parameter quadratic equation (Equation 1) and generating a predicted value of y. However, this approach only returns a point estimate for $\alpha_{s}$ and not a standard error or random effect variance; the absence of any measure of uncertainty should preclude us from making inferences on that predicted value (although applied research frequently does so). For instance, many developmental applications calculate and interpret the vertex of a quadratic trajectory (Eggleston et al., 2004; Giedd et al., 2015; Lenroot et al., 2007; LeWinn et al., 2017) or the inflection of a cubic (LeWinn et al., 2017; Mills et al., 2016) as a quantity of interest without also obtaining a standard error. Without some measure of uncertainty, interpreting these point estimates can lead to erroneous conclusions about developmental timing, or about differences among individuals or groups thereof (Giedd et al., 2015; Karriker-Jaffe et al., 2008; Pfefferbaum et al., 2018). We will return to the issue of obtaining these measures of uncertainty (see Linear estimation, nonlinear inference (LENI)).
When considering new interpretable quantities, it is important that we be able to generate specific mathematical definitions. Assuming that we can do this, we can generate an infinite combination of meaningful model expressions that directly estimate parameters of interest. In the following section, I will build on the principles outlined here to derive nonlinear alternatives to the standard cubic model, with interpretable parameters linked to meaningful developmental phenomena.
3 Deriving alternative cubic models
As a natural extension of the quadratic, the cubic polynomial is another option for modeling nonlinear change over time. The common expression of the cubic model for outcome y for person i at time t is as follows:
$$y_{ti} = \beta_{0} + \beta_{1} x_{ti} + \beta_{2} x_{ti}^{2} + \beta_{3} x_{ti}^{3} + \varepsilon_{ti} \tag{6}$$
which is linear with respect to the parameters. However, as with the quadratic expression (Cudeck & du Toit, 2002), the parameters of this model are not readily identifiable with a specific developmental feature that might be of theoretical interest. The lower-order terms $\beta_{1}$ and $\beta_{2}$ are conditional effects specific to where $x = 0$, and $\beta_{3}$ is not easily expressible in meaningful terms (i.e., the change in the acceleration of the curve) for most researchers. Here I draw out two alternative expressions of the linear parameter cubic model with theoretically interesting parameters. In its initial form, this model expression is best suited to cases where both extrema (i.e., the local minimum and maximum) occur within the range of x, although that condition is not necessary for this alternative expression to obtain meaningful results. For modeling increases which subsequently plateau (e.g., Somerville et al., 2013), the multiphase form outlined in Multiphase cubic model for S-shaped trajectories would likely be preferable.
3.1 Expressing meaningful model parameters
For defining the alternative form of the cubic formula, we can consider 4 parameters of interest: two location parameters—$x_{N}$ and $y_{N}$—which locate the inflection point (N) of the curve, and two stretch parameters—$\delta$ (delta) and h (height)—which determine the horizontal and vertical distances, respectively, between the inflection point and the extrema of the cubic function. These parameters have well-defined geometric properties and can be used to solve polynomials in terms of their roots (Nickalls, 1993). To obtain $x_{N}$, we take the second derivative of the linear parameter model
$$\frac{\partial^{2} y_{ti}}{\partial x_{ti}^{2}} = 2\beta_{2} + 6\beta_{3} x_{ti} \tag{7}$$
which can be set to zero and rearranged such that
$$x_{N} = -\frac{\beta_{2}}{3\beta_{3}} \tag{8}$$
which nicely resembles the form of $\alpha_{x}$ from the quadratic expression (Cudeck & du Toit, 2002). For full details on these parameter derivations, see the Supplementary Material.
The value obtained at the inflection point ($y_{N}$) is expressed by evaluating Equation 6 at Equation 8 and simplifying
$$y_{N} = \beta_{0} - \frac{\beta_{1}\beta_{2}}{3\beta_{3}} + \frac{2\beta_{2}^{3}}{27\beta_{3}^{2}} \tag{9}$$
Now that we have identified the location parameters of the inflection point for the cubic function, we need to define the stretch parameters. The most convenient way to do so is to lay out expressions which identify the extrema—or local maximum and minimum—and define our parameters as the distances between the inflection point and the extrema. A cubic has two extrema locations ($x_{\mathit{extrema}}$), but since the function is symmetric about its inflection point, we only need to derive a single expression for $\delta$ or h that will define both the positive and negative distance. A convenient way to obtain $\delta$ is to set the first derivative of the cubic equal to 0
$$\frac{\partial y_{ti}}{\partial x_{ti}} = \beta_{1} + 2\beta_{2} x_{ti} + 3\beta_{3} x_{ti}^{2} = 0 \tag{10}$$
and solve using the quadratic formula, where $a = 3\beta_{3}$, $b = 2\beta_{2}$, and $c = \beta_{1}$
$$x_{\mathit{extrema}} = \frac{-2\beta_{2} \pm \sqrt{4\beta_{2}^{2} - 12\beta_{1}\beta_{3}}}{6\beta_{3}} \tag{11}$$
which, through substitution and simplification (see the Supplementary Material), results in
$$x_{\mathit{extrema}} = -\frac{\beta_{2}}{3\beta_{3}} \pm \frac{\sqrt{\beta_{2}^{2} - 3\beta_{1}\beta_{3}}}{3\beta_{3}} \tag{12}$$
which means that
$$x_{\mathit{extrema}} = x_{N} \pm \delta \tag{13}$$
and that
$$\delta = \frac{\sqrt{\beta_{2}^{2} - 3\beta_{1}\beta_{3}}}{3\beta_{3}} \tag{14}$$
We can enforce positive distance values by defining $\delta = \sqrt{\delta^{2}}$ where
$$\delta^{2} = \frac{\beta_{2}^{2} - 3\beta_{1}\beta_{3}}{9\beta_{3}^{2}} \tag{15}$$
Finally, we define h as the difference between the y value obtained at the extrema ($y_{x_{N} \pm \delta}$) and the inflection point ($y_{N}$)
$$h = y_{x_{N} \pm \delta} - y_{N} = \left[\beta_{0} + \beta_{1}\left(x_{N} \pm \delta\right) + \beta_{2}\left(x_{N} \pm \delta\right)^{2} + \beta_{3}\left(x_{N} \pm \delta\right)^{3}\right] - y_{N} \tag{16}$$
Simplifying this expression (see Supplementary Material for full details) results in
$$h = -2\beta_{3}\delta^{3} \tag{17}$$
One final meaningful parameter we might define is the slope of the cubic function at the inflection point, $\beta_{N}$. Here we evaluate the first derivative of Equation 6 at the inflection point and substitute
$$\beta_{N} = \beta_{1} - \frac{\beta_{2}^{2}}{3\beta_{3}} = -3\beta_{3}\delta^{2} \tag{18}$$
A schematic of these parameters is displayed in Figure 1. Note that the sign of h (or alternatively $\beta_{N}$) determines whether the cubic is an overall increasing ($h < 0$) or decreasing ($h > 0$) function (the direction of change locally between the extrema is opposite to the direction of change globally).

Figure 1 The interpretable parameters are superimposed on an idealized cubic function. Location parameters ($x_{N}$ and $y_{N}$) are indicated with dashed lines, stretch parameters ($\delta$ and h) are indicated by single-headed arrows over the relevant distance, and the slope of the tangent at the inflection point ($\beta_{N}$) is indicated by a single-headed arrow tracing the tangent line at that point.
As we saw with the quadratic, we have a great deal of flexibility in terms of which nonlinear parameters we wish to use, so long as we can derive these transformation functions. Importantly, however, between any two given sets of parameters, the transformation functions are unique and deterministic because we must maintain the same functional form for the equations defined by the two parameter sets to be equivalent.
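As a quick numerical check on these relationships, the short sketch below (plain Python with NumPy; the coefficient values are hypothetical and not from any model reported here) recovers the interpretable parameters from an arbitrary set of linear cubic coefficients and verifies that the first derivative vanishes at $x_{N} \pm \delta$:

```python
import numpy as np

# Hypothetical linear-parameter cubic: y = b0 + b1*x + b2*x^2 + b3*x^3
b0, b1, b2, b3 = 10.0, -2.0, 0.6, 0.2

def cubic(x):
    return b0 + b1 * x + b2 * x**2 + b3 * x**3

def first_derivative(x):
    return b1 + 2 * b2 * x + 3 * b3 * x**2

# Transformations described in the text
x_N = -b2 / (3 * b3)                                   # x-location of the inflection point
y_N = cubic(x_N)                                       # y-location of the inflection point
delta = np.sqrt(b2**2 - 3 * b1 * b3) / (3 * abs(b3))   # positive distance to the extrema
h = cubic(x_N + delta) - y_N                           # vertical distance from inflection to an extremum
beta_N = first_derivative(x_N)                         # slope of the tangent at the inflection

# The extrema should sit at x_N +/- delta (derivative approximately zero there)
print(np.isclose(first_derivative(x_N + delta), 0.0),
      np.isclose(first_derivative(x_N - delta), 0.0))
print({"x_N": x_N, "y_N": y_N, "delta": delta, "h": h, "beta_N": beta_N})
```

With $\beta_{3} > 0$ in this example, the computed h is negative, matching the sign convention noted above for an overall increasing cubic.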
3.2 Defining the nonlinear model
Now we can substitute the expressions for the meaningful parameters ($x_{N}$, $y_{N}$, $\delta$, h) into the linear parameter model (Equation 6) to derive the nonlinear expression for an interpretable cubic.

which results in
$$y_{ti} = y_{N} + \frac{h}{2}\left[3\left(\frac{x_{ti} - x_{N}}{\delta}\right) - \left(\frac{x_{ti} - x_{N}}{\delta}\right)^{3}\right] + \varepsilon_{ti} \tag{20}$$
Alternatively, substituting Equation 18 instead of Equation 17 results in
$$y_{ti} = y_{N} + \beta_{N}\left[\left(x_{ti} - x_{N}\right) - \frac{\left(x_{ti} - x_{N}\right)^{3}}{3\delta^{2}}\right] + \varepsilon_{ti} \tag{21}$$
For complete details on how we arrive at these expressions, see the Supplementary Material. Note that these expressions are equivalent to the linear parameter model (Equation 6), and should fit the same functional form. We can use a real data example to see how this model fits in empirical settings.
3.2.1 Data example 1: Negative affect and aging
To briefly highlight how Equation 20 produces meaningful inferences, we can turn to an empirical example. Here I extracted aggregated scatter plot data from Teachman (2006) (see Figure 2) and fit cubic models using Equation 6 and Equation 20 to model age-related differences in negative affect in adults aged 17–93.

Figure 2 Cubic age-related changes in negative affect across the adult lifespan. The fitted cubic relationship is plotted over the raw data points.
Results of the two models are presented in Table 1. While both models have the same fit to the data (see the log-likelihood and BIC values; the standard $R^{2}$ is only available in the linear parameter model), it is immediately apparent that the parameter values in the nonlinear parameter model have intuitive meaning—the inflection point in negative affect occurs around 57 years-of-age, and negative affect decreases by approximately 0.4 units across the 21 years between either extremum ($x_{N} \pm \delta$) and the inflection point. By contrast, the values of the linear parameter model are virtually meaningless from a substantive standpoint without plotting. To emphasize, neither model is wrong—indeed, they imply identical curves—however, the nonlinear parameter model is more useful if our goal is to interpret meaningful points in the age-related change with precision (rather than the frequent practice of “eyeballing” a plot) and, crucially, with the appropriate level of uncertainty (via the standard errors).
Table 1 Fitting linear and nonlinear parameter cubic models

Note: * p $<$ 0.05, ** p $<$ 0.01, *** p $<$ 0.001; $\ell$ is the log-likelihood, k is the number of model parameters.
3.3 Multiphase cubic model for S-shaped trajectories
While the cubic model in its full form offers the ability to derive meaningful conclusions about developmental patterns, cubic polynomials can also be used to capture plateaus (e.g., Somerville et al., 2013) because of the saddle-point of the cubic. However, relatively little is typically made inferentially of the renewed acceleration after that plateau, even though any full cubic will, in principle, continue off to infinity. If we want to better approximate developmental plateauing, or other S-shaped functional forms (e.g., logistic, Gompertz), then a multiphase (or piecewise) version of the cubic may offer an attractive alternative. McNeish et al. (2021) showed that two reparameterized quadratics (Cudeck & du Toit, 2002) can be linked together at the inflection point of a hypothetical S-shaped functional form in a multiphase polynomial model (Cudeck & Klebe, 2002; Flora, 2008). Additional pieces are specified such that once these linked quadratics reach their model-implied vertices ($\alpha_{x}$), they stay fixed at the y value obtained at the vertex ($\alpha_{y}$; see McNeish et al., 2021, for full details).
We can use a similar approach here, taking advantage of the interpretable parameters I derived in Equation 20. If we take the case of a 4-parameter logistic function as an exemplar of an S-shaped curve, this function is defined for outcome y for person i at time t by
$$y_{ti} = A_{lower} + \frac{A_{upper} - A_{lower}}{1 + \exp\left[-\mathit{Hill}\left(x_{ti} - x_{N}\right)\right]} + \varepsilon_{ti} \tag{22}$$
where $A_{upper}$ and $A_{lower}$ are the upper and lower asymptotes, $x_{N}$ is the x-location of the inflection point, and $\mathit{Hill}$ is related to the steepness of change at the inflection point. These parameters map conceptually onto several of the interpretable cubic parameters (Equation 20 or Equation 21) quite nicely, with $x_{N}$ capturing the inflection point, $\beta_{N}$ capturing the rate of change at that point, and the upper and lower asymptotes approximated by $y_{N} \pm h$. I use “approximate” advisedly because the multiphase cubic actually obtains the values of $y_{N} \pm h$, while Equation 22 only ever asymptotically approaches $A_{upper}$ and $A_{lower}$. We can define the multiphase cubic function for the same $y_{ti}$ as

or alternatively

A schematic of the multiphase model with the relevant features highlighted can be seen in Figure 3.

Figure 3 Multiphase cubic model. A) Using a multiphase cubic function (Equation 23 or Equation 24), we can approximate an S-shaped function with three components. The component between onset and offset is defined by the cubic function, while outside this range the function is defined by $y_{N} \pm h$. B) Alternative models were fit to pubertal development data, including a 4-parameter logistic (red), the multiphase cubic (green), and the standard linear parameter cubic model (blue). The logistic and multiphase models do not enforce continued acceleration at the edges of the curve—an advantage over the standard cubic. While the logistic model continues increasing asymptotically, the multiphase cubic models stability outside of the cubic extrema.
3.3.1 Data example 2: Pubertal development
To illustrate this model, I drew pubertal data from an accelerated longitudinal study of adolescent development (BrainTime; Braams et al., 2015; McCormick et al., 2021). Adolescents were assessed up to three times at two-year intervals and self-reported on their level of pubertal development (Petersen et al., 1988). Prior work (Braams et al., 2015) determined that a standard cubic model fit these data better than linear and quadratic polynomial alternatives. Here, I fit three random-intercept growth models to these data: 1) a 4-parameter logistic curve (4-PL; Equation 22), 2) the multiphase cubic model (Equation 23), and 3) a standard linear parameter version of the cubic (Equation 6). I modeled $A_{lower}$, $y_{N}$, and $\beta_{0}$, respectively, as random terms to equate the complexity of the random effects structure across the three models (code and full results for all three models are available in the Supplementary Material).
The primary fixed effects of interest in each model are presented in Table 2. As reflected in Figure 3, the three models achieve reasonably similar results in terms of fitting a curve to the data. However, the BIC indicates that the 4-PL and multiphase cubic models both fit better than the standard cubic function, due in large part to their different edge behavior (they asymptote or offset rather than continuing to accelerate). From a model interpretation standpoint, the 4-PL and multiphase cubic have clearly interpretable parameters that match substantively interesting features of the developmental trajectory. The estimates from these two models largely support the same conclusions about both the levels of y at the asymptotes and where the inflection point is located. I additionally fit Equation 24 to compute $\beta_{N}$ (0.491, SE = 0.034, p < 0.001), which is expressed directly as the slope of the tangent at the inflection point, unlike the $\mathit{Hill}$ parameter, which controls the overall shape of the curve in the 4-PL model but is not in easily interpretable units.
Table 2 Alternative S-shaped trajectories

Note: * p $<$ 0.05, ** p $<$ 0.01, *** p $<$ 0.001; $\ell$ is the marginal (Marg.) log-likelihood, k is the number of model parameters.
There are some additional theoretical advantages of the multiphase cubic compared with the 4-PL model. Unlike the asymptotic nature of the logistic, the multiphase cubic here and the multiphase quadratic outlined by McNeish et al. (2021) actually obtain the minimum and maximum model-implied values. For outcomes measured on some natural scale—as opposed to the probability scale, where asymptotic behavior is desirable—these multiphase models likely represent more realistic developmental conditions, especially when the scale has a natural or measurement boundary (i.e., floor or ceiling; Feng et al., 2019). For instance, pubertal development is not an infinitely occurring process—that is, pre- and post-puberty are meaningful terms. Other processes, such as cortical thinning (Fuhrmann et al., 2022; Mills et al., 2016; Tamnes et al., 2017), also reach some (at least local) minimum before other processes take over into the adult phase of development. Here the multiphase cubic could even be extended, with additional growth components modeled in the offset period, or used to model processes of punctuated equilibrium, where change occurs in bursts followed by periods of stability, by linking several multiphase cubic functions. This approach is familiar from generalized additive (mixed) models, where cubic splines have featured prominently (e.g., Wood, 2000). However, unlike GAMs, which focus on description and prediction, the multiphase cubic model maintains a primary focus on an explanatory model with interpretable parameters. Nevertheless, GAMs can be valuable tools for conducting sensitivity analyses on interpretable parameter models, to ensure that data-driven curves descriptively resemble the a priori functional forms fit to the data. Substantial departures from these imposed forms can reveal whether theory-driven models are indeed suitable for the data at hand. For the various empirical examples, this GAM-based sensitivity approach was used to confirm the suitability of the functional forms used in the analysis (see the Supplementary Code for these results).
4 Linear estimation, nonlinear inference (LENI)
Thus far, I have outlined a history of efforts to improve the interpretability of models by deriving alternative forms with theoretically meaningful parameters, including applications of the reparameterized quadratic (Cudeck & du Toit, 2002; Grimm et al., 2013; McNeish et al., 2021; Preacher & Hancock, 2015), and shown how to build on these principles to derive new quantities of interest. I then extended these ideas to the cubic model, deriving a set of new expressions with 5 interpretable parameters (Equation 20 and Equation 21), and a multiphase version of the cubic to model S-shaped curves. However, despite clear theoretical advantages and substantive interest in quantities like developmental “peaks” in the literature, these alternative models have remained largely restricted to advanced applications. In the remainder of this paper, I lay out current challenges for the adoption of meaningful parameter models, and then offer a set of analytic approaches to address key limitations. The techniques discussed—which involve linear estimation with nonlinear inference (LENI)—apply to standard regression analysis as well as to the major classes of longitudinal models, including mixed-effects (multilevel) and structural equation growth models (McCormick et al., 2023; McNeish & Matta, 2018).
4.1 Limitations of nonlinear models
While there are strong theoretical arguments for the use of interpretable parameter models, a number of technical challenges have historically been barriers to their widespread adoption. Many of these challenges stem from the relative difficulty of estimation in nonlinear parameter models compared to their linear parameter alternatives. First, linear parameter models are defined across all possible values of the parameters, whereas many nonlinear equations are undefined at specific values. For instance, the alternative quadratic (Cudeck & du Toit, 2002; Equation 2) is undefined at $\alpha_{x} = 0$, and the alternative cubic (Equation 20) is undefined at $\delta = 0$. As parameter estimates approach these values, the model-implied values of the outcome become increasingly unstable. By contrast, in the linear parameter models (Equation 1; Equation 6), any parameter $\beta_{p} = 0$ merely indicates an absence of that component of the polynomial function. This issue is not universal for alternative expressions—the additional alternative quadratic derived in Equation 4 is defined at all parameter values—but it is a common problem with nonlinear equations.
An additional disadvantage of nonlinear alternative models is the lack of a clear hierarchical structure among the parameters of the model, both for the fixed effects and for the ordering of random effects (McNeish et al., 2021). For instance, as mentioned above, when a linear parameter $\beta_{p} = 0$, this is informative about the complexity of the curve. If $\beta_{3} = 0$, then the cubic function reduces to a quadratic one, and so forth. By contrast, the nonlinear alternative (Equation 20) cannot be reduced to a quadratic by setting any single parameter to 0, nor can we use nested model comparisons (i.e., likelihood ratio tests) to determine the optimal polynomial complexity of the curve. The lack of hierarchical structure additionally complicates model specification when the random effects structure needs to be constrained to achieve convergence (McNeish & Bauer, 2022). In the linear parameter model, we would typically constrain random effects from the highest-order ($\tau_{\beta_{3}}$) to the lowest ($\tau_{\beta_{1}}$). In the interpretable parameter model, however, the ordering of which random effects to constrain is less clear. The parameter transformation equations might give us some sense of a reasonable ordering—for instance, $y_{N}$ is the only interpretable parameter that is a function of the intercept ($\beta_{0}$), and $x_{N}$ is the only parameter that is not a function of $\beta_{1}$. This might suggest that $x_{N}$ be constrained first and that $y_{N}$ always be modeled as random, but this ordering is much less clear than in the linear parameter case.
Finally, two related challenges stem from the relative difficulty of estimating nonlinear parameter models compared to linear parameter alternatives. Even in the relatively simple regression case, nonlinear models lack a closed-form solution and require iterative fitting procedures (Fox & Weisberg, 2011). These estimation challenges only increase when fitting mixed-effects or structural equation growth models, where likelihoods in nonlinear models can be more poorly behaved (S. A. Blozis, 2007) and more likely to result in local solutions, in addition to the challenges of improper solutions (e.g., $\delta \to 0$). Due in part to these challenges, nonlinear parameter models are often not easily accessible to applied researchers within widely available software packages, requiring custom syntax to implement these non-standard model equations. Paired with limited awareness of alternative interpretable models, this creates a negative feedback loop, where software providers are not incentivized to develop resources for models that users are unlikely to fit.
The linear estimation with nonlinear inference (LENI) approach laid out in the following sections addresses each of these challenges, drawing on the strengths of estimation and simplicity in the linear parameter model but without sacrificing the interpretability of the nonlinear alternative models.
4.2 LENI for fixed effects
The idea of deriving some meaningful quantity from an estimated linear parameter model is not without precedent. Substantive applications have occasionally used the formulations for $\alpha_{x}$ and $\delta$ to identify peaks in developmental trajectories (e.g., Braams et al., 2015; LeWinn et al., 2017), especially when comparing trajectories across groups (often between males and females; e.g., Giedd et al., 2015). However, these formulations only return a point estimate without an associated standard error, nor do they allow for conditional effects of covariates of interest predicting the interpretable parameters. In the following sections, I lay out a general approach for deriving the meaningful parameters of the nonlinear alternative models discussed throughout, entirely from the results of a linear parameter model for the same functional form (quadratic or cubic). Note that for the fixed effects, these transformations apply equally to standard regression, mixed-effects (or multilevel) models, and structural equation models. For SEM growth models specifically, I return to other, more direct, LENI approaches at the end of this treatment.
When considering fixed effects, we will estimate the following linear parameter models. Here we consider a standard regression model with no random effects, but these approaches generalize to models with random effects structures without alteration (as shown in later sections). Thus, for person i,
$$y_{i} = \beta_{0} + \beta_{1} x_{i} + \beta_{2} x_{i}^{2} + \varepsilon_{i} \qquad \text{and} \qquad y_{i} = \beta_{0} + \beta_{1} x_{i} + \beta_{2} x_{i}^{2} + \beta_{3} x_{i}^{3} + \varepsilon_{i}$$
4.2.1 Point estimates
To obtain point estimates for the interpretable parameters, we need only apply the relationships derived previously by Cudeck and du Toit (2002) and here in Deriving alternative cubic models. Namely, for the alternative quadratic model, the expressions for the interpretable parameters are as follows:
$$\alpha_{x} = -\frac{\beta_{1}}{2\beta_{2}}, \qquad \alpha_{y} = \beta_{0} - \frac{\beta_{1}^{2}}{4\beta_{2}}, \qquad \alpha_{0} = \beta_{0}, \qquad \gamma = -\frac{\beta_{1}^{2}}{4\beta_{2}}, \qquad \alpha_{c} = \beta_{2}$$
and for the alternative cubic model, the expressions are
$$x_{N} = -\frac{\beta_{2}}{3\beta_{3}}, \qquad y_{N} = \beta_{0} + \beta_{1} x_{N} + \beta_{2} x_{N}^{2} + \beta_{3} x_{N}^{3}, \qquad \delta = \frac{\sqrt{\beta_{2}^{2} - 3\beta_{1}\beta_{3}}}{3\beta_{3}}, \qquad h = -2\beta_{3}\delta^{3}, \qquad \beta_{N} = \beta_{1} - \frac{\beta_{2}^{2}}{3\beta_{3}}$$
Nothing more need be done to obtain these point estimates beyond some algebra. One note of caution, however, is that we can in theory compute more interpretable parameter point estimates than the number of parameters we estimated in the linear model. On one hand, the interpretable parameters are not independent (there is an especially tight relationship between $\gamma$ and $\alpha_{y}$, and between h and $\beta_{N}$, for instance), so we would merely be repackaging the same information, which supports only the same conclusions. On the other hand, I recommend that best practice is to focus interpretation only on a set of parameters that would be estimated in a single nonlinear model (e.g., h or $\beta_{N}$, not both) to avoid over-extracting the results. Selecting which set of interpretable parameters to extract should be guided by the conclusions that researchers wish to draw, much as model selection proceeds generally.
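To make the transformation step concrete, the sketch below applies the cubic expressions to a coefficient vector from a fitted linear-parameter model (the values and the coefficient ordering of intercept, x, x², and x³ are hypothetical rather than output from a particular package):

```python
import numpy as np

def leni_cubic_estimates(beta):
    """Transform linear-parameter cubic coefficients into the interpretable parameters.

    beta : sequence of (b0, b1, b2, b3) from a fitted cubic model.
    Returns x_N, y_N, delta, h, and the tangent slope at the inflection (beta_N).
    """
    b0, b1, b2, b3 = beta
    x_N = -b2 / (3 * b3)
    y_N = b0 + b1 * x_N + b2 * x_N**2 + b3 * x_N**3
    delta = np.sqrt(b2**2 - 3 * b1 * b3) / (3 * abs(b3))
    h = -2 * b3 * delta**3              # sign convention: h < 0 for an overall increasing cubic
    beta_N = b1 - b2**2 / (3 * b3)      # slope of the tangent at the inflection point
    return {"x_N": x_N, "y_N": y_N, "delta": delta, "h": h, "beta_N": beta_N}

# Hypothetical fixed effects from a linear-parameter cubic fit
print(leni_cubic_estimates([10.0, -2.0, 0.6, 0.2]))
```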
4.2.2 Standard errors
For these point estimates to be useful inferentially, we need a LENI approach for deriving the standard errors of the interpretable parameters. For an unknown quantity equal to a function of two quantities ($\theta_{1}$ and $\theta_{2}$) with known values and uncertainty (i.e., fixed effects and standard errors), the unknown uncertainty, $\mathrm{Var}(f(\theta_{1},\theta_{2}))$, can be approximated by the quadratic expression of partial derivatives:
$$\mathrm{Var}\left(f(\theta_{1},\theta_{2})\right) \approx \left(\frac{\partial f}{\partial \theta_{1}}\right)^{2}\mathrm{Var}(\theta_{1}) + \left(\frac{\partial f}{\partial \theta_{2}}\right)^{2}\mathrm{Var}(\theta_{2}) + 2\left(\frac{\partial f}{\partial \theta_{1}}\right)\left(\frac{\partial f}{\partial \theta_{2}}\right)\mathrm{Cov}(\theta_{1},\theta_{2})$$
Thus, for $x_{N}$, we can take Equation 8 and express the expected variance of the parameter as
$$\mathrm{Var}(x_{N}) \approx \left(\frac{\partial x_{N}}{\partial \beta_{2}}\right)^{2}\mathrm{Var}(\beta_{2}) + \left(\frac{\partial x_{N}}{\partial \beta_{3}}\right)^{2}\mathrm{Var}(\beta_{3}) + 2\left(\frac{\partial x_{N}}{\partial \beta_{2}}\right)\left(\frac{\partial x_{N}}{\partial \beta_{3}}\right)\mathrm{Cov}(\beta_{2},\beta_{3}),$$
where $\partial x_{N}/\partial \beta_{2} = -1/(3\beta_{3})$ and $\partial x_{N}/\partial \beta_{3} = \beta_{2}/(3\beta_{3}^{2})$. Taking the square root of $\mathrm{Var}(x_{N})$ yields the standard error, which can be used to compute a p-value or confidence interval as desired.
However, many of the nonlinear parameters are substantially more complex functions of the linear parameters. As such, this scalar equation approach quickly becomes tedious and error-prone. Alternatively, we can take a matrix approach and pre- and post-multiply the asymptotic covariance matrix of the linear parameter model by the Jacobian of partial derivatives with respect to each linear parameter for a given nonlinear parameter transformation expression (e.g., $\mathbf{J}_{x_{N}}$).
$$\mathrm{Var}(x_{N}) \approx \mathbf{J}_{x_{N}}^{\top}\,\mathrm{ACOV}\!\left(\hat{\boldsymbol{\beta}}\right)\mathbf{J}_{x_{N}}$$
If we expand the Jacobian with additional columns of partial derivatives corresponding to each nonlinear transformation, we can obtain the entire asymptotic covariance matrix of Equation 20 rather than only the variance of each parameter individually.
$$\mathrm{ACOV}\!\left(x_{N}, y_{N}, \delta, h\right) \approx \mathbf{J}^{\top}\,\mathrm{ACOV}\!\left(\hat{\boldsymbol{\beta}}\right)\mathbf{J}$$
The square root of the diagonal of the resulting matrix is the vector of standard errors for the interpretable nonlinear parameters.
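The matrix version is straightforward to implement; the sketch below builds the Jacobian numerically by finite differences (the coefficient vector and ACOV matrix are hypothetical placeholders for whatever a fitted linear model returns):

```python
import numpy as np

def cubic_to_interpretable(b):
    """Map (b0, b1, b2, b3) to (x_N, y_N, delta, h); same transformations as above."""
    b0, b1, b2, b3 = b
    x_N = -b2 / (3 * b3)
    y_N = b0 + b1 * x_N + b2 * x_N**2 + b3 * x_N**3
    delta = np.sqrt(b2**2 - 3 * b1 * b3) / (3 * abs(b3))
    h = -2 * b3 * delta**3
    return np.array([x_N, y_N, delta, h])

def numerical_jacobian(f, b, eps=1e-6):
    """Forward-difference Jacobian; rows index the linear parameters, columns the nonlinear ones."""
    b = np.asarray(b, dtype=float)
    f0 = f(b)
    J = np.zeros((b.size, f0.size))
    for p in range(b.size):
        step = np.zeros_like(b)
        step[p] = eps
        J[p] = (f(b + step) - f0) / eps
    return J

# Hypothetical results from a linear-parameter cubic fit
beta = np.array([10.0, -2.0, 0.6, 0.2])
acov_linear = np.diag([0.05, 0.02, 0.008, 0.001])    # stand-in ACOV of the fixed effects

J = numerical_jacobian(cubic_to_interpretable, beta)
acov_nonlinear = J.T @ acov_linear @ J               # pre- and post-multiply by the Jacobian
print("LENI standard errors:", np.sqrt(np.diag(acov_nonlinear)))
```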
While this analytic approach does very well at approximating the $\mathrm{ACOV}$ matrix that would have resulted from directly fitting the nonlinear model (the error of approximation is very small; see the Supplementary Code for complete details), we could alternatively use a bootstrapping approach to build an empirical standard error and confidence interval. This empirical alternative uses straightforward applications of the bootstrap, and thus I relegate the details of such a procedure to the Supplementary Code for interested readers. We will see that the bootstrapping approach is more useful for other components of the LENI approach in later sections.
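For completeness, a bare-bones case-resampling bootstrap for a single interpretable parameter might look like the following (simulated data and an OLS cubic fit stand in for whatever model and estimator are actually used; this is not the procedure from the Supplementary Code):

```python
import numpy as np

rng = np.random.default_rng(2024)

# Simulated data with a cubic trend (for illustration only)
n = 250
x = rng.uniform(-4, 4, size=n)
y = 10 - 2 * x + 0.6 * x**2 + 0.2 * x**3 + rng.normal(0, 2, size=n)

def fit_cubic(xb, yb):
    """OLS fit of y on (1, x, x^2, x^3); returns (b0, b1, b2, b3)."""
    X = np.column_stack([np.ones_like(xb), xb, xb**2, xb**3])
    return np.linalg.lstsq(X, yb, rcond=None)[0]

def x_N_from(b):
    return -b[2] / (3 * b[3])

# Nonparametric bootstrap: resample cases, refit the linear model, transform
boot = np.empty(2000)
for r in range(boot.size):
    idx = rng.integers(0, n, size=n)
    boot[r] = x_N_from(fit_cubic(x[idx], y[idx]))

print("Bootstrap SE(x_N):", boot.std(ddof=1))
print("95% percentile CI:", np.percentile(boot, [2.5, 97.5]))
```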
4.2.3 Simulation example 1: Fixed effects
Before moving on to the random effects derivations, I first demonstrate the performance of these derivations in a simulated example for both the quadratic and cubic models. Here data were simulated from the nonlinear equations (Equation 2 and Equation 20). The parameters of the quadratic are $\alpha_{0} = 1$, $\alpha_{x} = 2$, and $\alpha_{y} = 8$, resulting in a concave quadratic function with a vertex at $(x,y) = (2,8)$. The parameters of the cubic are $x_{N} = 0$, $y_{N} = 10$, $\delta = 3$, and $h = -2$, resulting in a cubic function where the local maximum occurs before the inflection point, the local minimum occurs after it, and the inflection point is at $(x,y) = (0,10)$. For both models, I simulated data for 250 individuals 1,000 times. I then fit the linear parameter model (linear estimates), generated LENI estimates through the transformations outlined in the prior sections, and then fit the nonlinear parameter model directly (nonlinear estimates) with the true parameter values as starting values to avoid estimation issues. The mean parameter values across all iterations for each approach can be compared in Table 3. Notably, the LENI and nonlinear point estimates and standard errors are nearly identical, with correspondence out to the 8th or 9th decimal place (see the Supplementary Material for full precision details).
Table 3 LENI approach to fixed effects estimation

Note: Parameter estimates and standard errors (in parentheses) are the mean values across 1,000 iterations of data generation and model fitting. Pop $\theta$ indicates the generating value for each parameter. Linear estimates indicate the fitted values from the linear parameter model. LENI estimates indicate the transformed estimates of the nonlinear parameter model based on the linear estimates. Nonlinear estimates indicate the fitted values from directly estimating the nonlinear parameter model. $R^{2}$ is the proportion of variance explained (only available in the linear parameter model), Marg. $\ell$ is the marginal log-likelihood, k is the number of model parameters, and BIC is the Bayesian information criterion.
4.3 LENI for random effects
While the fixed effects are often the focus of empirical studies, we can also develop a set of transformations for the random effects. Random effects allow for individual variability in the parameters of interest, and they form the basis for more complex models, including conditional models (Biesanz et al., 2004; Curran et al., 2004; Raudenbush & Bryk, 2002) and models with distal outcomes (McCormick et al., 2024). Here the target is to transform the covariance matrix of the random effects obtained in the linear parameter version of the model into the covariance matrix we would have obtained from fitting the nonlinear parameter model directly. I will also show how to obtain standard errors associated with these (co)variance estimates.
To assess the generality of the LENI approach, I consider a maximal random effects model for both the quadratic and cubic models—that is, all random effect variances (e.g., $\tau_{\alpha_{x}}$, $\tau_{\delta}$) and covariances (e.g., $\tau_{\alpha_{0},\alpha_{y}}$, $\tau_{y_{N},h}$). Substantive applications might restrict the full covariance matrix ($\mathbf{T}$ in MLM, or $\boldsymbol{\Psi}$ in SEM growth models), usually for reasons of under-identification, either theoretically due to too few individual repeated measures or empirically due to a non-positive definite full matrix. Here the LENI approach takes advantage of the clear hierarchical structure of the linear parameter model, and we can restrict the random effects variances in sequence from the highest-order ($\tau_{\beta_{3}}$) to the lowest ($\tau_{\beta_{1}}$) as indicated.
4.3.1 Variance estimates
In a beautiful bit of symmetry, we can apply the same approach used above to obtain the standard errors to obtain the variance–covariance matrix of the random effects for the nonlinear parameter model ($\mathbf{T}_{f(x_{N},y_{N},\delta,h)}$). That is, we can use the Jacobian of partial derivatives of each transformation function with respect to the linear parameters. However, instead of the $\mathrm{ACOV}\!\left(\hat{\boldsymbol{\beta}}\right)$ matrix as in Equation 31, here we will use $\mathbf{T}$, the variance–covariance matrix of the random effects from the linear parameter model:
$$\mathbf{T}_{f(x_{N},y_{N},\delta,h)} \approx \mathbf{J}^{\top}\,\mathbf{T}\,\mathbf{J}$$
Unlike the standard error approach, the resulting $\mathbf{T}_{f(x_{N},y_{N},\delta,h)}$ matrix can be used directly, and the off-diagonal covariances (e.g., $\tau_{x_{N},y_{N}}$) or standardized correlations are often of direct theoretical interest.
4.3.2 Variance standard errors
While conceptually similar to deriving the standard errors of the fixed effects, deriving analytic expressions for the standard errors of the variance components in $\mathbf{T}_{f(x_{N},y_{N},\delta,h)}$ requires us to consider additional transformations. Namely, the point estimate expressions of the variance components on which we must compute the Jacobian partial derivatives are now the quadratic expressions given by Equation 32. While the full set of transformations is outlined in the Supplementary Material, prior work has shown that analytic expressions are unlikely to provide optimal estimates of these standard errors in practice because of the asymmetric nature of variance estimates, a point borne out by the simulation results. As is the case for variance–covariance parameters in standard linear parameter models (Bolker, 2016), it is preferable to generate bootstrap confidence interval estimates instead and apply the point estimate transformations from Equation 32 to obtain confidence intervals on the nonlinear parameter model estimates.
4.3.3 Simulation example 2: Random effects
To test the LENI approach to random effects, I simulated data for 1,000 replicate samples from the nonlinear equations with saturated random effects structures (quadratic: N = 500, t = 4; cubic: N = 750, t = 6). As before, I present results from the linear parameter model (linear estimates), the transformed results (LENI estimates), and from fitting the nonlinear parameter model directly (nonlinear estimates). In contrast to the tight correspondence between the LENI and nonlinear point estimates (Table 3), the estimates for the variance ($\tau$) and correlation ($\rho$) parameters differ to a greater extent from one another, and from the population-generating parameters. While the results for the quadratic model appear reasonable, the estimates for the cubic model are much less accurate for either the LENI or nonlinear approach, with wide standard errors, reflecting the general difficulty of fitting such high-dimensional random effects models (Table 4). Follow-up investigation suggests that this is due to poor recovery of the linear parameter covariance matrix elements, where the standardized bias in the quadratic ($M_{\mathit{abs.\ val.}} = 0.535$, $\mathit{range}_{\mathit{abs.\ val.}} = 0.166 - 1.164$) and cubic ($M_{\mathit{abs.\ val.}} = 0.681$, $\mathit{range}_{\mathit{abs.\ val.}} = 0.049 - 1.839$) models was high. The substantially better fit of the nonlinear model, as indicated by the lower average BIC, is likely driven by two factors: 1) the population-generating model is the nonlinear equation while the LENI estimates are approximations, and 2) population values were input into the nonlinear model as starting values because of convergence issues, meaning that these are best-case estimates for the nonlinear model. Despite these discrepancies in the random effects results, the LENI estimates for the fixed effects in these models still perform well (see the Supplementary Material for full details). This suggests that if the fixed effects are of key theoretical interest, the LENI results can be used with confidence, whereas if the covariances/correlations are key to testing the substantive theory, alternative methods should be considered. One option would be to fit the nonlinear model directly (although recovery is likely to remain poor in the cubic case); another is to use the structural equation model approach outlined in a later section.
Table 4 LENI approach to random effects estimation

Note: Parameter estimates and standard errors (in parentheses) are the mean values across 1,000 iterations of data generation and model fitting. Pop $\theta$ indicates the generating value for each parameter. Linear estimates indicate the fitted values from the linear parameter model. LENI estimates indicate the transformed estimates of the nonlinear parameter model based on the linear estimates. Nonlinear estimates indicate the fitted values from directly estimating the nonlinear parameter model. $\tau$ parameters represent variances, $\rho$ parameters represent correlations, $R^{2}$ is the proportion of variance explained (only available in the linear parameter model), Marg. $\ell$ is the marginal log-likelihood, k is the number of model parameters, and BIC is the Bayesian information criterion.
4.4 LENI for including predictors of interest
4.4.1 Conditional effects
The point estimates, variance–covariance terms, and associated standard errors of the interpretable parameters offer important information about the developmental process under consideration; however, by themselves, they are largely descriptive of the pattern of change over time. Many important developmental questions involve testing predictors of the growth parameters (Bauer & Curran, 2005; Curran et al., 2004), and we can derive a LENI approach to this task just as we did for the fixed effects of the unconditional growth model.
Here we need to distinguish between time-varying and time-invariant covariates (Curran & Bauer, 2011; McNeish & Matta, 2019). The relevant difference is that time-varying covariates predict the repeated measures outcome directly, while time-invariant covariates predict the repeated measures indirectly through the growth parameters. The central insight is that this indirect prediction results in interaction terms in the growth equation. As such, we can use basic principles of interactions in regression analysis (Aiken & West, 1991) to derive point estimates and standard errors for conditional effects. Consider a binary time-invariant covariate $w_{i}$ which is included in the cubic equation, predicting all growth parameters. For person i at time t, the resulting growth model would have the form
$$y_{ti} = \beta_{0} + \beta_{1} x_{ti} + \beta_{2} x_{ti}^{2} + \beta_{3} x_{ti}^{3} + \beta_{4} w_{i} + \beta_{5} x_{ti} w_{i} + \beta_{6} x_{ti}^{2} w_{i} + \beta_{7} x_{ti}^{3} w_{i} + \varepsilon_{ti}$$
We can therefore use what we know about the expected value of this equation at different levels of $w_{i}$ to define the conditional effect on the nonlinear parameter. For instance, for $w_{i} = 0$, the expected value of $x_{N}$ is
$$x_{N \mid w_{i}=0} = -\frac{\beta_{2}}{3\beta_{3}}$$
while for $w_{i} = 1$ the expected value is
$$x_{N \mid w_{i}=1} = -\frac{\beta_{2} + \beta_{6}}{3\left(\beta_{3} + \beta_{7}\right)}$$
because $\beta_{6}$ is the expected change in $\beta_{2}$ when $w_{i}$ is shifted one unit, and $\beta_{7}$ is the expected change in $\beta_{3}$ across the same change in $w_{i}$.
Taking the difference of Equation 35 and Equation 34 gives the change in $x_{N}$ per unit change in $w_{i}$ (denoted here as $\pi_{x_{N},w}$):
$$\pi_{x_{N},w} = -\frac{\beta_{2} + \beta_{6}}{3\left(\beta_{3} + \beta_{7}\right)} + \frac{\beta_{2}}{3\beta_{3}}$$
The conditional effects of each of the interpretable nonlinear parameters can be derived in this fashion, and the standard errors can be obtained using the same Jacobian or bootstrapping approaches outlined in Standard errors.
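As a small illustration, the conditional effect on $x_{N}$ can be computed directly from the fixed effects of the conditional linear model (the coefficient ordering mirrors the equation above and the values are hypothetical):

```python
import numpy as np

def conditional_x_N(beta, w):
    """Model-implied x_N at a given value of the time-invariant covariate w.

    beta : (b0, ..., b7) from the conditional cubic, where b6 and b7 are the
           interactions of w with the quadratic and cubic terms.
    """
    b = np.asarray(beta, dtype=float)
    return -(b[2] + b[6] * w) / (3 * (b[3] + b[7] * w))

# Hypothetical fixed effects from the conditional linear-parameter cubic
beta = np.array([10.0, -2.0, 0.6, 0.2, 0.5, 0.1, -0.05, 0.02])

# pi_{x_N, w}: difference in x_N between the two groups of a binary covariate
pi_xN_w = conditional_x_N(beta, 1) - conditional_x_N(beta, 0)
print("x_N | w = 0:", conditional_x_N(beta, 0))
print("x_N | w = 1:", conditional_x_N(beta, 1))
print("pi_{x_N, w}:", pi_xN_w)
```

Standard errors for $\pi_{x_{N},w}$ then follow from the same Jacobian sandwich or bootstrap shown earlier, now taken with respect to all eight linear coefficients.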
4.4.2 Simulation example 3: Conditional effects
I simulated a conditional version of each nonlinear equation, in which $w_{i}$ was used to predict each growth term. For generality, the covariate $w_{i}$ was simulated as both a binary ($w_{i} \sim \mathit{Bernoulli}(0.4)$) and a continuous ($w_{i} \sim \mathit{N}(0, 0.25^{2})$) covariate in separate simulations, but the LENI computations of the relevant conditional effects apply across predictor types. Below, the results of the binary simulations are reported; the continuous results can be found in the Supplementary Material. As in the unconditional model, the LENI approach does an excellent job of approximating the nonlinear results for the fixed effects of both the quadratic and cubic models (Table 5).
Table 5 LENI approach to conditional effects estimation

Note: Parameter estimates and standard errors (in parentheses) are the mean values across 1,000 iterations of data generation and model fitting. Pop $\theta$ indicates the generating value for each parameter. Linear estimates indicate the fitted values from the linear parameter model. LENI estimates indicate the transformed estimates of the nonlinear parameter model based on the linear estimates. Nonlinear estimates indicate the fitted values from directly estimating the nonlinear parameter model. $\pi$ parameters represent conditional effects of $w_{i}$ on the nonlinear parameters, $R^{2}$ is the proportion of variance explained (only available in the linear parameter model), Marg. $\ell$ is the marginal log-likelihood, k is the number of model parameters, and BIC is the Bayesian information criterion.
4.5 LENI real data example
As an empirical demonstration of the LENI approach, I drew network modularity data from the BrainTime sample (McCormick et al., 2021) and used self-reported sex (female = 0; male = 1) as a predictor of the curve components. I then fit the following conditional random-intercept model:
$$y_{ti} = \beta_{0} + \beta_{1}\,\mathit{age}_{ti} + \beta_{2}\,\mathit{age}_{ti}^{2} + \beta_{3}\,\mathit{male}_{i} + \beta_{4}\left(\mathit{age}_{ti} \times \mathit{male}_{i}\right) + \beta_{5}\left(\mathit{age}_{ti}^{2} \times \mathit{male}_{i}\right) + u_{0i} + \varepsilon_{ti}$$
with both main effects of and interactions with the predictor. I also considered models hierarchically—a benefit of the linear estimation component of LENI—with linear and quadratic random effects, but these models were singular. Using the defined transformations, I then examined the implied nonlinear results and interpreted sex-specific trajectories of brain network organization.
Plotting the data, we can see that the trajectory for network modularity in female adolescents (red) peaks at 19.04 years-of-age ($\mathit{SE}$ = 0.670), with male adolescents (blue) appearing to reach this vertex somewhat later (implied at 19.95 years) and at lower peak levels (Figure 4). While common practice has been to simply assert this difference as meaningful (i.e., “boys show delayed development compared with girls”), the LENI approach allows us to build a direct statistical test of sex-specific differences in meaningful parameters of these trajectories. The LENI results suggest that while female adolescents do show a higher peak value in modularity at the vertex ($\alpha_{y}$) than male adolescents ($\pi_{\alpha_{y},male}$ = −0.392, $\mathit{SE}$ = 0.114, t = −3.429, $p <$ 0.001), there is no significant sex difference in the age at which the vertex is reached ($\pi_{\alpha_{x},male}$ = 0.908, $\mathit{SE}$ = 0.935, t = 0.971, p = 0.332). This demonstrates the importance of parameterizing statistical models to test meaningful hypotheses with appropriate uncertainty (here, the standard errors) rather than relying solely on point estimates to draw inferences about developmental theories (e.g., sex-specific delays in maturation).

Figure 4 Sex-specific trajectories of brain network organization. While the implied trajectory for male adolescents (blue) appears to have a delayed peak compared with female adolescents (red, $\alpha_{x}$), the inference test on the nonlinear parameter ($\pi_{\alpha_{x},male}$) derived from the LENI approach shows that they are statistically indistinguishable.
4.6 Linearized structural equation models
In the prior sections, I laid out a series of transformations that allow us to convert the results of a linearly estimated model into approximations of the results of directly estimating the nonlinear model—with attendant advantages related to estimation, measures of $R^{2}$, and hierarchical specification of the random effects structure. I have highlighted these transformations thus far primarily using mixed-effects (multilevel) models on time-unstructured data for maximum generality, although these methods could easily be applied to more time-structured cohort data without issue. Indeed, the set of LENI transformations could be equally useful in structural equation growth models like the latent curve (Meredith & Tisak, 1990) and latent change score models (McArdle et al., 2009).
However, adopting the SEM framework allows us to extend the LENI conceptual framework in a more interesting way by estimating a linearized version of the nonlinear model directly within the latent variable software (S. A. Blozis, Reference Blozis2004; Browne, Reference Browne, Cuadras and Rao1993; Preacher & Hancock, Reference Preacher and Hancock2012, Reference Preacher and Hancock2015).Footnote 3 That is, rather than estimating the familiar linear parameter model and applying post-hoc transformations to obtain the nonlinear inferences, we can specify a linearized SEM to allow for direct estimation of the nonlinear parameters within a linear estimator. Prior work has used this approach to model a wide array of potential nonlinear functions within a linear SEM framework, including logistic curves (Choi et al., Reference Choi, Harring and Hancock2009), multiphase (piecewise) models with random knots (Feng et al., Reference Feng, Hancock and Harring2019; Preacher & Hancock, Reference Preacher and Hancock2015), half-life with negative exponentials (S. A. Blozis, Reference Blozis2004; Preacher & Hancock, Reference Preacher and Hancock2015), and time-to-criterion models (Johnson & Hancock, Reference Johnson and Hancock2019), among others.
The general procedure for fitting linearized SEM has been detailed in full by prior work (S. A. Blozis, Reference Blozis2004, Reference Blozis2007; Feng et al., Reference Feng, Hancock and Harring2019; Preacher & Hancock, Reference Preacher and Hancock2012, Reference Preacher and Hancock2015), so I briefly review the relevant modeling steps here and then move into the specific use cases of the models discussed thus far. First (1), we need to define the nonlinear equation—either a reparameterization such as the alternative quadratic or cubic, or a natively nonlinear function (e.g., logistic, negative exponential)—with a set of parameters that correspond to theoretically meaningful quantities. Then (2), to make the nonlinear function compatible with the linearly estimated SEM, we linearize the function through a first-order Taylor series approximation by taking the partial derivative of the nonlinear function with respect to each parameter at the mean value of all other parameters. Once we have these partial derivatives (3), we use a structured latent curve model (SLCM
$;$
S. A. Blozis, Reference Blozis2004; Browne & Du Toit, Reference Browne and Du Toit1991; Browne, Reference Browne, Cuadras and Rao1993) approach where we set the factor loadings of each latent variable (which now represents a meaningful nonlinear parameter) to the partial derivative with respect to that same parameter, and specify each observed repeated measure intercept as the mean of the target nonlinear function at that value of time. Finally (4), with this model specification, we can estimate the linearized model with standard SEM software and obtain direct estimates and standard errors for the meaningful nonlinear parameters.
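As a small illustration of step 2 above, R's symbolic derivative function D() produces exactly the kinds of expressions needed for the factor loadings. The vertex-form quadratic, written here as y = alpha_y + beta_2*(x - alpha_x)^2, is used as a stand-in target function (it may differ in notation from the alternative quadratic defined earlier in the article); any target function that can be written as an R expression works the same way.

```r
# Step 2 in miniature: symbolic partial derivatives of a target nonlinear function.
# The vertex-form quadratic is used as a stand-in; swap in any target expression.
f <- expression(alpha_y + beta_2 * (x - alpha_x)^2)

D(f, "alpha_y")  # equals 1
D(f, "alpha_x")  # equals -2 * beta_2 * (x - alpha_x)
D(f, "beta_2")   # equals (x - alpha_x)^2
```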
The resulting SLCM (Browne, Reference Browne, Cuadras and Rao1993) models both average and individual-level change, although with some notable differences in interpretation compared with both the standard latent curve model (Meredith & Tisak, Reference Meredith and Tisak1990) and fully nonlinear mixed-effects growth models. In the standard latent curve model, individual trajectories must follow the same functional form as the average trajectory, whereas the SLCM relaxes this constraint to allow for increased flexibility to fit unique patterns of change that do not follow the average (S. A. Blozis & Harring, Reference Blozis and Harring2017). However, like the standard latent curve model, SLCMs show dynamic consistency—that is, the model for the population mean response is equal to the average of the individual-level effects—in contrast with fully nonlinear models where this equivalence is not imposed (for an in-depth treatment of these issues and how they impact model interpretation, see Blozis & Harring, Reference Blozis and Harring2016; Harring & Blozis, Reference Harring and Blozis2016).
While the SLCM differs in this way from fully nonlinear mixed-effects models, adopting this framework allows us to take advantage of the full flexibility of the SEM to extend the core linearized function with additional complexities, including covariates (Curran et al., Reference Curran, Bauer and Willoughby2004; Preacher & Hancock, Reference Preacher and Hancock2015), distal outcomes (McCormick et al., Reference McCormick, Byrne, Flournoy, Mills and Pfeifer2023; McCormick et al., Reference McCormick, Curran and Hancock2024), and approaches for parameter moderation (Bauer, Reference Bauer2017). We can now turn to the alternative polynomials as examples of this process.
4.6.1 Alternative polynomial models
Here I will focus on the alternative cubic, as a linearized version of the alternative quadratic model (Cudeck & du Toit, Reference Cudeck and du Toit2002) has been demonstrated previously (Preacher & Hancock, Reference Preacher and Hancock2015); however, the code for both models can be found in the Supplementary Material. We can use either Equation 20 or Equation 21 as the target nonlinear equation, so I will define the partial derivatives for all five nonlinear parameters (note only four are modeled at any given time). The partial derivatives for Equation 20 are

and for Equation 21 are

Once we have these partial derivatives, we can set each factor loading to the relevant expression at each value of
$x_{ti}$
Footnote 4
and then define the intercepts of the repeated measures as the mean of the target nonlinear function at that value of
$x_{ti}$
. The corresponding path diagram for both the alternative quadratic (A) and cubic (B) is presented in Figure 5. Note that this specification allows us to directly model the means, variances, and covariances of the interpretable parameters. Unlike standard latent curve model specifications, the estimated parameters of the latent variables are used to define the numerical value of the factor loadings (remember that the partial derivatives are evaluated at the means of the other parameters).
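To show how these pieces come together in software, below is one possible lavaan specification of a linearized model, written out for the vertex-form quadratic (y_ti = alpha_y_i + beta_2_i*(x_t - alpha_x_i)^2) because its derivatives are compact; the cubic follows the same pattern using its own partial derivatives. The observed variables y1-y5, the occasion times 0-4, and the data object are hypothetical. Because the parameter means are estimated as latent means here, the occasion intercepts are constrained so that the model-implied means equal the target function at each occasion, which yields the same implied moments as fixing the intercepts to the mean function directly. This is a hand-written sketch rather than syntax generated by the leni package.

```r
# A sketch of a linearized structured latent curve model for the vertex-form
# quadratic in lavaan. Variable names, occasion times, and data are hypothetical;
# start values are often needed for nonlinearly constrained models in practice.
library(lavaan)

slcm_quad <- '
  # Latent variables = nonlinear parameters; loadings = partial derivatives
  ay =~ 1*y1 + 1*y2 + 1*y3 + 1*y4 + 1*y5
  ax =~ NA*y1 + Lax1*y1 + Lax2*y2 + Lax3*y3 + Lax4*y4 + Lax5*y5
  b2 =~ NA*y1 + Lb21*y1 + Lb22*y2 + Lb23*y3 + Lb24*y4 + Lb25*y5

  # Means, variances, and covariances of the interpretable parameters
  ay ~ m_ay*1
  ax ~ m_ax*1
  b2 ~ m_b2*1
  ay ~~ ay + ax + b2
  ax ~~ ax + b2
  b2 ~~ b2

  # Occasion intercepts (constrained below); occasions observed at x = 0, ..., 4
  y1 ~ n1*1
  y2 ~ n2*1
  y3 ~ n3*1
  y4 ~ n4*1
  y5 ~ n5*1

  # Loadings: partial derivatives evaluated at the parameter means
  Lax1 == -2*m_b2*(0 - m_ax)
  Lax2 == -2*m_b2*(1 - m_ax)
  Lax3 == -2*m_b2*(2 - m_ax)
  Lax4 == -2*m_b2*(3 - m_ax)
  Lax5 == -2*m_b2*(4 - m_ax)
  Lb21 == (0 - m_ax)^2
  Lb22 == (1 - m_ax)^2
  Lb23 == (2 - m_ax)^2
  Lb24 == (3 - m_ax)^2
  Lb25 == (4 - m_ax)^2

  # Intercepts: chosen so the implied mean at time x equals m_ay + m_b2*(x - m_ax)^2
  n1 == 2*m_b2*m_ax*(0 - m_ax)
  n2 == 2*m_b2*m_ax*(1 - m_ax)
  n3 == 2*m_b2*m_ax*(2 - m_ax)
  n4 == 2*m_b2*m_ax*(3 - m_ax)
  n5 == 2*m_b2*m_ax*(4 - m_ax)
'

fit_slcm <- sem(slcm_quad, data = wide_dat)  # wide_dat: hypothetical wide-format data
summary(fit_slcm)
```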

Figure 5 Linearized SEM. We can directly model the nonlinear parameters (e.g.,
$\alpha _{x}$
and
$\delta $
) as random latent variables through a process of linearization where we set the factor loadings to partial derivatives of the target nonlinear function with respect to each modeled parameter and set the intercepts of the repeated measures to the mean of the target function.
To avoid unnecessary repetition with prior sections, I will leave the full parameter comparison to the Supplementary Material, but note that all the linearized models successfully capture the interpretable model parameters and have near-identical fit to the standard linear versions of the polynomial LCMs (
$\Delta \chi ^{2}_{quadratic} = {-7.713 \times 10^{-5}}$
;
$\Delta \chi ^{2}_{cubic,\, h} = {8.414 \times 10^{-7}}$
;
$\Delta \chi ^{2}_{cubic,\, \beta _{N}} = {-1.126 \times 10^{-6}}$
).
4.6.2 Multiphase polynomial models
Finally, I applied the linearization approach to the multiphase (or piecewise) cubic model outlined previously. Feng et al. (Reference Feng, Hancock and Harring2019) outlined a related approach to deal with floor and ceiling effects in modeled variables using a 3-phase linear model. Their approach reparameterized the standard multiphase linear model into a single equation by taking the median of the three linear functions—see Figure 1 and Equations 1–3 in Feng et al. (Reference Feng, Hancock and Harring2019) for complete details. Unfortunately, while this approach successfully models the multiphase trajectory for monotonic functions like the 3-phase linear model, it does not appropriately define the onset and offset of the phases for nonmonotonic functions like the quadratic and cubic (see the Supplementary Material for examples of the challenges).
To accomplish the same goal, I parameterized a multiphase function where, rather than taking the median of the phase-specific functions, I instead took the median of three quantities—
$(x_{N} - \delta )$
,
$x_{ti}$
, and
$(x_{N} + \delta )$
—which does change monotonically as a function of
$x_{ti}$
(Figure 6 B). Thus, when
$x_{ti}$
is
$< (x_{N} - \delta )$
or
$> (x_{N} + \delta )$
, its value is effectively fixed at those boundary points. This allows cubic change to occur only within the onset-offset boundaries (Figure 6 C). As highlighted in Feng et al. (Reference Feng, Hancock and Harring2019), the median of three monotonic functions can be computed as
$\mathrm{med}(a, b, c) = \max\big(\min(a, b),\ \min(\max(a, b),\ c)\big)$
where the minimum and maximum of two quantities are
$\min(a, b) = \frac{a + b - |a - b|}{2}, \qquad \max(a, b) = \frac{a + b + |a - b|}{2}$
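Whichever algebraic identity is used to express the median, its practical effect here is to clamp the predictor to the onset and offset boundaries. A quick numerical check in R (with arbitrary illustrative values for the inflection point and half-width):

```r
# Numerical check: the median of (xN - delta, x, xN + delta) equals x clamped
# to the onset/offset boundaries. The xN and delta values are arbitrary.
xN <- 15; delta <- 3
x  <- seq(8, 22, by = 0.25)

med3  <- apply(cbind(xN - delta, x, xN + delta), 1, median)
clamp <- pmin(pmax(x, xN - delta), xN + delta)

all.equal(med3, clamp)  # TRUE
```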
By substituting Equation 23 or Equation 24 into these expressions, I derived the following forms for the multiphase cubic (see Supplementary Material for full derivation details):


Figure 6 Multiphase linearized SEM. (a) A canonical path diagram of the linearized SEM model for the multiphase cubic. (b) Because the cubic function is non-monotonic, we need to instead take the median of
$(x_{N} - \delta )$
,
$x_{ti}$
, and
$(x_{N} + \delta )$
, which is monotonic as a function of
$x_{ti}$
. (c) The boundaries formed for the predictor
$x_{ti}$
in (B) allow for the proper specification of the multiphase cubic function for
$y_{ti}$
. (d) The multiphase cubic model was fit to generated trajectories of cortical thinning (Fuhrmann et al., Reference Fuhrmann, Madsen, Johansen, Baaré and Kievit2022) which successfully recovered individual (grey) and group (black) trajectories.
I took these equations and linearized them using the same procedure as above (see Figure 6 A for a representative path diagram). I then generated cortical thickness trajectory data based on Fuhrmann et al. (Reference Fuhrmann, Madsen, Johansen, Baaré and Kievit2022), with 100 cases observed across 12 waves. The linearized SEM (
$\chi ^{2}_{model}$
= 84.299,
$\mathit {df}$
= 75, p = 0.217) captured the parameters of the growth trajectory, with an average onset of change at 12 (
$x_{N} - \delta $
= 12.028,
$\mathit {SE}$
= 0.201), an inflection point at 15 (
$x_{N}$
= 15.049,
$\mathit {SE}$
= 0.105) with a negative instantaneous slope (
$\beta _{N}$
= −0.337,
$\mathit {SE}$
= 0.022), and an offset of change at 18 years of age (
$x_{N} + \delta $
= 18.069,
$\mathit {SE}$
= 0.164; see Figure 6 D for model-implied individual and group trajectories; see the Supplementary Material for full code and output). This example highlights the flexibility of the linearization approach for linear estimation with nonlinear inference (LENI): any target nonlinear function can be transformed using the approach outlined above and accommodated within standard linear estimation software.
5 Recommendations for applied researchers
For applied research, the LENI suite of approaches offers exciting new opportunities for testing novel theoretical hypotheses with an ultimate eye toward the development of generative models for change over time. Hypotheses must be generated and tested on meaningful and interpretable developmental quantities (Preacher & Hancock, Reference Preacher and Hancock2015), rather than being narrowly restricted to the linear parameter models that are the default in major software implementations. The promise of LENI is that researchers can fit the model their theory calls for while avoiding some of the issues inherent in nonlinear model estimation. Here I demonstrated both mixed-effects and structural equation model options for random-effects growth modeling, allowing researchers maximal flexibility in specifying the nonlinear model within the broader modeling traditions that exist in different fields. For instance, both the mixed-effects and SEM LENI approaches allow for the additional inclusion of time-varying covariates through direct prediction of the within-person repeated measure outcome, as well as additional time-invariant predictors and distal outcomes at the between-person (factor or random effect) level.Footnote 5
Selecting among the various LENI approaches outlined may seem daunting, but ultimately the choice is likely far simpler than it appears. While there are many differences between mixed-effects and structural equation modeling approaches to longitudinal data analysis (McCormick et al., Reference McCormick, Byrne, Flournoy, Mills and Pfeifer2023; McNeish & Matta, Reference McNeish and Matta2018)—some important in the way that they model the repeated measures and others simply due to conventions or idiosyncratic discipline preferences—there are no additional issues related to the LENI set of approaches that would favor one class of model over the other. For instance, time-unstructured data are more easily accommodated in mixed-effects approaches, while multivariate models are more naturally fit within structural equation models, but this applies equally to the linear and nonlinear/linearized versions of the two frameworks. As such, researchers should select the particular approach that is already suitable for testing their theoretical question and for accommodating the structure of the data that they have. LENI is here to help in that endeavor—not to impose additional restrictions.
To aid in the application of LENI approaches in substantive research, I have developed an R package (leni; https://github.com/E-M-McCormick/leni) which allows users to convert the output of linear regression and mixed-effects models into nonlinear estimates using the transformations highlighted here, as well as to generate lavaan (Rosseel, Reference Rosseel2012) syntax for linearized SEMs.
6 Conclusions
The linear estimation with nonlinear inference (LENI) approach is a broad framework that allows for the modeling of nonlinear parameters that represent theoretically interesting quantities while taking advantage of the well-behaved properties of linear parameter models for estimation. My goal was to offer a comprehensive introduction to 1) the motivation and approach for defining nonlinear models with interpretable parameters, 2) a set of transformation functions that convert linear mixed-effects model results into nonlinear output, and 3) the direct estimation of nonlinear parameters through a linearized SEM approach. This foundation of the LENI approach for modeling growth offers fertile ground for additional research and methodological development, with additional avenues for work on small-sample behavior, methods for increasing the reliability of random effect/factor (co)variances, optimal Bayesian approaches (e.g., transformation robustness when applied at the individual draw level), and the role of time coding for estimation and interpretation of the nonlinear parameters. Combining the theoretical perspective of the LENI approach—focusing on meaningful features of change over time—with its computational efficiencies in approximating complex nonlinear equations shows great promise for advancing developmental science and the analysis of longitudinal data broadly.
Supplementary material
The supplementary material for this article can be found at https://osf.io/5dmy3/.
Data availability statement
The code required to reproduce all analyses, as well as more-extensive derivations, can be found at https://doi.org/10.17605/OSF.IO/5DMY3.
Acknowledgements
EMM was supported with funds from the NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek), Domain Social Sciences and Humanities (SSH) Sector Plan: Resilience in Youth, and Jacobs Foundation Fellowship 2023-1510-00. Special thanks go to supportive colleagues including Dr. G. R. Hancock, Dr. R. A. Kievit, and Dr. P. J. Curran for critical feedback on the article.
Competing interests
The author declares no competing interests.