Clinical high-risk for psychosis (CHR-P) states exhibit diverse clinical presentations, prompting a shift towards broader outcome assessments beyond psychosis manifestation. To elucidate more uniform clinical profiles and their trajectories, we investigated CHR-P profiles in a community sample.
Methods
Participants (N = 829; baseline age: 16–40 years) comprised individuals from a Swiss community sample who were followed up over roughly 3 years. Latent class analysis was applied to CHR-P symptom data at baseline and follow-up, and classes were examined for demographic and clinical differences, as well as stability over time.
Results
Similar three-class solutions were obtained for both time points. Class 1 was mainly characterized by subtle, subjectively experienced disturbances in mental processes, including thinking, speech and perception (basic symptoms [BSs]). Class 2 was characterized by subthreshold positive psychotic symptoms (i.e., mild delusions or hallucinations) indicative of an ultra-high risk for psychosis. Class 3, the largest group (comprising over 90% of participants), exhibited the lowest probability of experiencing any psychosis-related symptoms (CHR-P symptoms). Classes 1 and 2 included more participants with functional impairment and psychiatric morbidity. Class 3 participants had a low probability of having functional deficits or mental disorders at both time points, suggesting that Class 3 was the healthiest group and that their mental health and functioning remained stable throughout the study period. While 91% of Baseline Class 3 participants remained in their class over time, most Baseline Class 1 (74%) and Class 2 (88%) participants moved to Follow-up Class 3.
Conclusions
Despite some temporal fluctuations, CHR-P symptoms within community samples cluster into distinct subgroups, reflecting varying levels of symptom severity and risk profiles. This clustering highlights the largely distinct nature of BSs and attenuated positive symptoms within the community. The association of Classes 1 and 2 with Axis-I disorders and functional deficits emphasizes the clinical significance of CHR-P symptoms. These findings highlight the need for personalized preventive measures targeting specific risk profiles in community-based populations.
We present a hierarchical Bayes approach to modeling parameter heterogeneity in generalized linear models. The model assumes that there are relevant subpopulations and that within each subpopulation the individual-level regression coefficients have a multivariate normal distribution. However, class membership is not known a priori, so the heterogeneity in the regression coefficients becomes a finite mixture of normal distributions. This approach combines the flexibility of semiparametric, latent class models that assume common parameters for each subpopulation and the parsimony of random effects models that assume normal distributions for the regression parameters. The number of subpopulations is selected to maximize the posterior probability of the model being true. Simulations are presented which document the performance of the methodology for synthetic data with known heterogeneity and number of subpopulations. An application is presented concerning preferences for various aspects of personal computers.
This commentary addresses the modeling and final analytical path taken, as well as the terminology used, in the paper “Hierarchical diagnostic classification models: a family of models for estimating and testing attribute hierarchies” by Templin and Bradshaw (Psychometrika, doi:10.1007/s11336-013-9362-0, 2013). It raises several issues concerning use of cognitive diagnostic models that either assume attribute hierarchies or assume a certain form of attribute interactions. The issues raised are illustrated with examples, and references are provided for further examination.
Diagnostic classification models are confirmatory in the sense that the relationship between the latent attributes and responses to items is specified or parameterized. Such models are readily interpretable with each component of the model usually having a practical meaning. However, parameterized diagnostic classification models are sometimes too simple to capture all the data patterns, resulting in significant model lack of fit. In this paper, we attempt to obtain a compromise between interpretability and goodness of fit by regularizing a latent class model. Our approach starts with minimal assumptions on the data structure, followed by suitable regularization to reduce complexity, so that a readily interpretable yet flexible model is obtained. An expectation–maximization-type algorithm is developed for efficient computation. It is shown that the proposed approach enjoys good theoretical properties. Results from simulation studies and a real application are presented.
Several articles in the past fifteen years have suggested various models for analyzing dichotomous test or questionnaire items which were constructed to reflect an assumed underlying structure. This paper shows that many models are special cases of latent class analysis. A currently available computer program for latent class analysis allows parameter estimates and goodness-of-fit tests not only for the models suggested by previous authors, but also for many models which they could not test with the more specialized computer programs they developed. Several examples are given of the variety of models which may be generated and tested. In addition, a general framework for conceptualizing all such models is given. This framework should be useful for generating models and for comparing various models.
A probabilistic choice model is developed for paired comparisons data about psychophysical stimuli. The model is based on Thurstone's Law of Comparative Judgment Case V and assumes that each stimulus is measured on a small number of physical variables. The utility of a stimulus is related to its values on the physical variables either by means of an additive univariate spline model or by means of a multivariate spline model. In the additive univariate spline model, a separate univariate spline transformation is estimated for each physical dimension and the utility of a stimulus is assumed to be an additive combination of these transformed values. In the multivariate spline model, the utility of a stimulus is assumed to be a general multivariate spline function in the physical variables. The use of B splines for estimating the transformation functions is discussed and it is shown how B splines can be generalized to the multivariate case by using as basis functions tensor products of the univariate basis functions. A maximum likelihood estimation procedure for the Thurstone Case V model with spline transformation is described and applied for illustrative purposes to various artificial and real data sets. Finally, the model is extended using a latent class approach to the case where there are unreplicated paired comparisons data from a relatively large number of subjects drawn from a heterogeneous population. An EM algorithm for estimating the parameters in this extended model is outlined and illustrated on some real data.
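The Case V choice rule that underlies this model can be sketched in a few lines. This is only an illustration, not the authors' estimation procedure: it assumes the utilities are already scaled so that the utility difference has unit standard deviation, it omits the spline transformation entirely, and the function names (`case_v_preference_prob`, `paired_comparison_loglik`) are hypothetical.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def case_v_preference_prob(u_i: float, u_j: float) -> float:
    """Probability that stimulus i is preferred to stimulus j.

    Assumption: utilities are scaled so the difference has unit
    standard deviation, so the probability is Phi(u_i - u_j).
    """
    return normal_cdf(u_i - u_j)


def paired_comparison_loglik(utilities, wins):
    """Log-likelihood of paired comparison counts.

    wins[i][j] is the number of times stimulus i was preferred to j.
    """
    loglik = 0.0
    n = len(utilities)
    for i in range(n):
        for j in range(n):
            if i != j and wins[i][j] > 0:
                p = case_v_preference_prob(utilities[i], utilities[j])
                loglik += wins[i][j] * math.log(p)
    return loglik
```

In a maximum likelihood setting, a function like `paired_comparison_loglik` would be maximized over the parameters that generate the utilities; here the utilities are passed in directly.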
The standard tobit or censored regression model is typically utilized for regression analysis when the dependent variable is censored. This model is generalized by developing a conditional mixture, maximum likelihood method for latent class censored regression. The proposed method simultaneously estimates separate regression functions and subject membership in K latent classes or groups given a censored dependent variable for a cross-section of subjects. Maximum likelihood estimates are obtained using an EM algorithm. The proposed method is illustrated via a consumer psychology application.
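To make the mixture likelihood concrete, the following sketch evaluates the log-likelihood of a latent class censored regression for fixed parameters. It assumes left-censoring at a known limit (zero by default) and normal errors within each class, which the abstract does not state explicitly; the function names and argument layout are hypothetical, and the EM iterations themselves are not shown.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_density(y, mu, sigma, limit=0.0):
    """Likelihood contribution of one observation in a tobit model.

    Assumption: observations at or below `limit` are left-censored.
    """
    if y > limit:
        # Uncensored: ordinary normal density of the observed value.
        return normal_pdf((y - mu) / sigma) / sigma
    # Censored: probability that the latent value falls at or below the limit.
    return normal_cdf((limit - mu) / sigma)


def mixture_loglik(ys, xs, class_weights, class_betas, class_sigmas, limit=0.0):
    """Log-likelihood of a K-class mixture of tobit regressions."""
    total = 0.0
    for y, x in zip(ys, xs):
        lik = 0.0
        for w, beta, sigma in zip(class_weights, class_betas, class_sigmas):
            mu = sum(b * v for b, v in zip(beta, x))  # class-specific linear predictor
            lik += w * tobit_density(y, mu, sigma, limit)
        total += math.log(lik)
    return total
```

The per-class terms `w * tobit_density(...)`, divided by their sum, give the posterior class membership probabilities that an E-step would use.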
A normally distributed person-fit index is proposed for detecting aberrant response patterns in latent class models and mixture distribution IRT models for dichotomous and polytomous data.
This article extends previous work on the null distribution of person-fit indices for the dichotomous Rasch model to a number of models for categorical data. A comparison of two different approaches to handle the skewness of the person-fit index distribution is included.
A reparameterization of a latent class model is presented to simultaneously classify and scale nominal and ordered categorical choice data. Latent class-specific probabilities are constrained to be equal to the preference probabilities from a probabilistic ideal-point or vector model that yields a graphical, multidimensional representation of the classification results. In addition, background variables can be incorporated as an aid to interpreting the latent class-specific response probabilities. The analyses of synthetic and real data sets illustrate the proposed method.
The new software package OpenMx 2.0 for structural equation and other statistical modeling is introduced and its features are described. OpenMx is evolving in a modular direction and now allows a mix-and-match computational approach that separates model expectations from fit functions and optimizers. Major backend architectural improvements include a move to swappable open-source optimizers such as the newly written CSOLNP. Entire new methodologies such as item factor analysis and state space modeling have been implemented. New model expectation functions including support for the expression of models in LISREL syntax and a simplified multigroup expectation function are available. Ease-of-use improvements include helper functions to standardize model parameters and compute their Jacobian-based standard errors, access to model components through standard R $ mechanisms, and improved tab completion from within the R Graphical User Interface.
A multidimensional unfolding model is developed that assumes that the subjects can be clustered into a small number of homogeneous groups or classes. The subjects that belong to the same group are represented by a single ideal point. Since it is not known in advance to which group or class a subject belongs, a mixture distribution model is formulated that can be considered as a latent class model for continuous single stimulus preference ratings. A GEM algorithm is described for estimating the parameters in the model. The M-step of the algorithm is based on a majorization procedure for updating the estimates of the spatial model parameters. A strategy for selecting the appropriate number of classes and the appropriate number of dimensions is proposed and fully illustrated on some artificial data. The latent class unfolding model is applied to political science data concerning party preferences from members of the Dutch Parliament. Finally, some possible extensions of the model are discussed.
A general approach for analyzing rating data with latent class models is described, which parallels rating models in the framework of latent trait theory. A general rating model as well as a two-parameter model with location and dispersion parameters, analogous to Andrich's Dislocmodel, are derived, including parameter estimation via the EM algorithm. Two examples illustrate the application of the models and their statistical control. Model restrictions through equality constraints are discussed and multiparameter generalizations are outlined.
This paper presents a synthesis of Bock's (1972) nominal categories model and Luce's (1959) choice model for mixed-effects analyses of rank-ordered data. It is shown that the proposed ranking model is both parsimonious and flexible in accounting for preference heterogeneity as well as fixed and random effects of covariates. Relationships to other approaches, including Takane's (1987) ideal point discriminant model and Croon's (1989) latent-class version of Luce's ranking model, are also discussed. The application focuses on a ranking study of behavioral traits that parents find desirable in children.
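For readers unfamiliar with Luce's choice model as applied to rankings, the basic (fixed-effects, single-class) ranking probability can be sketched as follows. This is not the mixed-effects model proposed in the paper: it covers only the sequential-choice form in which each position is filled by choosing among the remaining items in proportion to their worths. The function name and the `worths` parameterization are assumptions made for illustration.

```python
def luce_ranking_prob(worths, ranking):
    """Probability of a complete ranking under Luce's choice model.

    worths[k] is a positive worth for item k; `ranking` lists item
    indices from most to least preferred. At each stage the next item
    is chosen from those remaining with probability proportional to
    its worth.
    """
    remaining = list(ranking)
    prob = 1.0
    for item in ranking:
        denom = sum(worths[r] for r in remaining)
        prob *= worths[item] / denom
        remaining.remove(item)
    return prob
```

With equal worths every ranking of three items has probability 1/6, and the probabilities of all possible rankings sum to one.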
In the Netherlands, national assessments at the end of primary school (Grade 6) show a decline of achievement on problems of complex or written arithmetic over the last two decades. The present study aims at contributing to an explanation of the large achievement decrease on complex division, by investigating the strategies students used in solving the division problems in the two most recent assessments carried out in 1997 and in 2004. The students’ strategies were classified into four categories. A data set resulted with two types of repeated observations within students: the nominal strategies and the dichotomous achievement scores (correct/incorrect) on the items administered.
It is argued that latent variable modeling methodology is appropriate to analyze these data. First, latent class analyses with year of assessment as a covariate were carried out on the multivariate nominal strategy variables. Results showed a shift from application of the traditional long division algorithm in 1997, to the less accurate strategy of stating an answer without writing down any notes or calculations in 2004, especially for boys. Second, explanatory IRT analyses showed that the three main strategies were significantly less accurate in 2004 than they were in 1997.
Latent class models for cognitive diagnosis often begin with specification of a matrix that indicates which attributes or skills are needed for each item. Then by imposing restrictions that take this into account, along with a theory governing how subjects interact with items, parametric formulations of item response functions are derived and fitted. Cluster analysis provides an alternative approach that does not require specifying an item response model, but does require an item-by-attribute matrix. After summarizing the data with a particular vector of sum-scores, K-means cluster analysis or hierarchical agglomerative cluster analysis can be applied with the purpose of clustering subjects who possess the same skills. Asymptotic classification accuracy results are given, along with simulations comparing effects of test length and method of clustering. An application to a language examination is provided to illustrate how the methods can be implemented in practice.
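The summary step described above can be illustrated with a small sketch. It assumes that the "vector of sum-scores" is, for each subject, the number of correctly answered items that require each attribute according to the item-by-attribute matrix; the abstract does not spell out the exact definition, and the function name is hypothetical.

```python
def attribute_sum_scores(responses, q_matrix):
    """Per-subject vector of attribute sum-scores.

    responses[i][j] is 1 if subject i answered item j correctly, else 0.
    q_matrix[j][k] is 1 if item j requires attribute k, else 0.
    Entry [i][k] of the result counts the correct responses of subject i
    on items that require attribute k.
    """
    n_attributes = len(q_matrix[0])
    return [
        [sum(x * q[k] for x, q in zip(row, q_matrix)) for k in range(n_attributes)]
        for row in responses
    ]


# Three items, two attributes: item 3 requires both attributes.
q = [[1, 0], [0, 1], [1, 1]]
scores = attribute_sum_scores([[1, 0, 1], [0, 1, 0]], q)
```

The resulting vectors would then be passed to K-means or hierarchical agglomerative clustering, so that subjects with similar sum-score profiles are grouped together.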
In categorical data analysis, two-sample cross-validation is used not only for model selection but also to obtain a realistic impression of the overall predictive effectiveness of the model. The latter is of particular importance in the case of highly parametrized models capable of capturing every idiosyncrasy of the calibrating sample. We show that for maximum likelihood estimators or other asymptotically efficient estimators Pearson’s $X^2$ is not asymptotically chi-square in the two-sample cross-validation framework due to extra variability induced by using different samples for estimation and goodness-of-fit testing. We propose an alternative test statistic, $X^2_{xval}$, obtained as a modification of $X^2$ which is asymptotically chi-square with $C - 1$ degrees of freedom in cross-validation samples. Stochastically, $X^2_{xval} \le X^2$. Furthermore, the use of $X^2$ instead of $X^2_{xval}$ with a $\chi^2_{C-1}$ reference distribution may provide an unduly poor impression of fit of the model in the cross-validation sample.
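As a reminder of the quantity being modified, the ordinary Pearson statistic for C cells can be computed as below. In the two-sample setting the expected cell probabilities would come from the model fitted to the calibration sample and the counts from the validation sample. The proposed correction itself is not given in the abstract and is not reproduced here; the function name is hypothetical.

```python
def pearson_x2(observed_counts, expected_probs):
    """Pearson's chi-square statistic for C cells.

    observed_counts: cell counts in the sample being tested.
    expected_probs: model-implied cell probabilities (must sum to 1).
    """
    n = sum(observed_counts)
    return sum(
        (o - n * p) ** 2 / (n * p)
        for o, p in zip(observed_counts, expected_probs)
    )
```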
In this paper, we propose a cluster-MDS model for two-way one-mode continuous rating dissimilarity data. The model aims at partitioning the objects into classes and simultaneously representing the cluster centers in a low-dimensional space. Under the normal distribution assumption, a latent class model is developed in terms of the set of dissimilarities in a maximum likelihood framework. In each iteration, the probability that a dissimilarity belongs to each of the blocks conforming to a partition of the original dissimilarity matrix, and the remaining parameters, are estimated in a simulated-annealing-based algorithm. A model selection strategy is used to test the number of latent classes and the dimensionality of the problem. Both simulated and classical dissimilarity data are analyzed to illustrate the model.
As the literature indicates, no method is presently available which takes explicitly into account that the parameters of Lazarsfeld's latent class analysis are defined as probabilities and are therefore restricted to the interval [0, 1]. In the present paper an appropriate transform on the parameters is performed in order to satisfy this constraint, and the estimation of the transformed parameters according to the maximum likelihood principle is outlined. In the sequel, a numerical example is given for which the basis solution and the usual maximum likelihood method failed. The different results are compared and the advantages of the proposed method discussed.
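One common way to keep probability parameters inside the unit interval is to estimate them on an unconstrained scale and map them back with a logistic function. The abstract does not name the transform it uses, so the logistic choice below is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def to_probability(x: float) -> float:
    """Map an unconstrained real value to a probability via the logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Numerically stable branch for negative inputs (avoids overflow in exp).
    e = math.exp(x)
    return e / (1.0 + e)


def to_unconstrained(p: float) -> float:
    """Inverse map (logit): probability strictly between 0 and 1 to a real value."""
    return math.log(p / (1.0 - p))
```

An optimizer can then move freely over the real line in the transformed parameters, and every iterate corresponds to an admissible probability.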
Finite mixture models are widely used in the analysis of growth trajectory data to discover subgroups of individuals exhibiting similar patterns of behavior over time. In practice, trajectories are usually modeled as polynomials, which may fail to capture important features of the longitudinal pattern. Focusing on dichotomous response measures, we propose a likelihood penalization approach for parameter estimation that is able to capture a variety of nonlinear class mean trajectory shapes with higher precision than maximum likelihood estimates. We show how parameter estimation and inference for whether trajectories are time-invariant, linear time-varying, or nonlinear time-varying can be carried out for such models. To illustrate the method, we use simulation studies and data from a long-term longitudinal study of children at high risk for substance abuse.
There are two main theories with respect to the development of spelling ability: the stage model and the model of overlapping waves. In this paper exploratory model-based clustering will be used to analyze the responses of more than 3500 pupils to subsets of 245 items. To evaluate the two theories, the resulting clusters will be ordered along a developmental dimension using an external criterion. Solutions for three statistical problems will be given: (1) an algorithm that can handle large data sets and only renders non-degenerate clusters; (2) a goodness-of-fit test that is not affected by the fact that the number of possible response vectors by far outweighs the number of observed response vectors; and (3) a new technique, data expunction, that can be used to evaluate goodness-of-fit tests if the missing data mechanism is known.