A metric multidimensional scaling (MDS) procedure based on computer-subject interaction is developed, and an experiment designed to validate the procedure is presented. The interactive MDS system allows generalization of current MDS systems in two directions: (a) very large numbers of stimuli may be scaled; and (b) the scaling is performed with individual subjects, facilitating the investigation of individual as well as group processes. The experiment provided positive support for the interactive MDS system. Specifically, (a) individual data are amenable to meaningful interpretation, and they provide a tentative basis for quantitative investigation; and (b) grouped data provide meaningful interpretive and quantitative results which are equivalent to results from standard paired-comparisons methods.
It is often considered desirable to have the same ordering of the items by difficulty across different levels of the trait or ability. Such an ordering is an invariant item ordering (IIO). An IIO facilitates the interpretation of test results. For dichotomously scored items, earlier research surveyed the theory and methods of an invariant ordering in a nonparametric IRT context. Here the focus is on polytomously scored items, and both nonparametric and parametric IRT models are considered.
The absence of the IIO property in two nonparametric polytomous IRT models is discussed, and two nonparametric models that do imply an IIO are presented. A method is proposed that can be used to investigate whether empirical data imply an IIO. Furthermore, only two parametric polytomous IRT models are found to imply an IIO: the rating scale model (Andrich, 1978) and a restricted rating scale version of the graded response model (Muraki, 1990). Well-known models, such as the partial credit model (Masters, 1982) and the graded response model (Samejima, 1969), do not imply an IIO.
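The abstract does not spell out the proposed investigation method, but the basic idea of a manifest IIO check can be sketched: split respondents into low and high total-score groups and verify that the items are ordered the same way by mean score in both groups. The function names and the median split are illustrative assumptions, not the paper's procedure.

```python
def item_order(rows, items):
    """Rank items by mean score over the given respondent rows
    (ties broken by item index)."""
    means = [(sum(r[i] for r in rows) / len(rows), i) for i in items]
    return [i for _, i in sorted(means)]

def check_iio(scores):
    """scores: list of respondents, each a list of polytomous item scores.
    Splits respondents at the median total score and returns True when
    the item ordering by mean score agrees in both halves."""
    by_total = sorted(range(len(scores)), key=lambda r: sum(scores[r]))
    half = len(scores) // 2
    low = [scores[r] for r in by_total[:half]]
    high = [scores[r] for r in by_total[half:]]
    items = range(len(scores[0]))
    return item_order(low, items) == item_order(high, items)
```

In the first example below, item 0 is the easier item in both the low and high groups, so the ordering is invariant; in the second, the item ordering flips between the groups.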
Certain assumptions and procedures basic to factor analysis are examined from the point of view of the mathematician. It is demonstrated that the Hotelling method does not yield meaningful traits, and an example from the theory of gas mixtures with convertible components is cited as evidence. The justification of current methods for determining the adequacy of the reproduction of a correlation matrix by a factorial matrix is questioned, and a χ² criterion, practical only for a small matrix, is proposed. By means of a hypothetical example from geometry, it is shown that results of a Hotelling analysis are necessarily relative to the population at hand. The factorial effects of the adjunction of a “total test” to a group of tests are considered. Some of the general considerations and questions raised are pertinent to types of analysis other than the Hotelling.
A new class of parametric models that generalize the multivariate probit model and the errors-in-variables model is developed to model and analyze ordinal data. A general model structure is assumed to accommodate the information that is obtained via surrogate variables. A hybrid Gibbs sampler is developed to estimate the model parameters. To obtain a rapidly converging algorithm, the parameter expansion technique is applied to the correlation structure of the multivariate probit models. The proposed model and method of analysis are demonstrated with real data examples and simulation studies.
Results of an experiment to obtain data on the consistency of the items of two forms of an Activity Preference Blank are presented. Both Form I and Form II, which was a revised edition of Form I, were administered twice, so consistency data are available for both forms. A sub-item is said to be consistent if a high proportion of men marked it the same way, M for preferred Most and L for preferred Least, on both administrations of the test. The data of the experiment were investigated to see what happens to the consistency of sub-items when the items are changed in context, when the number of sub-items in an item is reduced, and when the time-interval between the administration and the re-administration of the test is increased. The author also gives data on the consistency of the responses made to particular combinations of sub-items and data on item consistency when all sub-item combinations are taken into consideration.
Lord and Wingersky have developed a method for computing the asymptotic variance-covariance matrix of maximum likelihood estimates for item and person parameters, under restrictions on the estimates that are needed to fix the latent scale. The method is tedious, but can be simplified for the Rasch model when one is interested only in the item parameters. This is demonstrated here under a suitable restriction on the item parameter estimates.
In many areas of research, the round-robin design is used to study interpersonal judgments and behaviors. The resulting data are analyzed with the social relations model (SRM); almost all previously published studies have used ANOVA-based or multilevel-based methods to obtain SRM parameter estimates. In this article, the SRM is embedded in the linear mixed model framework, and it is shown how restricted maximum likelihood can be employed to estimate the SRM parameters. It is also described how the effects of covariates on the SRM-specific effects can be estimated. An example is presented to illustrate the approach. We also present the results of a simulation study in which the performance of the proposed approach is compared to the ANOVA method.
Cognitive diagnosis models (CDMs) are useful statistical tools in cognitive diagnosis assessment. However, like many other latent variable models, CDMs often suffer from non-identifiability. This work gives the sufficient and necessary condition for identifiability of the basic DINA model, which not only settles the open problem in Xu and Zhang (Psychometrika 81:625–649, 2016) on the minimal requirement for identifiability, but also sheds light on the study of more general CDMs, which often include DINA as a submodel. Moreover, we show that the identifiability condition ensures consistent estimation of the model parameters. From a practical perspective, the identifiability condition depends only on the Q-matrix structure and is easy to verify, which provides a guideline for designing statistically valid and estimable cognitive diagnosis tests.
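The full identifiability condition is given in the paper itself; one well-known ingredient in this literature is completeness of the Q-matrix, i.e., that for every attribute there is an item requiring that attribute alone. A minimal check of that single requirement (an illustration of a Q-matrix-only check, not the paper's full condition) might look like:

```python
def is_complete(Q):
    """Q: 0/1 matrix, rows = items, columns = attributes.
    Returns True when, for every attribute k, some item requires
    attribute k alone (a row equal to the k-th unit vector)."""
    K = len(Q[0])
    rows = {tuple(row) for row in Q}
    return all(tuple(int(i == k) for i in range(K)) in rows
               for k in range(K))
```

For example, a Q-matrix with single-attribute items for both attributes passes, while one whose second attribute is only ever measured jointly with the first does not.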
Necessary and sufficient conditions for the existence and uniqueness of a solution of the so-called “unconditional” (UML) and the “conditional” (CML) maximum-likelihood estimation equations in the dichotomous Rasch model are given. The basic critical condition is essentially the same for UML and CML estimation. For complete data matrices A, it is formulated both as a structural property of A and in terms of the sufficient marginal sums. In the case of incomplete data, the condition is equivalent to complete connectedness of a certain directed graph. It is shown how to apply the results in practical uses of the Rasch model.
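The graph condition can be illustrated for a complete 0/1 data matrix. One common formulation (assumed here as an illustrative reading, not a verbatim reproduction of the paper's construction) puts a directed edge i → j on the items whenever some respondent solved item i but not item j, and requires the resulting digraph to be strongly connected:

```python
def strongly_connected(n, edges):
    """Strong connectivity via forward and reverse reachability from node 0."""
    def reach(adj):
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen
    fwd = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for i, j in edges:
        fwd[i].append(j)
        rev[j].append(i)
    return len(reach(fwd)) == n and len(reach(rev)) == n

def cml_solution_exists(data):
    """data: 0/1 matrix, rows = respondents, columns = items.
    Edge i -> j when some respondent answered i correctly and j incorrectly."""
    n = len(data[0])
    edges = {(i, j) for row in data for i in range(n) for j in range(n)
             if i != j and row[i] == 1 and row[j] == 0}
    return strongly_connected(n, edges)
```

Intuitively, the second example below fails because the last item is solved by every respondent, so no edge ever points into the rest of the graph from it being missed, and its difficulty estimate would diverge.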
Latent class models for cognitive diagnosis have been developed to classify examinees into one of the 2K attribute profiles arising from a K-dimensional vector of binary skill indicators. These models recognize that response patterns tend to deviate from the ideal responses that would arise if skills and items generated item responses through a purely deterministic conjunctive process. An alternative to employing these latent class models is to minimize the distance between observed item response patterns and ideal response patterns, in a nonparametric fashion that utilizes no stochastic terms for these deviations. Theorems are presented that show the consistency of this approach, when the true model is one of several common latent class models for cognitive diagnosis. Consistency of classification is independent of sample size, because no model parameters need to be estimated. Simultaneous consistency for a large group of subjects can also be shown given some conditions on how sample size and test length grow with one another.
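Under the conjunctive (DINA-type) ideal-response rule, the nonparametric classifier described above can be sketched directly: enumerate the 2^K attribute profiles, compute each profile's ideal response pattern from the Q-matrix, and pick the profile whose ideal pattern has minimal Hamming distance to the observed responses. This is an illustrative sketch (ties are broken by enumeration order), not the authors' implementation.

```python
from itertools import product

def ideal_response(alpha, Q):
    """Conjunctive ideal response: item j is answered correctly iff the
    profile alpha possesses every attribute that item j requires."""
    return [int(all(a >= q for a, q in zip(alpha, qrow))) for qrow in Q]

def classify(x, Q):
    """Return the attribute profile whose ideal response pattern is at
    minimal Hamming distance from the observed response vector x."""
    K = len(Q[0])
    def hamming(alpha):
        return sum(xi != ei for xi, ei in zip(x, ideal_response(alpha, Q)))
    return min(product((0, 1), repeat=K), key=hamming)
```

Note that no model parameters appear anywhere in the classifier, which is the point of the consistency results: classification depends only on the Q-matrix and the observed pattern.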
The relationship between linear factor models and latent profile models is addressed within the context of maximum likelihood estimation based on the joint distribution of the manifest variables. Although the two models are well known to imply equivalent covariance decompositions, in general they do not yield equivalent estimates of the unconditional covariances. In particular, a 2-class latent profile model with Gaussian components underestimates the observed covariances but not the variances, when the data are consistent with a unidimensional Gaussian factor model. In explanation of this phenomenon we provide some results relating the unconditional covariances to the goodness of fit of the latent profile model, and to its excess multivariate kurtosis. The analysis also leads to some useful parameter restrictions related to symmetry.
Reliability captures the influence of error on a measurement and, in the classical setting, is defined as one minus the ratio of the error variance to the total variance. Laenen, Alonso, and Molenberghs (Psychometrika 73:443–448, 2007) proposed an axiomatic definition of reliability and introduced the R_T coefficient, a measure of reliability extending the classical approach to a more general longitudinal scenario. The R_T coefficient can be interpreted as the average reliability over different time points and can also be calculated for each time point separately. In this paper, we introduce a new and complementary measure, the so-called R_Λ, which implies a new way of thinking about reliability. In a longitudinal context, each measurement brings additional knowledge and leads to more reliable information. The R_Λ captures this intuitive idea and expresses the reliability of the entire longitudinal sequence, in contrast to an average or occasion-specific measure. We study the measure’s properties using both theoretical arguments and simulations, establish its connections with previous proposals, and elucidate its performance in a real case study.
A procedure for computing the power of the likelihood ratio test used in the context of covariance structure analysis is derived. The procedure uses statistics associated with the standard output of the computer programs commonly used, and assumes that a specific alternative value of the parameter vector is specified. Using the noncentral chi-square distribution, the power of the test is approximated by the asymptotic power under a sequence of local alternatives. The procedure is illustrated by an example. A Monte Carlo experiment also shows how good the approximation is in a specific case.
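The key numerical step in such a power computation is evaluating a noncentral chi-square tail probability at the central distribution's critical value. A stdlib-only Monte Carlo stand-in for that evaluation (the function name and the simulation approach are illustrative; the paper's procedure works from program output rather than simulation):

```python
import random

def ncchi2_power(df, noncentrality, crit, n_sims=200_000, seed=0):
    """Monte Carlo approximation of P(X > crit) for a noncentral
    chi-square variable X with the given degrees of freedom and
    noncentrality parameter. X is simulated as a sum of df squared
    normals, one of which has mean sqrt(noncentrality)."""
    rng = random.Random(seed)
    mu = noncentrality ** 0.5
    hits = 0
    for _ in range(n_sims):
        x = rng.gauss(mu, 1.0) ** 2
        for _ in range(df - 1):
            x += rng.gauss(0.0, 1.0) ** 2
        hits += x > crit
    return hits / n_sims
```

For one degree of freedom, noncentrality 10, and crit = 3.841 (the usual 5% critical value of the central chi-square with 1 df), this returns roughly 0.885, matching the normal-theory calculation P(|Z + √10| > 1.96).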
The present note illustrates the application of Lancaster & Hamdan's [1964] polychoric series method for estimating the correlation coefficient in contingency tables. A simple format for the calculations involved, using a desk calculator, is suggested and then applied to a specific 3 × 3 contingency table.
The steady state of a simple reaction system has been shown to have some of the properties of a psychophysical discrimination system, including the possibility of deducing a generalized Weber-Fechner Law, both in integral form and in difference form. The Weber ratio so deduced is not constant, and its dependence on stimulus intensity is exhibited. The dependence of the difference limen on the internal threshold is discussed; it is found that in general there is a finite value of this threshold for which response is impossible. This critical threshold is lower for higher values of the reference stimulus intensity. Similarly, it is shown that the difference limen and the Weber ratio, for a fixed value of the threshold, become infinite (i.e., discrimination is impossible) for a value of the stimulus intensity which in general is finite.
In this paper, motivated by aspects of preregistration plans, we discuss issues that we believe have important implications for how experiments are designed. To make possible valid inferences about the effects of the treatment in question, we first illustrate how economic theory can help allocate subjects across treatments in a manner that boosts statistical power. Using data from two laboratory experiments in which subject behavior deviated sharply from theory, we show that the ex-post subject allocation that maximizes statistical power is closer to these ex-ante calculations than to traditional designs that balance the number of subjects across treatments. Finally, we call for increased attention to (i) the appropriate levels of the Type I and Type II errors for power calculations, and (ii) how experimenters achieve balance, in part by properly handling over-subscription to sessions.