On the basis of previous mathematico-biophysical studies, a theory of visual perception is outlined. The theory is applied to the aesthetic values of polygonal patterns and found to be in fair agreement with experimental results.
On the basis of the idea of common elements to account for similarity, Restle [1959] developed a set-theoretical model of the generation of similarity judgments. Restle differentiates between his “more qualitative discussion” (p. 207), which regards stimuli as sets of elements and uses no geometric concepts or assumptions, and metric quantitative developments of similarity, which regard stimuli as points in a geometric space. Among the latter are the additive difference model of Beals, Krantz, & Tversky [1968] and Tversky & Krantz [1970].
Despite these differences, the two models lead to the same conclusions about the characteristics of similarity judgments.
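As a concrete, hedged illustration of the common-elements idea, here is a minimal sketch in which stimuli are finite feature sets and the measure is simple cardinality; the feature sets are invented, and this is not Restle's exact formulation:

```python
# Minimal sketch of a common-elements account of similarity.
# Stimuli are modeled as finite sets of discrete elements; dissimilarity
# is taken as the measure (here: cardinality) of the symmetric difference.
# The specific stimuli below are invented for illustration.

def dissimilarity(a: set, b: set) -> int:
    """Measure of the elements belonging to exactly one of the two stimuli."""
    return len(a ^ b)

def similarity(a: set, b: set) -> float:
    """Share of common elements relative to all elements involved."""
    return len(a & b) / len(a | b)

square    = {"closed", "four_sides", "equal_sides", "right_angles"}
rectangle = {"closed", "four_sides", "right_angles"}
triangle  = {"closed", "three_sides"}

print(dissimilarity(square, rectangle))  # one distinguishing element
print(similarity(square, triangle))      # small overlap
```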
One-absorbing barrier random walks arising from Luce's nonlinear beta model for learning and a linear commuting-operator model (called the alpha model) are considered. Functional equations for various statistics are derived from the branching processes defined by the two models. Solutions to general functional equations, satisfied by statistics of the alpha and beta models, are obtained. The methods presented have application to other learning models.
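To make the random-walk view concrete, here is a toy simulation of a beta-model-like learner. It is a sketch under stated assumptions, not Luce's exact model: the odds of a correct response are multiplied by one constant after correct trials and by another after errors, and the walk is treated as absorbed once the success probability passes a fixed threshold.

```python
import random

def beta_model_errors(v0=1.0, b_correct=1.6, b_error=1.2,
                      barrier=0.999, rng=None):
    """One learner under a beta-model-like multiplicative update.

    v is the odds of a correct response, so p = v / (1 + v). The odds
    are multiplied by b_correct after a correct trial and by b_error
    after an error, making log v a random walk that we treat as
    absorbed once p passes the barrier. Returns the total error count.
    """
    rng = rng or random.Random(0)
    v, errors = v0, 0
    while v / (1 + v) < barrier:
        if rng.random() < v / (1 + v):
            v *= b_correct          # correct response: larger odds boost
        else:
            errors += 1
            v *= b_error            # error: smaller odds boost
    return errors

rng = random.Random(1)
runs = [beta_model_errors(rng=rng) for _ in range(5000)]
print(sum(runs) / len(runs))        # Monte Carlo mean of total errors
```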
Change scores obtained in pretest–posttest designs are important for evaluating treatment effectiveness and for assessing change of individual test scores in psychological research. However, over the years the use of change scores has raised much controversy. In this article, from a multilevel perspective, we provide a structured treatise on several persistent negative beliefs about change scores and show that these beliefs originated from the confounding of the effects of within-person change on change-score reliability and between-person change differences. We argue that psychometric properties of change scores, such as reliability and measurement precision, should be treated at suitable levels within a multilevel framework. We show that, when examined at suitable levels within such a framework, the negative beliefs about change scores can be renounced convincingly. Finally, we summarize the conclusions about change scores to dispel the myths and to promote the potential and practical usefulness of change scores.
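The confounding can be made concrete with a small simulation (a sketch with invented variances, not the paper's analysis): when true change is identical for everyone, the change score's reliability is near zero even though everyone truly changed, and reliability rises as between-person differences in change grow.

```python
import numpy as np

rng = np.random.default_rng(0)

def change_score_reliability(sd_true_change, n=100_000, sd_error=0.5):
    """Reliability of the change score D = post - pre, estimated as
    var(true change) / var(observed change). All variances are invented."""
    true_pre = rng.normal(0.0, 1.0, n)
    true_change = rng.normal(0.5, sd_true_change, n)   # everyone changes by ~0.5
    pre = true_pre + rng.normal(0.0, sd_error, n)
    post = true_pre + true_change + rng.normal(0.0, sd_error, n)
    d = post - pre
    return true_change.var() / d.var()

# Reliability is ~0 without between-person change differences and grows with them.
for sd in (0.0, 0.5, 1.0):
    print(sd, round(change_score_reliability(sd), 2))
```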
Focusing on Michael Silverstein’s account of relationships between “microcontexts of interaction” and the “macrosociological,” this article takes up his suggestion that news reporting provides particularly clear examples of such links. Examining a mundane ABC World News report on changing recommendations for vitamin intake, it analyzes how leading physician-journalist Richard Besser constructs a ritual center of medical semiosis, projects it as inaccessible to laypersons, and models a circulatory process that requires highly constrained forms of communication. Ethnography in newsrooms, clinical spaces, public health offices, and elsewhere suggests how notions of (1) a ritual center that produces medical knowledge, (2) a primordial space of doctor-patient interaction that affords limited, highly regulated access to laypersons, and (3) what are construed as processes of communication require the continual making of communicable models that attempt to separate projected first and second indexical orders and, just as importantly, generate indexical disorders that create anxiety and seem to require assistance from physician-journalist guides.
In this paper, I will review some aspects of psychometric projects that I have been involved in, emphasizing the nature of the work of the psychometricians involved, especially the balance between the statistical and scientific elements of that work. The intent is to understand where psychometrics, as a discipline, has been and where it might be headed, in part at least, by considering one particular journey (my own). In contemplating this, I also look to psychometrics journals to see how psychometricians represent themselves to themselves, and in a complementary way, look to substantive journals to see how psychometrics is represented there (or perhaps, not represented, as the case may be). I present a series of questions to consider what the appropriate foci of the psychometric discipline should be. As an example, I present one recent project at the end, where the roles of the psychometricians and the substantive researchers have had to become intertwined in order to make satisfactory progress. In the conclusion I discuss the consequences of such a view for the future of psychometrics.
The axioms of additive conjoint measurement provide a means of testing the hypothesis that testing data can be placed onto a scale with equal-interval properties. However, the axioms are difficult to verify given that item responses may be subject to measurement error. A Bayesian method exists for imposing order restrictions from additive conjoint measurement while estimating the probability of a correct response. In this study an improved version of that methodology is evaluated via simulation. The approach is then applied to data from a reading assessment intentionally designed to support an equal-interval scaling.
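To make the order restrictions concrete, here is a hedged sketch that checks two additive-conjoint axioms, independence (single cancellation) and double cancellation, on a matrix of correct-response probabilities; the matrix is invented, and the paper's Bayesian estimation step is not reproduced:

```python
from itertools import combinations
import numpy as np

def satisfies_independence(P):
    """Single cancellation: P must be non-decreasing along rows and columns."""
    return bool(np.all(np.diff(P, axis=0) >= 0) and
                np.all(np.diff(P, axis=1) >= 0))

def satisfies_double_cancellation(P):
    """Double cancellation checked on every 3 x 3 submatrix of P."""
    rows, cols = P.shape
    for r in combinations(range(rows), 3):
        for c in combinations(range(cols), 3):
            S = P[np.ix_(r, c)]
            # If (a2,b1) >= (a1,b2) and (a3,b2) >= (a2,b3),
            # then (a3,b1) >= (a1,b3) must hold.
            if S[1, 0] >= S[0, 1] and S[2, 1] >= S[1, 2] and S[2, 0] < S[0, 2]:
                return False
    return True

# Invented matrix of correct-response probabilities
# (rows ordered by ability, columns by item easiness).
P = np.array([[0.2, 0.4, 0.6],
              [0.3, 0.5, 0.7],
              [0.5, 0.7, 0.9]])
print(satisfies_independence(P), satisfies_double_cancellation(P))
```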
The following problem is considered: Given that the frequency distribution of the errors of measurement is known, determine or estimate the distribution of true scores from the distribution of observed scores for a group of examinees. Typically this problem does not have a unique solution. However, if the true-score distribution is “smooth,” then any two smooth solutions to the problem will differ little from each other. Methods for finding smooth solutions are developed a) for a population and b) for a sample of examinees. The results of a number of tryouts on actual test data are summarized.
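A minimal sketch of the deconvolution idea follows, assuming binomial measurement error and a second-difference roughness penalty; the grid, penalty weight, and simulated data are arbitrary illustrative choices, not the paper's method:

```python
import numpy as np
from scipy.stats import binom
from scipy.optimize import minimize

# Sketch: recover a smooth true-score density g on a grid of proportion-
# correct values from an observed-score histogram, assuming binomial
# measurement error. Grid size and penalty weight are arbitrary choices.
n_items = 20
grid = np.linspace(0.01, 0.99, 40)
E = binom.pmf(np.arange(n_items + 1)[:, None], n_items, grid[None, :])

rng = np.random.default_rng(1)
true_scores = rng.beta(5, 3, size=1000)        # simulated "true" proportions
observed = rng.binomial(n_items, true_scores)
h = np.bincount(observed, minlength=n_items + 1) / 1000.0

def objective(w, lam=5.0):
    g = np.exp(w - w.max()); g /= g.sum()      # density on the true-score grid
    fit = np.sum((E @ g - h) ** 2)             # match the observed distribution
    rough = np.sum(np.diff(g, 2) ** 2)         # smoothness (second-difference) penalty
    return fit + lam * rough

res = minimize(objective, np.zeros(grid.size), method="L-BFGS-B")
g_hat = np.exp(res.x - res.x.max()); g_hat /= g_hat.sum()
print(float(grid @ g_hat))                     # mean of the estimated true-score density
```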
Protocol analysis, in the form of concurrent verbal ‘thinking aloud’ reports, is a method of collecting and analyzing data about cognitive processes. This approach can help economists evaluate competing theories of behavior and categorize heterogeneity in thinking patterns. As a proof of concept, I tested this method in the context of a guessing game. I found that concurrent think-aloud protocols can inform us about individuals’ thought processes without affecting decisions. The method allowed me to identify game-theoretic thinking and heterogeneous approaches to unravelling the guessing game. The think-aloud protocol is inexpensive and scalable, and it is a useful tool for identifying empirical regularities regarding decision processes.
A fundamental assumption of most IRT models is that items measure the same unidimensional latent construct. For the polytomous Rasch model, two ways of testing this assumption against specific multidimensional alternatives are discussed. The first is a marginal approach assuming a multidimensional parametric latent variable distribution; the second is a conditional approach that makes no distributional assumptions about the latent variable. The second approach generalizes the Martin-Löf test for the dichotomous Rasch model in two ways: to polytomous items and to a test against an alternative that may have more than two dimensions. A study on occupational health is used to motivate and illustrate the methods.
Current psychometric models of choice behavior are strongly influenced by Thurstone’s (1927, 1931) experimental and statistical work on measuring and scaling preferences. Aided by advances in computational techniques, choice models can now accommodate a wide range of different data types and sources of preference variability among respondents induced by such diverse factors as person-specific choice sets or different functional forms for the underlying utility representations. At the same time, these models are increasingly challenged by behavioral work demonstrating the prevalence of choice behavior that is not consistent with the underlying assumptions of these models. I discuss new modeling avenues that can account for such seemingly inconsistent choice behavior and conclude by emphasizing the interdisciplinary frontiers in the study of choice behavior and the resulting challenges for psychometricians.
A model for direct multidimensional ratio scaling is presented, based on the concepts “common” and “difference” of the “halos” of two percepts. Measures of halos and their differences are proportional to the lengths of the corresponding percept vectors and their distance in subjective space. Ekman-type scaling judgements are assumed to reflect the ratio of the measure of the common to the measure of the standard's halo. The model is expected to yield results in line with those of distance models of multidimensional ratio scaling, since negative scalar products of percept vectors are admitted.
In oligopoly, imitating the most successful competitor yields very competitive outcomes. This theoretical prediction has been confirmed experimentally by a number of studies. A recent paper by Friedman et al. (J Econ Theory 155:185–205, 2015) qualifies those results in an interesting way: While they replicate the very competitive results for the first 25–50 periods, they show that when using a much longer time horizon of 1200 periods, results slowly turn to more and more collusive outcomes. We replicate their result for duopolies. However, with 4 firms, none of our oligopolies becomes permanently collusive. Instead, the average quantity always stays above the Cournot–Nash equilibrium quantity. Thus, it seems that “four remain many” even with 1200 periods.
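A hedged sketch of the imitation dynamic follows; the linear demand, cost, experimentation rule, and horizon below are invented for illustration and are not the design of Friedman et al.:

```python
import random

def imitation_dynamics(n_firms=4, periods=1200, a=100.0, c=1.0,
                       eps=0.05, seed=0):
    """'Imitate the best' in a linear Cournot market.

    Demand p = max(0, a - Q), constant unit cost c. Each period every
    firm copies the quantity of the most profitable firm from the
    previous period and, with probability eps, experiments by
    perturbing it. Returns the average total quantity over the last
    100 periods.
    """
    rng = random.Random(seed)
    q = [rng.uniform(0, a / n_firms) for _ in range(n_firms)]
    tail = []
    for t in range(periods):
        p = max(0.0, a - sum(q))
        profits = [(p - c) * qi for qi in q]
        best = q[profits.index(max(profits))]       # most successful competitor
        q = [max(0.0, best + (rng.uniform(-1, 1) if rng.random() < eps else 0.0))
             for _ in range(n_firms)]
        if t >= periods - 100:
            tail.append(sum(q))
    return sum(tail) / len(tail)

# Reference: Cournot-Nash total quantity is n (a - c) / (n + 1) = 79.2 here;
# imitation tends to push total quantity above it, toward competitive levels.
print(imitation_dynamics(), 4 * 99.0 / 5)
```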
The Rasch model is an item analysis model with logistic item characteristic curves of equal slope, i.e. with constant item discriminating powers. The proposed goodness of fit test is based on a comparison between difficulties estimated from different score groups and over-all estimates.
Based on the within-score-group estimates and the over-all estimates of item difficulties, a conditional likelihood ratio is formed. It is shown that −2 times the logarithm of this ratio is χ²-distributed when the Rasch model is true.
The power of the proposed goodness of fit test is discussed for alternative models with logistic item characteristic curves but unequal discriminating powers, and is illustrated with items from a scholastic aptitude test.
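A hedged sketch of the test's mechanics for the dichotomous case (the simulated data, the two score groups, and the optimizer settings are illustrative choices, not the paper's): estimate item difficulties by conditional maximum likelihood overall and within score groups, then form Z = −2 log λ, which is approximately χ²-distributed under the model.

```python
import numpy as np
from scipy.optimize import minimize

def esf(eps):
    """Elementary symmetric functions gamma_0..gamma_k of the easiness
    parameters eps, built up by polynomial multiplication."""
    g = np.array([1.0])
    for e in eps:
        g = np.convolve(g, np.array([1.0, e]))
    return g

def neg_cll(beta_free, X):
    """Negative conditional log-likelihood of the dichotomous Rasch model
    given raw scores; the first difficulty is fixed at 0 for identification."""
    beta = np.concatenate(([0.0], beta_free))
    g = esf(np.exp(-beta))
    r = X.sum(axis=1).astype(int)
    return (X @ beta).sum() + np.log(g[r]).sum()

def fit(X):
    """Maximized conditional log-likelihood for the sample X."""
    res = minimize(neg_cll, np.zeros(X.shape[1] - 1), args=(X,), method="BFGS")
    return -res.fun

# Simulated Rasch data; abilities and difficulties are invented.
rng = np.random.default_rng(0)
theta = rng.normal(0, 1, 500)
beta = np.linspace(-1, 1, 5)
X = (rng.uniform(size=(500, 5)) < 1 / (1 + np.exp(beta - theta[:, None]))).astype(float)

# Drop extreme scores, then split into low- and high-score groups.
r = X.sum(axis=1)
X = X[(r > 0) & (r < 5)]
r = X.sum(axis=1)
groups = [X[r <= 2], X[r >= 3]]

Z = 2 * (sum(fit(g) for g in groups) - fit(X))
print(Z)  # approximately chi-square with (5 - 1) * (2 - 1) = 4 df if the model holds
```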
A logistic model developed by Birnbaum was tested in two ways. First, plots of proportions of subjects in different score categories were examined for consistency with the assumption of a logistic trace line, and especially for departures from the logistic which seemed due to guessing in multiple choice items. The results showed that guessing seemed to have little effect. Second, an attempt was made to predict the obtained score distributions of samples of subjects on six tests from item parameters estimated on independent samples. The fits were good in all cases, despite considerable differences between the tests, and some extremely odd distributions.
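The prediction step can be sketched as follows (a minimal sketch: the score recursion over items and the normal quadrature are standard devices, and the three-parameter logistic form and item parameters below are assumptions for illustration, not the original computations; set c = 0 for the no-guessing case):

```python
import numpy as np

def score_distribution(a, b, c, nodes=81):
    """Predicted number-correct distribution under a 3PL logistic model.

    Computes P(score = s | theta) with the usual convolution recursion
    over items, then averages over a discretized N(0,1) ability grid.
    All item parameters below are invented for illustration.
    """
    theta = np.linspace(-4, 4, nodes)
    w = np.exp(-theta ** 2 / 2)
    w /= w.sum()                                 # discrete N(0,1) weights
    dist = np.zeros((nodes, len(b) + 1))
    dist[:, 0] = 1.0                             # zero items processed so far
    for ai, bi, ci in zip(a, b, c):
        p = ci + (1 - ci) / (1 + np.exp(-ai * (theta - bi)))
        new = dist * (1 - p)[:, None]            # item answered incorrectly
        new[:, 1:] += dist[:, :-1] * p[:, None]  # item answered correctly
        dist = new
    return w @ dist                              # marginal score distribution

a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])          # slopes
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])        # difficulties
c = np.array([0.2] * 5)                          # lower asymptotes (guessing)
f = score_distribution(a, b, c)
print(f.round(3), float(f.sum()))                # probabilities sum to 1
```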
Consider any scoring procedure for determining whether an examinee knows the answer to a test item. Let x_i = 1 if a correct decision is made about whether the examinee knows the ith item, and x_i = 0 otherwise. The k-out-of-n reliability of a test is ρ_k = Pr(Σ x_i ≥ k). That is, ρ_k is the probability of making at least k correct decisions for a typical (randomly sampled) examinee. This paper proposes an approximation of ρ_k that can be estimated with an answer-until-correct test. The paper also suggests a scoring procedure that might be used when ρ_k is judged to be too small under a conventional scoring rule, where it is decided that an examinee knows an item if and only if the correct response is given.
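A Monte Carlo sketch of ρ_k (assuming, for illustration only, a fixed per-item probability of a correct decision; the numbers are invented):

```python
import numpy as np

def k_out_of_n_reliability(p_correct_decision, k, reps=100_000, seed=0):
    """Monte Carlo estimate of rho_k = Pr(sum of x_i >= k).

    p_correct_decision[i] is the probability that the scoring procedure
    decides correctly whether the examinee knows item i (values invented).
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(size=(reps, len(p_correct_decision))) < p_correct_decision
    return float((x.sum(axis=1) >= k).mean())

p = np.array([0.95, 0.90, 0.85, 0.80, 0.75])   # invented decision accuracies
print(k_out_of_n_reliability(p, k=4))          # rho_4 for this 5-item test
```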
A coefficient of association τ′ is described for a contingency table containing data classified into two sets of ordered categories. Within each of the two sets the number of categories or the number of cases in each category need not be the same. τ′ = +1 for perfect positive association, and τ′ has an expectation of 0 under chance association. In many cases τ′ also has −1 as a lower limit. The limitations of Kendall's τ_a and τ_b and Stuart's τ_c are discussed, as is the identity of these coefficients with τ′ under certain conditions. A computational procedure for τ′ is given.
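Since the computational procedure for τ′ itself is given in the paper rather than reproduced here, the sketch below computes the underlying concordant/discordant pair counts and the related Stuart's τ_c from a contingency table of ordered categories (the table is invented):

```python
import numpy as np

def concordance_counts(T):
    """Concordant (P) and discordant (Q) pair counts for an r x c table
    of counts whose row and column categories are ordered."""
    P = Q = 0
    rows, cols = T.shape
    for i in range(rows):
        for j in range(cols):
            P += T[i, j] * T[i + 1:, j + 1:].sum()  # both orderings agree
            Q += T[i, j] * T[i + 1:, :j].sum()      # orderings disagree
    return P, Q

def stuart_tau_c(T):
    """Stuart's tau_c = 2 m (P - Q) / (n^2 (m - 1)), m = min(rows, cols)."""
    P, Q = concordance_counts(T)
    n, m = T.sum(), min(T.shape)
    return 2 * m * (P - Q) / (n ** 2 * (m - 1))

# Invented 3 x 4 table of ordered ratings.
T = np.array([[20, 10,  5,  1],
              [ 8, 15, 10,  4],
              [ 2,  6, 12, 18]])
print(round(stuart_tau_c(T), 3))
```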
A remarkable difference between the concept of rank for matrices and that for three-way arrays has to do with the occurrence of non-maximal rank. The set of n × n matrices that have a rank less than n has zero volume. Kruskal pointed out that a 2 × 2 × 2 array has rank three or less, and that the subsets of those 2 × 2 × 2 arrays for which the rank is two or three both have positive volume. These subsets can be distinguished by the roots of a certain polynomial. The present paper generalizes Kruskal's results to 2 × n × n arrays. Incidentally, it is shown that two n × n matrices can be diagonalized simultaneously with positive probability.
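A sketch of the polynomial criterion for 2 × 2 × 2 arrays: writing the array as two frontal slices A1 and A2, the rank is (generically) 2 when det(x·A1 + A2), a quadratic in x, has distinct real roots, and 3 when the roots are complex. The Monte Carlo below simply shows that both cases occur with positive frequency for Gaussian entries; the specific sampling scheme is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def generic_rank_2x2x2(A1, A2):
    """Generic rank of the 2 x 2 x 2 array with frontal slices A1, A2.

    The roots of det(x * A1 + A2), a quadratic in x, decide the rank:
    distinct real roots -> rank 2, complex roots -> rank 3.
    """
    a = np.linalg.det(A1)                   # coefficient of x^2
    c = np.linalg.det(A2)                   # constant term
    b = np.linalg.det(A1 + A2) - a - c      # coefficient of x
    return 2 if b * b - 4 * a * c > 0 else 3

ranks = [generic_rank_2x2x2(rng.normal(size=(2, 2)), rng.normal(size=(2, 2)))
         for _ in range(100_000)]
print(ranks.count(2) / len(ranks))          # both ranks occur with positive volume
```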