Fully Bayesian estimation of item response theory models with logistic link functions suffers from low computational efficiency because the posterior density functions do not have known forms. To improve computational efficiency, this paper proposes a Bayesian estimation method that adopts a new data-augmentation strategy for uni- and multidimensional IRT models. The strategy is based on the Pólya–Gamma family of distributions, which provides a closed-form posterior distribution for logistic-based models. An overview of the Pólya–Gamma distribution is given within a logistic regression framework. In addition, we detail how the conditional distributions of the IRT models are derived, how the Pólya–Gamma distributions are incorporated into those conditional distributions to construct Bayesian samplers, and how random draws from the samplers achieve faster convergence. Simulation studies and applications to real datasets demonstrate the efficiency and utility of the proposed method.
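As a hedged illustration of the augmentation idea (not the IRT samplers derived in the paper), the sketch below implements a Pólya–Gamma Gibbs sampler for plain Bayesian logistic regression, following the Polson–Scott–Windle conditionals. The PG(1, c) draws use a truncated version of the infinite sum-of-gammas representation, so they are approximate, and all function names are ours.

```python
import numpy as np

def sample_pg_approx(c, rng, n_terms=200):
    """Approximate draw from PG(1, c) via a truncated version of its
    infinite sum-of-gammas representation; exact samplers exist in
    dedicated packages."""
    k = np.arange(1, n_terms + 1)
    g = rng.standard_exponential(size=(c.size, n_terms))   # Gamma(1, 1) draws
    denom = (k - 0.5) ** 2 + (c[:, None] / (2.0 * np.pi)) ** 2
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

def pg_gibbs_logistic(X, y, n_iter=2000, prior_var=10.0, seed=0):
    """Gibbs sampler for y_i ~ Bernoulli(logit^{-1}(x_i' beta)) with a
    N(0, prior_var * I) prior on beta, using Polya-Gamma augmentation."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    B_inv = np.eye(p) / prior_var          # prior precision
    kappa = y - 0.5                        # y_i - 1/2
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for it in range(n_iter):
        omega = sample_pg_approx(X @ beta, rng)      # omega_i | beta ~ PG(1, x_i'beta)
        V = np.linalg.inv(X.T @ (omega[:, None] * X) + B_inv)
        m = V @ (X.T @ kappa)                        # prior mean is zero
        beta = rng.multivariate_normal(m, V)         # beta | omega, y ~ N(m, V)
        draws[it] = beta
    return draws
```

In practice one would substitute an exact Pólya–Gamma sampler from a dedicated library and extend these conditionals to the item and person parameters as the paper derives.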
Quantitative psychology is concerned with the development and application of mathematical models in the behavioral sciences. Over time, models have become more complex, a consequence of the increasing complexity of research designs and experimental data, which is itself a consequence of the utility of mathematical models in these sciences. As models have become more elaborate, the problems of estimating them have become increasingly challenging. This paper gives an introduction to a computing tool called automatic differentiation that is useful for calculating the derivatives needed to estimate a model. As its name implies, automatic differentiation works in a routine way to produce derivatives accurately and quickly. Because so many features of model development require derivatives, the method has considerable potential in psychometric work. This paper reviews several examples to demonstrate how the methodology can be applied.
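As a toy illustration of the idea (not the software tools the paper reviews), the sketch below implements forward-mode automatic differentiation with dual numbers: propagating a "derivative part" through ordinary arithmetic yields exact derivatives without symbolic manipulation or finite differences.

```python
import math

class Dual:
    """Dual number a + b*eps with eps**2 = 0; the eps coefficient carries an
    exact derivative through every arithmetic operation."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def _wrap(self, other):
        return other if isinstance(other, Dual) else Dual(other)
    def __add__(self, other):
        o = self._wrap(other)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, other):
        o = self._wrap(other)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)
    __rmul__ = __mul__
    def __truediv__(self, other):
        o = self._wrap(other)
        return Dual(self.val / o.val,
                    (self.dot * o.val - self.val * o.dot) / o.val ** 2)
    def __rtruediv__(self, other):
        return self._wrap(other) / self
    def exp(self):
        e = math.exp(self.val)
        return Dual(e, e * self.dot)

def derivative(f, x):
    """Exact df/dx at x: seed the dual part with 1 and read it off the output."""
    return f(Dual(x, 1.0)).dot

# Example: the logistic function and its derivative at 0 (exactly 0.25).
logistic = lambda x: 1.0 / (1.0 + (x * -1.0).exp())
print(derivative(logistic, 0.0))
```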
Garbarino et al. (J Econ Sci Assoc. https://doi.org/10.1007/s40881-018-0055-4, 2018) describe a new method to calculate the probability distribution of the proportion of lies told in “coin flip” style experiments. I show that their estimates and confidence intervals are flawed. I demonstrate two better ways to estimate the probability distribution of what we really care about—the proportion of liars—and I provide R software to do this.
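For concreteness, here is a minimal sketch of one common formulation of such designs (each subject privately flips a fair coin and reports the outcome, and a "liar" always reports the paying outcome), with a grid posterior over the proportion of liars under a uniform prior. It illustrates the estimand only; it is not the estimators or the R software provided with the comment.

```python
import numpy as np
from scipy.stats import binom

def liar_posterior(n_subjects, n_reported_wins, grid_size=1001):
    """Grid posterior over the proportion of liars, lam, under a uniform prior.
    A liar reports the paying outcome regardless of the flip; an honest subject
    reports it with probability 1/2, so P(report win) = (1 + lam) / 2."""
    lam = np.linspace(0.0, 1.0, grid_size)
    lik = binom.pmf(n_reported_wins, n_subjects, (1.0 + lam) / 2.0)
    post = lik / lik.sum()                       # normalised over the grid
    return lam, post, (lam * post).sum()         # grid, posterior, posterior mean

# Example: 100 subjects, 70 report the paying outcome; the posterior mean of
# the proportion of liars lies near 2 * 0.7 - 1 = 0.4.
lam, post, post_mean = liar_posterior(100, 70)
```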
Methodological development of the model-implied instrumental variable (MIIV) estimation framework has proved fruitful over the last three decades. Major milestones include Bollen’s (Psychometrika 61(1):109–121, 1996) original development of the MIIV estimator and its robustness properties for continuous endogenous variable SEMs, the extension of the MIIV estimator to ordered categorical endogenous variables (Bollen and Maydeu-Olivares in Psychometrika 72(3):309, 2007), and the introduction of a generalized method of moments estimator (Bollen et al., in Psychometrika 79(1):20–50, 2014). This paper furthers these developments by making several unique contributions not present in the prior literature: (1) we use matrix calculus to derive the analytic derivatives of the PIV estimator, (2) we extend the PIV estimator to apply to any mixture of binary, ordinal, and continuous variables, (3) we generalize the PIV model to include intercepts and means, (4) we devise a method to input known threshold values for ordinal observed variables, and (5) we enable a general parameterization that permits the estimation of means, variances, and covariances of the underlying variables to use as input into a SEM analysis with PIV. An empirical example illustrates a mixture of continuous variables and ordinal variables with fixed thresholds. We also include a simulation study to compare the performance of this novel estimator to WLSMV.
Nonlinear random coefficient models (NRCMs) for continuous longitudinal data are often used to examine individual behaviors that display nonlinear patterns of development (or growth) over time in measured variables. As an extension of this model, this study considers finite mixtures of NRCMs, which combine features of NRCMs with the idea of finite mixture (or latent class) models. The strength of this model is that it allows intrinsically nonlinear functions to be used when the data come from a mixture of two or more unobserved subpopulations, permitting the simultaneous investigation of intra-individual (within-person) variability, inter-individual (between-person) variability, and subpopulation heterogeneity. The model's effectiveness under realistic data-analytic conditions was examined with a Monte Carlo simulation study, carried out using an R routine developed specifically for this purpose. The R routine used maximum likelihood with the expectation–maximization algorithm. The design of the study mimicked the output obtained from running a two-class mixture model on task completion data.
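To make the estimation machinery concrete, the sketch below shows one EM cycle for a two-class mixture of exponential growth curves with class-specific fixed effects only; the random coefficients of a full NRCM are omitted for brevity, and the curve, names, and starting values are illustrative rather than the R routine used in the study.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def growth(t, theta):
    """Illustrative intrinsically nonlinear curve: asymptote * (1 - exp(-rate * t))."""
    asymptote, rate = theta
    return asymptote * (1.0 - np.exp(-rate * t))

def em_step(Y, t, pis, thetas, sigmas):
    """One EM cycle for a K-class mixture of nonlinear curves; Y is persons x timepoints."""
    K = len(pis)
    # E-step: per-person log-likelihood under each class, then responsibilities.
    logliks = np.column_stack(
        [norm.logpdf(Y, growth(t, thetas[k]), sigmas[k]).sum(axis=1) + np.log(pis[k])
         for k in range(K)])
    logliks -= logliks.max(axis=1, keepdims=True)
    resp = np.exp(logliks)
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: class proportions, then weighted nonlinear least squares per class.
    new_pis, new_thetas, new_sigmas = resp.mean(axis=0), [], []
    for k in range(K):
        w = resp[:, k]
        sse = lambda th: np.sum(w[:, None] * (Y - growth(t, th)) ** 2)
        fit = minimize(sse, thetas[k], method="Nelder-Mead")
        new_thetas.append(fit.x)
        new_sigmas.append(np.sqrt(sse(fit.x) / (w.sum() * Y.shape[1])))
    return new_pis, new_thetas, new_sigmas

# Iterating em_step until the parameters stabilise gives the ML solution for this
# simplified mixture, e.g. starting from pis = [0.5, 0.5] and rough curve guesses.
```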
This research concerns a mediation model in which the mediator model is linear and the outcome model is also linear but includes a treatment–mediator interaction term and a residual correlated with the residual of the mediator model. Assuming the treatment is randomly assigned, parameters in this mediation model are shown to be partially identifiable. Under the normality assumption on the residuals of the mediator and the outcome, explicit full-information maximum likelihood estimates (FIMLE) of model parameters are introduced given the correlation between the residual for the mediator and the residual for the outcome. A consistent variance matrix of these estimates is derived. Currently, the coefficients of this mediation model are estimated using the iterative feasible generalized least squares (IFGLS) method, which was originally developed for seemingly unrelated regressions (SURs). We argue that this mediation model is not a system of SURs. While the IFGLS estimates are consistent, their variance matrix is not. Theoretical comparisons of the FIMLE variance matrix and the IFGLS variance matrix are conducted. Our results are demonstrated by simulation studies and an empirical study. The FIMLE method has been implemented in the freely available R package iMediate.
An observer is to make inference statements about a quantity p, called a propensity and bounded between 0 and 1, based on the observation that p does or does not exceed a constant c. The propensity p may have an interpretation as a proportion, as a long-run relative frequency, or as a personal probability held by some subject. Applications in medicine, engineering, political science, and, most especially, human decision making are indicated. Bayes solutions for the observer are obtained based on prior distributions in the mixture-of-beta-distributions family; these are then specialized to power-function prior distributions. Inference about log p and the log odds is considered. Multiple-action problems are considered in which the focus of inference shifts to the process generating the propensities p, both when a process parameter π is known to the subject and when it is unknown. Empirical Bayes techniques are developed for observer inference about c when π is known to the subject. A Bayes rule, a minimax rule, and a beta-minimax rule are constructed for the subject when he is uncertain about π.
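The core computation for a single beta prior component is simple enough to state directly: observing only whether p exceeds c truncates the prior, so posterior moments follow from regularized incomplete beta functions. A small sketch, assuming a Beta(a, b) prior:

```python
from scipy.special import betainc   # regularised incomplete beta I_x(a, b)

def truncated_beta_mean(a, b, c, exceeded=True):
    """Posterior mean of p ~ Beta(a, b) after observing whether p exceeds c.
    Uses E[p | p > c] = (a/(a+b)) * (1 - I_c(a+1, b)) / (1 - I_c(a, b)) and the
    analogous expression for the event {p <= c}."""
    prior_mean = a / (a + b)
    if exceeded:
        return prior_mean * (1.0 - betainc(a + 1, b, c)) / (1.0 - betainc(a, b, c))
    return prior_mean * betainc(a + 1, b, c) / betainc(a, b, c)

# Example: uniform prior (a = b = 1), observe p > 0.6 -> posterior mean 0.8.
print(truncated_beta_mean(1.0, 1.0, 0.6, exceeded=True))
```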
This chapter introduces communication and information theoretical aspects of molecular communication, relating molecular communication to existing techniques and results in communication systems. Communication models are discussed, as well as detection and estimation problems. The information theory of molecular communication is introduced, and calculation of the Shannon capacity is discussed.
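As a generic illustration of the capacity calculation (for an arbitrary discrete memoryless channel, not a specific molecular channel model from the chapter), the sketch below runs the Blahut–Arimoto iteration.

```python
import numpy as np

def blahut_arimoto(W, tol=1e-10, max_iter=10_000):
    """Capacity (in bits) of a discrete memoryless channel with transition
    matrix W[x, y] = P(y | x), via the Blahut-Arimoto iteration."""
    r = np.full(W.shape[0], 1.0 / W.shape[0])          # input distribution
    for _ in range(max_iter):
        p_y = r @ W                                    # output marginal
        with np.errstate(divide="ignore", invalid="ignore"):
            d = np.where(W > 0, W * np.log2(W / p_y), 0.0).sum(axis=1)
        c = 2.0 ** d                                   # c(x) = 2^{D(W(.|x) || p_y)}
        lower, upper = np.log2(r @ c), np.log2(c.max())
        r = r * c / (r @ c)                            # Blahut-Arimoto update
        if upper - lower < tol:                        # capacity is bracketed
            break
    return lower, r

# Example: a binary symmetric channel with crossover probability 0.1
# has capacity 1 - H2(0.1), roughly 0.531 bits per use.
capacity, r_opt = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))
```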
Research on advice taking has demonstrated a phenomenon of egocentric discounting: people weight their own estimates more than advice from others. However, this research is mostly conducted in highly controlled lab settings with low or no stakes. We used unique data from a game show on Norwegian television to investigate advice taking in a high stakes and highly public setting. Parallel to the standard procedure in judge–advisor systems studies, contestants give numerical estimates for several tasks and solicit advice (another estimate) from three different sources during the game. The average weight of advice was 0.58, indicating that contestants weighted advice more than their own estimates. Of potential predictors of weight of advice, we did not detect associations with the use of intuition (e.g., gut feeling, guessing) and advice source (family, celebrities, average of viewers from hometown), but own estimation success (the proportion of previous rounds won) was associated with less weight of advice. Solicitation of advice was associated with higher stakes. Together with the relatively high weight on advice, this suggests that participants considered the advice valuable. On average, estimates did not improve much after advice taking, and the potential for improvement by averaging estimates and advice was negligible. We discuss different factors that could contribute to these findings, including stakes, solicited versus unsolicited advice, task difficulty, and high public scrutiny. The results suggest that highly controlled lab studies may not give an accurate representation of advice taking in high stakes and highly public settings.
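For reference, the weight of advice reported here is the standard judge–advisor measure; a minimal sketch follows (the study's exact handling of edge cases, such as advice equal to the initial estimate, may differ).

```python
def weight_of_advice(initial, advice, final):
    """Standard judge-advisor WOA: how far the final estimate moved from the
    initial estimate toward the advice (0 = advice ignored, 1 = advice adopted)."""
    if advice == initial:                 # undefined when advice equals the estimate
        return float("nan")
    return (final - initial) / (advice - initial)

# Example: moving from 100 to 129 after advice of 150 gives WOA = 0.58.
print(weight_of_advice(100.0, 150.0, 129.0))
```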
This chapter elaborates on the calibration and validation procedures for the model. First, we describe our calibration strategy in which a customised optimisation algorithm makes use of a multi-objective function, preventing the loss of indicator-specific error information. Second, we externally validate our model by replicating two well-known statistical patterns: (1) the skewed distribution of budgetary changes and (2) the negative relationship between development and corruption. Third, we internally validate the model by showing that public servants who receive more positive spillovers tend to be less efficient. Fourth, we analyse the statistical behaviour of the model through different tests: validity of synthetic counterfactuals, parameter recovery, overfitting, and time equivalence. Finally, we make a brief reference to the literature on estimating SDG networks.
The polymer model provides a relatively simple and robust basis for estimating the standard Gibbs free energies of formation (ΔGfo) and standard enthalpies of formation (ΔHfo) of clay minerals and other aluminosilicates with an accuracy that is comparable to or better than can be obtained using alternative techniques. The model developed in the present study for zeolites entailed the selection of internally consistent standard thermodynamic properties for model components, calibration of adjustable model parameters using a linear regression technique constrained by ΔGfo and ΔHfo values retrieved from calorimetric, solubility, and phase-equilibrium experiments, and assessments of model accuracy based on comparisons of predicted values with experimental counterparts not included in the calibration dataset. The ΔGfo and ΔHfo predictions were found to average within ±0.2% and ±0.3%, respectively, of experimental values at 298.15 K and 1 bar. The latter result is comparable to the good accuracy that has been obtained by others using a more rigorous electronegativity-based model for ΔHfo that accounts explicitly for differences in zeolite structure based on differences in framework density and unit-cell volume. This observation is consistent with recent calorimetric studies indicating that enthalpies of transition from quartz to various pure-silica zeolite frameworks (zeosils) are small and only weakly dependent on framework type, and suggests that the effects on ΔHfo of differences in framework topology can be ignored for estimation purposes without incurring a significant loss of accuracy. The relative simplicity of the polymer model, together with its applicability to both zeolites and clay minerals, is based on a common set of experimentally determined and internally consistent thermodynamic properties for model components. These attributes are particularly well suited for studies of the effects of water-rock-barrier interactions on the long-term safety of geologic repositories for high-level nuclear waste (HLW).
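Schematically, the calibration step amounts to expressing each mineral's ΔGfo (or ΔHfo) as a stoichiometry-weighted sum of component contributions and solving for those contributions by linear least squares. A minimal sketch with placeholder inputs (the actual component set and calibration data are those of the study and are not reproduced here):

```python
import numpy as np

def calibrate_polymer_model(stoich, dG_obs):
    """Fit component contributions g so that dG_f is approximately stoich @ g.

    stoich : (n_minerals, n_components) moles of each model component
             (e.g., oxide/hydroxide polyhedra) per formula unit.
    dG_obs : (n_minerals,) experimental standard Gibbs free energies (kJ/mol).
    Returns the fitted contributions and the residuals of the fit.
    """
    g, *_ = np.linalg.lstsq(stoich, dG_obs, rcond=None)
    residuals = dG_obs - stoich @ g
    return g, residuals

def predict_dG(stoich_new, g):
    """Predict dG_f for minerals not used in the calibration."""
    return stoich_new @ g
```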
For this book, we assume you’ve had an introductory statistics or experimental design class already! This chapter is a mini refresher of some critical concepts we’ll be using and lets you check that you understand them correctly. The topics include predictor and response variables, the common probability distributions that biologists encounter in their data, and the common techniques, particularly ordinary least squares (OLS) and maximum likelihood (ML), for fitting models to data and estimating effects, including their uncertainty. You should be familiar with confidence intervals and understand what hypothesis tests and P-values do and don’t mean. You should recognize that we use data to make decisions, but these decisions can be wrong, so you need to understand the risk of missing important effects and the risk of falsely claiming an effect. Decisions about what constitutes an “important” effect are central.
In this chapter we introduce and apply hidden Markov models to model and analyze dynamical data. Hidden Markov models are among the simplest dynamical models for systems evolving in a discrete state-space at discrete time points. We first describe the evaluation of the likelihood relevant to hidden Markov models and introduce the concept of filtering. We then describe how to obtain maximum likelihood estimators using expectation maximization. We then broaden our discussion to the Bayesian paradigm and introduce the Bayesian hidden Markov model. In this context, we describe the forward filtering backward sampling algorithm and Monte Carlo methods for sampling from hidden Markov model posteriors. As hidden Markov models are flexible modeling tools, we present a number of variants, including the sticky hidden Markov model, the factorial hidden Markov model, and the infinite hidden Markov model. Finally, we conclude with a case study in fluorescence spectroscopy in which we show how the basic filtering theory presented earlier may be extended to evaluate the likelihood of a second-order hidden Markov model.
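As a minimal illustration of the likelihood evaluation and filtering step described at the start of the chapter, the sketch below runs the scaled forward algorithm for a discrete-emission hidden Markov model; the transition and emission matrices are illustrative.

```python
import numpy as np

def forward_filter(pi0, A, B, obs):
    """Scaled forward algorithm for a discrete HMM.

    pi0 : (K,) initial state probabilities
    A   : (K, K) transition matrix, A[i, j] = P(z_t = j | z_{t-1} = i)
    B   : (K, M) emission matrix,  B[k, m] = P(x_t = m | z_t = k)
    obs : sequence of observed symbols in {0, ..., M-1}
    Returns the filtering distributions P(z_t | x_{1:t}) and the log-likelihood.
    """
    K, T = len(pi0), len(obs)
    filt = np.zeros((T, K))
    loglik = 0.0
    alpha = pi0 * B[:, obs[0]]
    for t in range(T):
        if t > 0:
            alpha = (filt[t - 1] @ A) * B[:, obs[t]]   # predict, then correct
        norm = alpha.sum()
        loglik += np.log(norm)
        filt[t] = alpha / norm
    return filt, loglik

# Example: two hidden states, three symbols, a short observation sequence.
pi0 = np.array([0.6, 0.4])
A = np.array([[0.9, 0.1], [0.2, 0.8]])
B = np.array([[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]])
filt, loglik = forward_filter(pi0, A, B, [0, 0, 2, 1])
```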
A relatively novel approach to autonomous navigation that employs platform dynamics as the primary process model raises new implementation challenges. These are related to: (i) potential numerical instabilities during longer flights; (ii) the quality of model self-calibration and its applicability to different flights; (iii) the establishment of a global estimation methodology when handling different initialisation flight phases; and (iv) the possibility of reducing computational load through model simplification. We propose a unified strategy for handling different flight phases with a combination of factorisation and a partial Schmidt–Kalman approach. We then investigate the stability of the in-air initialisation and the suitability of reusing pre-calibrated model parameters with their correlations. Without GNSS updates, we suggest setting a subset of the state vector as ‘considered’ states within the filter to remove their estimation from the remaining observations. We support all propositions with new empirical evidence: first in model-parameter self-calibration via optimal smoothing, and second by applying our methods to three test flights with dissimilar durations and geometries. Our experiments demonstrate a significant improvement in autonomous navigation quality for twelve different scenarios.
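For readers unfamiliar with 'considered' states, the sketch below shows the textbook measurement update of a Schmidt–Kalman filter, in which one partition of the state receives zero gain but its uncertainty still enters the innovation covariance and the cross-covariances. This is a generic form under the assumption of a zero-mean considered block, not the authors' factorised implementation.

```python
import numpy as np

def schmidt_kalman_update(x, Pxx, Pxp, Ppp, z, Hx, Hp, R):
    """Measurement update with estimated states x and 'considered' states p.

    The considered block keeps its prior mean and covariance Ppp, but its
    uncertainty (via Hp, Pxp, Ppp) still inflates the innovation covariance
    and shapes the gain applied to the estimated block.
    """
    # Innovation and its covariance, accounting for the considered block.
    nu = z - Hx @ x                                   # considered mean assumed zero
    S = (Hx @ Pxx @ Hx.T + Hx @ Pxp @ Hp.T
         + Hp @ Pxp.T @ Hx.T + Hp @ Ppp @ Hp.T + R)
    # Gain for the estimated block only; the considered block gets zero gain.
    K = (Pxx @ Hx.T + Pxp @ Hp.T) @ np.linalg.inv(S)
    x_new = x + K @ nu
    Pxx_new = Pxx - K @ (Hx @ Pxx + Hp @ Pxp.T)
    Pxp_new = Pxp - K @ (Hx @ Pxp + Hp @ Ppp)
    return x_new, Pxx_new, Pxp_new, Ppp               # Ppp is left untouched
```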
This chapter discusses the key elements involved in building a study. Planning empirical studies presupposes a decision about whether the major goal of the study is confirmatory (i.e., testing hypotheses) or exploratory in nature (i.e., developing hypotheses or estimating effects). Focusing on confirmatory studies, we discuss problems involved in obtaining an appropriate sample, controlling internal and external validity when designing the study, and selecting statistical hypotheses that mirror the substantive hypotheses of interest. Building a study additionally involves decisions about the statistical test strategy to be employed, the sample size this strategy requires for the study to be informative, and the most efficient way to achieve this so that study costs are minimized without compromising the validity of inferences. Finally, we point to the many advantages of preregistering a study before data collection begins.
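As one concrete instance of the sample-size step, the sketch below gives the usual normal-approximation calculation for a two-sided, two-sample comparison of means; the chapter's treatment of test strategies is more general.

```python
from math import ceil
from scipy.stats import norm

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample test of a
    standardised mean difference (Cohen's d), via the normal approximation
    n ~ 2 * ((z_{1-alpha/2} + z_{power}) / d)**2."""
    z_alpha = norm.ppf(1.0 - alpha / 2.0)
    z_beta = norm.ppf(power)
    return ceil(2.0 * ((z_alpha + z_beta) / effect_size) ** 2)

# Example: detecting a medium effect (d = 0.5) with 80% power needs about
# 63 participants per group under this approximation.
print(n_per_group(0.5))
```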
Cognitive diagnosis models originated in the field of educational measurement as a psychometric tool to provide finer-grained information more suitable for formative assessment. Typically, but not necessarily, these models classify examinees as masters or nonmasters on a set of binary attributes. This chapter aims to provide a general overview of the original models and of the extensions and methodological developments made in the last decade. The main topics covered in this chapter include model estimation, Q-matrix specification, model fit evaluation, and procedures for gathering validity and reliability evidence. The chapter ends with a discussion of future trends in the field.
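The DINA model is the canonical example of the master/nonmaster classification described above; a small sketch of its item response function given a Q-matrix follows (parameter values are illustrative).

```python
import numpy as np

def dina_prob(alpha, Q, guess, slip):
    """DINA item response probabilities.

    alpha : (K,) binary attribute profile of an examinee
    Q     : (J, K) Q-matrix, Q[j, k] = 1 if item j requires attribute k
    guess, slip : (J,) guessing and slipping parameters
    P(X_j = 1 | alpha) = (1 - slip_j)^eta_j * guess_j^(1 - eta_j),
    where eta_j = 1 iff the examinee masters every attribute item j requires.
    """
    eta = np.all(alpha >= Q, axis=1)                 # conjunctive condensation rule
    return np.where(eta, 1.0 - slip, guess)

# Example: two attributes, three items; the examinee masters attribute 1 only.
Q = np.array([[1, 0], [0, 1], [1, 1]])
p = dina_prob(np.array([1, 0]), Q, guess=np.full(3, 0.2), slip=np.full(3, 0.1))
# -> [0.9, 0.2, 0.2]
```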
The dynamics and fusion of vesicles during the last steps of exocytosis are not yet well established in cell biology. An open issue is the characterization of the diffusion process at the plasma membrane. Total internal reflection fluorescence microscopy (TIRFM) has been used successfully to analyze the coordination of the proteins involved in this mechanism. It enables the capture of protein dynamics at high frame rates with reasonable signal-to-noise values. Nevertheless, methodological approaches that can analyze and estimate diffusion in small local areas, at the scale of a single diffusing spot within cells, are still lacking. To address this issue, we propose a novel correlation-based method, BayesTICS, for local diffusion estimation. As a starting point, we consider Fick’s second law of diffusion, which relates the diffusive flux to the gradient of the concentration. We then derive an explicit parametric model that is fitted to time-correlation signals computed from regions of interest (ROIs) containing individual spots. Our modeling and Bayesian estimation framework is well suited to representing isolated diffusion events and is robust to noise, ROI size, and the localization of spots within ROIs. The performance of BayesTICS is shown on both synthetic and real TIRFM images depicting Transferrin Receptor proteins.
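For orientation, the sketch below fits the generic temporal-correlation model for free two-dimensional diffusion, G(tau) = G0 / (1 + tau/tau_d) with D = w^2 / (4 * tau_d) for a Gaussian observation area of radius w, by least squares. The parametric model derived from Fick's second law in the paper and its Bayesian estimation (BayesTICS) are more elaborate; this is only the standard TICS starting point.

```python
import numpy as np
from scipy.optimize import curve_fit

def tics_model(tau, G0, tau_d, offset):
    """Temporal autocorrelation of free 2D diffusion under a Gaussian focal spot."""
    return G0 / (1.0 + tau / tau_d) + offset

def estimate_D(tau, G, psf_radius_um):
    """Fit the correlation curve computed from an ROI and convert the diffusion
    time tau_d into a diffusion coefficient D = w^2 / (4 * tau_d)."""
    p0 = (G[0], tau[len(tau) // 2], 0.0)               # crude starting values
    (G0, tau_d, offset), _ = curve_fit(tics_model, tau, G, p0=p0)
    return psf_radius_um ** 2 / (4.0 * tau_d)
```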
A strategy activated in one task may be transferred to subsequent tasks and prevent activation of other strategies that would otherwise come to mind, a mechanism referred to as procedural priming. In a novel application of procedural priming, we show that it can make or break cognitive illusions. Our test case is the 1/k illusion, which is based on the same unwarranted mathematical shortcut as the MPG illusion and the time-saving bias. The task is to estimate distances between values of fractions of the form 1/k. Most people given this task intuitively base their estimates on the distances between the denominators (i.e., the reciprocals of the fractions), which may yield very poor estimates of the true distances between the fractions. As expected, the tendency to fall for this illusion is related to cognitive style (Study 1). In order to apply procedural priming, we constructed versions of the task in which the illusion is weak, in the sense that most people no longer fall for it. We then gave participants both “strong illusion” and “weak illusion” versions of the task (Studies 2 and 3). Participants who first did the task in the weak illusion version would often persist with the correct strategy even in the strong illusion version, thus breaking the otherwise strong illusion in the latter task. Conversely, participants who took the strong illusion version first would then often fall for the illusion even in the weak illusion version, thus strengthening the otherwise weak illusion in the latter task.
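A two-line numeric illustration of why the denominator shortcut fails: in both pairs below the denominators differ by 2, yet the true distances differ by a factor of more than a thousand.

```python
# Denominators differ by 2 in both pairs, but the true distances are very different.
print(abs(1/2 - 1/4))      # 0.25
print(abs(1/100 - 1/102))  # ~0.000196
```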
Redundant or excessive information can sometimes lead people to lean on it unnecessarily. Certain experimental designs can sometimes bias results in the researcher’s favor. And, sometimes, interesting effects are too small to be studied, practically, or are simply zero. We believe a confluence of these factors led to a recent paper (Isaac & Brough, 2014, JCR). This initial paper proposed a new means by which probability judgments can be led astray: the category size bias, by which an individual event coming from a large category is judged more likely to occur than an event coming from a small one. Our work shows that this effect may be due to instructional and mechanical confounds, rather than interesting psychology. We present eleven studies with over ten times the sample size of the original in support of our conclusion: We replicate three of the five original studies and reduce or eliminate the effect by resolving these methodological issues, even significantly reversing the bias in one case (Study 6). Studies 7–8c suggest the remaining two studies are false positives. We conclude with a discussion of the subtleties of instruction wording, the difficulties of correcting the record, and the importance of replication and open science.
In this perspective, I give my answer to the question of how quantum computing will impact on data-intensive applications in engineering and science. I focus on quantum Monte Carlo integration as a likely source of (relatively) near-term quantum advantage, but also discuss some other ideas that have garnered widespread interest.