We explore the effects of fiscal policy shocks on aggregate output and inflation. We use the Bayesian econometric methodology of Baumeister and Hamilton, applied to a fiscal structural vector autoregressive model, to evaluate key elasticities and fiscal multipliers using U.S. data. In our baseline specification, which ends before the Covid-19 pandemic, the government spending multiplier is approximately $0.57$ and the tax multiplier is approximately $-0.35$ after one year. The short-term output elasticity of government spending is statistically insignificant, and the output elasticity of taxes is approximately $2.26$.
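For orientation, the display below sketches the generic form of a fiscal structural VAR and one common multiplier convention; the three-variable ordering $(g_t, \tau_t, x_t)$ and the cumulative-multiplier definition are illustrative assumptions rather than the paper's exact specification.
\[
A\, y_t = c + \sum_{i=1}^{p} B_i\, y_{t-i} + u_t, \qquad y_t = (g_t,\ \tau_t,\ x_t)',
\]
where the off-diagonal elements of $A$ contain the contemporaneous output elasticities of government spending and taxes, and the $h$-period spending multiplier is the cumulative output response to a spending shock rescaled by the cumulative spending response,
\[
\mathcal{M}_g(h) \;=\; \frac{\sum_{j=0}^{h} \partial x_{t+j} / \partial u_{g,t}}{\sum_{j=0}^{h} \partial g_{t+j} / \partial u_{g,t}}.
\]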
This study presents a framework that combines Bayesian inference with reinforcement learning to guide drone-based sampling for methane source estimation. Synthetic gas concentration and wind observations are generated using a calibrated model derived from real-world drone measurements, providing a more representative testbed that captures atmospheric boundary layer variability. We compare three path planning strategies—preplanned, myopic (short-sighted), and non-myopic (long-term)—and find that non-myopic policies trained via deep reinforcement learning consistently yield more precise and accurate estimates of both source location and emission rate. We further investigate centralized multi-agent collaboration and observe comparable performance to independent agents in the tested single-source scenario. Our results suggest that effective source term estimation depends on correctly identifying the plume and obtaining low-noise concentration measurements within it. Precise localization further requires sampling in close proximity to the source, including slightly upwind. In more complex environments with multiple emission sources, multi-agent systems may offer advantages by enabling individual drones to specialize in tracking distinct plumes. These findings support the development of intelligent, data-driven sampling strategies for drone-based environmental monitoring, with potential applications in climate monitoring, emission inventories, and regulatory compliance.
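The Bayesian core of such a pipeline can be summarized compactly; the Gaussian measurement noise and the plume forward model $m(\cdot)$ below are generic assumptions used only for illustration.
\[
p(\theta \mid c_{1:t}) \;\propto\; p(\theta) \prod_{k=1}^{t} \mathcal{N}\!\big(c_k \,\big|\, m(x_k;\theta),\ \sigma_k^2\big), \qquad \theta = (\text{source location},\ \text{emission rate}),
\]
where $c_k$ is the concentration measured at drone position $x_k$; the reinforcement-learning policy then chooses the next sampling location $x_{k+1}$ so that this posterior becomes as informative as possible.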
Approaches to linguistic areas have largely focused either on purely qualitative investigation of area-formation processes, on quantitative and qualitative exploration of synchronic distributions of linguistic features without considering time, or on theoretical issues related to the definition of the notion ‘linguistic area’. What is still missing are approaches that supplement qualitative research on area-formation processes with quantitative methods. Taking a bottom-up approach, we bypass notional issues and propose to quantify area-formation processes by (i) measuring the change in linguistic similarity given a geographical space, a sociocultural setting, a time span, a language sample, and a set of linguistic data, and (ii) testing the tendency and magnitude of the process using Bayesian inference. Applying this approach to the expression of reflexivity in a dense sample of languages in northwestern Europe from the early Middle Ages to the present, we show that the method yields robust quantitative evidence for a substantial gain in linguistic similarity that sets the languages of Britain and Ireland apart from languages spoken outside of Britain and Ireland and cross-cuts lines of linguistic ancestry.
Accurate assessment of adverse event (AE) incidence is critical in clinical research for drug safety. While meta-analysis serves as an essential tool to comprehensively synthesize the evidence across multiple studies, incomplete AE reporting in clinical trials remains a persistent challenge. In particular, AEs occurring below study-specific reporting thresholds are often omitted from publications, leading to left-censored data. Failure to account for these censored AE counts can result in biased AE incidence estimates. We present an R Shiny application that implements a Bayesian meta-analysis model specifically designed to incorporate censored AE data into the estimation process. This interactive tool provides a user-friendly interface for researchers to conduct AE meta-analyses and estimate the AE incidence probability using an unbiased approach. It also enables direct comparisons between models that either incorporate or ignore censoring, highlighting the biases introduced by conventional approaches. This tutorial demonstrates the Shiny application’s functionality through an illustrative example on meta-analysis of PD-1/PD-L1 inhibitor safety and highlights the importance of this tool in improving AE risk assessment. Ultimately, the new Shiny app facilitates more accurate and transparent drug safety evaluations. The Shiny-MAGEC app is available at: https://zihanzhou98.shinyapps.io/Shiny-MAGEC/.
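To make the censoring mechanism concrete, here is a minimal sketch (not the Shiny-MAGEC model itself) of how a left-censored adverse-event count can enter a likelihood, assuming a simple binomial model for a single study arm; names and numbers are illustrative.

import numpy as np
from scipy.stats import binom

def study_loglik(p, n, count=None, threshold=None):
    # p: AE incidence probability, n: number of patients in the arm.
    # count: observed AE count, or None if the count fell below the reporting cutoff.
    # threshold: reporting cutoff c, so an unreported count satisfies count < c.
    if count is not None:
        return binom.logpmf(count, n, p)       # fully reported count
    return binom.logcdf(threshold - 1, n, p)   # left-censored: P(X <= c - 1)

# Dropping the censored study instead of using the second term biases p upward.
print(study_loglik(0.05, 200, count=12))
print(study_loglik(0.05, 150, threshold=8))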
Analysis of experimental scalar data is tackled here. Starting from the basic analysis of large numbers of well-behaved data, which eventually display Gaussian distributions, we move on to Bayesian inference and face cases of few (or no) data, sometimes badly behaved. We first present methods to analyze data whose ideal distribution is known, and then we show methods to make predictions even when our ignorance about the data distribution is total. Finally, various resampling methods are provided to deal with time-correlated measurements, biased estimators, anomalous data, and under- or over-estimation of statistical errors.
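As one concrete instance of the resampling methods mentioned for time-correlated measurements, the following is a minimal moving-block bootstrap sketch; the block length and the toy series are arbitrary choices, and the text's own recipes may differ.

import numpy as np

def block_bootstrap_mean(x, block_len=20, n_boot=2000, seed=0):
    # Resample contiguous blocks (with replacement) to preserve short-range
    # time correlations, then recompute the mean on each resampled series.
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = len(x)
    starts = np.arange(n - block_len + 1)
    n_blocks = int(np.ceil(n / block_len))
    means = np.empty(n_boot)
    for b in range(n_boot):
        chosen = rng.choice(starts, size=n_blocks, replace=True)
        means[b] = np.concatenate([x[s:s + block_len] for s in chosen])[:n].mean()
    return means.mean(), means.std(ddof=1)   # bootstrap estimate and its error

rng = np.random.default_rng(1)
x = np.sin(np.arange(500) / 30.0) * 0.1 + rng.normal(size=500)  # toy series with a slow trend
print(block_bootstrap_mean(x))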
Based on the long-running Probability Theory course at the Sapienza University of Rome, this book offers a fresh and in-depth approach to probability and statistics, while remaining intuitive and accessible in style. The fundamentals of probability theory are elegantly presented, supported by numerous examples and illustrations, and modern applications are later introduced, giving readers an appreciation of current research topics. The text covers distribution functions, statistical inference and data analysis, and more advanced methods including Markov chains and Poisson processes, widely used in dynamical systems and data science research. The concluding section, 'Entropy, Probability and Statistical Mechanics', unites key concepts from the text with the authors' impressive research experience to provide a clear illustration of these powerful statistical tools in action. Ideal for students and researchers in the quantitative sciences, this book provides an authoritative account of probability theory, written by leading researchers in the field.
Recent experiments aiming to measure phenomena predicted by strong-field quantum electrodynamics (SFQED) have done so by colliding relativistic electron beams with high-power lasers. In such experiments, measurements of collision parameters are not always feasible. However, precise knowledge of these parameters is required to accurately test SFQED.
Here, we present a novel Bayesian inference procedure that recovers collision parameters that could not be measured on-shot. This procedure is applicable to all-optical non-linear Compton scattering experiments investigating radiation reaction. The framework allows multiple diagnostics to be combined self-consistently and facilitates the inclusion of known information pertaining to the collision parameters. Using this Bayesian analysis, the relative validity of the classical, quantum-continuous, and quantum-stochastic models of radiation reaction was compared for several test cases, demonstrating the accuracy and model-selection capability of the framework and highlighting its robustness when the experimental values of fixed parameters differ from their values in the models.
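Schematically, comparing the classical, quantum-continuous, and quantum-stochastic models in a Bayesian framework comes down to their evidences; the notation below is a generic statement of Bayesian model comparison rather than the paper's exact implementation.
\[
P(M_i \mid D) \;\propto\; P(D \mid M_i)\, P(M_i),
\qquad
P(D \mid M_i) = \int p(D \mid \theta, M_i)\, p(\theta \mid M_i)\, d\theta,
\]
so the relative validity of two radiation-reaction models is summarized by the evidence ratio (Bayes factor) $P(D \mid M_1)/P(D \mid M_2)$, with the unmeasured collision parameters marginalized inside each evidence integral.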
This paper introduces a method for pricing insurance policies using market data. The approach is designed for scenarios in which an insurance company seeks to enter a new market, in our case pet insurance, without historical data. The methodology involves an iterative two-step process. First, a suitable parameter is proposed to characterize the underlying risk. Second, the resulting pure premium is linked to the observed commercial premium using an isotonic regression model. To validate the method, comprehensive testing is conducted on synthetic data, followed by its application to a dataset of actual pet insurance rates. To facilitate practical implementation, we have developed an R package called IsoPriceR. By addressing the challenge of pricing insurance policies in the absence of historical data, this method helps enhance pricing strategies in emerging markets.
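A minimal Python sketch of the second step, fitting a monotone map from pure premium to observed commercial premium; it uses scikit-learn's IsotonicRegression rather than the authors' IsoPriceR package, and the premium values are made up for illustration.

import numpy as np
from sklearn.isotonic import IsotonicRegression

pure = np.array([80.0, 95.0, 110.0, 130.0, 150.0, 175.0, 210.0])          # model-based pure premiums
commercial = np.array([120.0, 118.0, 140.0, 160.0, 155.0, 190.0, 230.0])  # observed market premiums

iso = IsotonicRegression(increasing=True, out_of_bounds="clip")
iso.fit(pure, commercial)                   # monotone link: pure premium -> commercial premium

print(iso.predict([100.0, 160.0, 200.0]))   # market-consistent premiums for new risks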
This paper demonstrates how Bayesian reasoning can be used for an analog of replication analysis with qualitative research that conducts inference to best explanation. We overview the basic mechanics of Bayesian reasoning with qualitative evidence and apply our approach to recent research on climate change politics, a matter of major importance that is beginning to attract greater interest in the discipline. Our re-analysis of illustrative evidence from a prominent article on global collective-action versus distributive politics theories of climate policy largely accords with the authors’ conclusions, while illuminating the value added of Bayesian analysis. In contrast, our in-depth examination of scholarship on oil majors’ support for carbon pricing yields a Bayesian inference that diverges from the authors’ conclusions. These examples highlight the potential for Bayesian reasoning not only to improve inferences when working with qualitative evidence but also to enhance analytical transparency, facilitate communication of findings, and promote knowledge accumulation.
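The engine behind this kind of qualitative Bayesian reasoning is the odds form of Bayes' rule, stated here for completeness:
\[
\frac{P(H_1 \mid E)}{P(H_2 \mid E)} \;=\; \frac{P(H_1)}{P(H_2)} \times \frac{P(E \mid H_1)}{P(E \mid H_2)},
\]
i.e., the posterior odds between rival explanations equal the prior odds times the likelihood ratio, which asks how much more expected the evidence $E$ is under one hypothesis than under the other.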
Bayesian inference is one way in which prediction can be formalized. It combines an estimate of the prior probability that an event will take place with an assessment of the likelihood of new data to give an updated estimate of the posterior probability of the event. Important concepts in Bayesian inference are rational analysis, the notions of optimal inference or an ideal observer, and the idea that processing can be corrupted in a noisy channel.
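In symbols, the update described here is Bayes' theorem:
\[
P(\text{event} \mid \text{data}) \;=\; \frac{P(\text{data} \mid \text{event})\, P(\text{event})}{P(\text{data})},
\]
with the prior $P(\text{event})$ revised by the likelihood $P(\text{data} \mid \text{event})$ to yield the posterior.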
The ice shelves buttressing the Antarctic ice sheet determine the rate of ice discharge into the surrounding oceans. Their geometry and buttressing strength are influenced by the local surface accumulation and basal melt rates, governed by atmospheric and oceanic conditions. Contemporary methods quantify one of these rates, but typically not both. Moreover, information about these rates is only available for recent time periods, reaching back at most a few decades, the period over which measurements are available. We present a new method to simultaneously infer the surface accumulation and basal melt rates averaged over decadal and centennial timescales. We infer the spatial dependence of these rates along flow-line transects from radar-observed internal stratigraphy, using a kinematic forward model of that stratigraphy. We solve the inverse problem using simulation-based inference (SBI). SBI performs Bayesian inference by training neural networks on simulations of the forward model to approximate the posterior distribution, thereby also quantifying uncertainties over the inferred parameters. We validate our method on a synthetic example and apply it to Ekström Ice Shelf, Antarctica, for which independent validation data are available. We obtain posterior distributions of surface accumulation and basal melt rates averaged over up to 200 years before 2022.
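To illustrate the simulation-based logic without the neural density estimator, here is a rejection-ABC stand-in (a deliberately simpler swapped-in technique): draw parameters from the prior, run a forward model, and keep the draws whose simulated output is close to the observation. The one-dimensional toy forward model, prior ranges, and tolerance below are illustrative assumptions, not the kinematic stratigraphy model used in the paper.

import numpy as np

rng = np.random.default_rng(0)

def forward(theta):
    # Toy stand-in for a forward model mapping (accumulation, melt) to an observable.
    acc, melt = theta
    return acc - 0.5 * melt + rng.normal(scale=0.05)

x_obs = 0.3                                                     # "observed" summary statistic
prior = rng.uniform([0.0, 0.0], [1.0, 1.0], size=(50_000, 2))   # accumulation, melt rate draws

sims = np.array([forward(t) for t in prior])
accepted = prior[np.abs(sims - x_obs) < 0.02]   # keep draws that reproduce the observation

# Approximate posterior mean and spread over (accumulation, melt).
print(accepted.mean(axis=0), accepted.std(axis=0))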
Dynamic latent variable models generally link units’ positions on a latent dimension over time via random walks. Theoretically, these trajectories are often expected to resemble a mixture of periods of stability interrupted by moments of change. In these cases, a prior distribution such as the regularized horseshoe—that allows for both stasis and change—can prove a better theoretical and empirical fit for the underlying construct than other priors. Replicating Reuning, Kenwick, and Fariss (2019), we find that the regularized horseshoe performs better than the standard normal and the Student’s t-distribution when modeling dynamic latent variable models. Overall, the use of the regularized horseshoe results in more accurate and precise estimates. More broadly, the regularized horseshoe is a promising prior for many similar applications.
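For reference, a sketch of the regularized horseshoe as it is typically placed on the period-to-period innovations of a dynamic latent trait, following Piironen and Vehtari's parameterization; the exact placement in the replicated model may differ in detail.
\[
\theta_{i,t} = \theta_{i,t-1} + \delta_{i,t}, \qquad
\delta_{i,t} \mid \lambda_{i,t}, \tau, c \;\sim\; \mathcal{N}\!\big(0,\ \tau^2 \tilde{\lambda}_{i,t}^2\big), \qquad
\tilde{\lambda}_{i,t}^2 = \frac{c^2 \lambda_{i,t}^2}{c^2 + \tau^2 \lambda_{i,t}^2}, \qquad
\lambda_{i,t} \sim \mathrm{C}^{+}(0,1),
\]
so most innovations are shrunk toward zero (stasis) while occasional large $\lambda_{i,t}$ let through genuine moments of change, with $c$ regularizing the largest jumps.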
Data assimilation is a core component of numerical weather prediction systems. The large quantity of data processed during assimilation requires the computation to be distributed across increasingly many compute nodes; yet, existing approaches suffer from synchronization overhead in this setting. In this article, we exploit the formulation of data assimilation as a Bayesian inference problem and apply a message-passing algorithm to solve the spatial inference problem. Since message passing is inherently based on local computations, this approach lends itself to parallel and distributed computation. In combination with a GPU-accelerated implementation, we can scale the algorithm to very large grid sizes while retaining good accuracy and compute and memory requirements.
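Concretely, once the assimilation problem is written as a posterior over grid variables with local (pairwise) couplings, marginals can be computed by the standard message-passing recursion; the notation below is the generic belief-propagation form, not the article's specific algorithm.
\[
p(x \mid y) \;\propto\; \prod_i \phi_i(x_i) \prod_{(i,j)} \psi_{ij}(x_i, x_j),
\qquad
m_{i \to j}(x_j) \;=\; \int \psi_{ij}(x_i, x_j)\, \phi_i(x_i) \!\!\prod_{k \in \mathcal{N}(i)\setminus j}\!\! m_{k \to i}(x_i)\, dx_i,
\]
where $\phi_i$ absorbs the local prior and observation terms; because each message uses only a node's neighbours, the updates parallelize naturally across compute nodes.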
We present a hierarchical Bayes approach to modeling parameter heterogeneity in generalized linear models. The model assumes that there are relevant subpopulations and that within each subpopulation the individual-level regression coefficients have a multivariate normal distribution. However, class membership is not known a priori, so the heterogeneity in the regression coefficients becomes a finite mixture of normal distributions. This approach combines the flexibility of semiparametric, latent class models that assume common parameters within each subpopulation and the parsimony of random effects models that assume normal distributions for the regression parameters. The number of subpopulations is selected to maximize the posterior probability of the model being true. Simulations are presented that document the performance of the methodology for synthetic data with known heterogeneity and number of subpopulations. An application is presented concerning preferences for various aspects of personal computers.
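In outline, the heterogeneity specification reads as follows; the notation is a generic rendering of the finite mixture described above.
\[
z_i \sim \mathrm{Categorical}(\pi_1, \ldots, \pi_K), \qquad
\beta_i \mid z_i = k \;\sim\; \mathcal{N}(\mu_k, \Sigma_k), \qquad
y_{ij} \mid \beta_i \;\sim\; \mathrm{ExpFam}\!\big(g^{-1}(x_{ij}^{\prime} \beta_i)\big),
\]
so each individual's coefficient vector $\beta_i$ is drawn from the normal component of its unobserved subpopulation $z_i$, and the number of components $K$ is chosen by posterior model probability.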
This paper proposes a novel collapsed Gibbs sampling algorithm that marginalizes model parameters and directly samples latent attribute mastery patterns in diagnostic classification models. This estimation method makes it possible to avoid boundary problems in the estimation of model item parameters by eliminating the need to estimate such parameters. A simulation study showed that the collapsed Gibbs sampling algorithm can accurately recover the true attribute mastery status in various conditions. A second simulation showed that the collapsed Gibbs sampling algorithm was computationally more efficient than another MCMC sampling algorithm implemented in JAGS. In an analysis of real data, the collapsed Gibbs sampling algorithm showed good classification agreement with results from a previous study.
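Schematically, collapsing means each respondent's attribute pattern is drawn from a conditional in which the item parameters have been integrated out against their prior; the display below is a generic statement of that step, not the paper's exact conditional.
\[
p(\boldsymbol{\alpha}_i \mid \boldsymbol{\alpha}_{-i}, \mathbf{Y}) \;\propto\;
p(\boldsymbol{\alpha}_i \mid \boldsymbol{\alpha}_{-i}) \int p(\mathbf{Y} \mid \boldsymbol{\alpha}_i, \boldsymbol{\alpha}_{-i}, \boldsymbol{\theta})\, p(\boldsymbol{\theta})\, d\boldsymbol{\theta},
\]
with conjugate priors making the integral available in closed form, so item parameters never need to be sampled and boundary estimates cannot occur.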
Cognitive diagnostic models (CDMs) are discrete latent variable models popular in educational and psychological measurement. In this work, motivated by the advantages of deep generative modeling and by identifiability considerations, we propose a new family of DeepCDMs, to hunt for deep discrete diagnostic information. The new class of models enjoys nice properties of identifiability, parsimony, and interpretability. Mathematically, DeepCDMs are entirely identifiable, even in fully exploratory settings, allowing one to uniquely identify the parameters and discrete loading structures (the “$\textbf{Q}$-matrices”) at all different depths in the generative model. Statistically, DeepCDMs are parsimonious, because they can use a relatively small number of parameters to expressively model data thanks to the depth. Practically, DeepCDMs are interpretable, because the shrinking-ladder-shaped deep architecture can capture cognitive concepts and provide multi-granularity skill diagnoses from coarse- to fine-grained and from high-level to detailed. For identifiability, we establish transparent identifiability conditions for various DeepCDMs. Our conditions impose intuitive constraints on the structures of the multiple $\textbf{Q}$-matrices and inspire a generative graph with increasingly smaller latent layers when going deeper. For estimation and computation, we focus on the confirmatory setting with known $\textbf{Q}$-matrices and develop Bayesian formulations and efficient Gibbs sampling algorithms. Simulation studies and an application to the TIMSS 2019 math assessment data demonstrate the usefulness of the proposed methodology.
Traditional mediation analysis assumes that a study population is homogeneous and the mediation effect is constant over time, which may not hold in some applications. Motivated by smoking cessation data, we propose a latent class dynamic mediation model that explicitly accounts for the fact that the study population may consist of different subgroups and the mediation effect may vary over time. We use a proportional odds model to accommodate the subject heterogeneities and identify latent subgroups. Conditional on the subgroups, we employ a Bayesian hierarchical nonparametric time-varying coefficient model to capture the time-varying mediation process, while allowing each subgroup to have its individual dynamic mediation process. A simulation study shows that the proposed method has good performance in estimating the mediation effect. We illustrate the proposed methodology by applying it to analyze smoking cessation data.
Brain activation and connectivity analyses in task-based functional magnetic resonance imaging (fMRI) experiments with multiple subjects are currently at the forefront of data-driven neuroscience. In such experiments, interest often lies in understanding activation of brain voxels due to external stimuli and strong association or connectivity between the measurements on a set of pre-specified groups of brain voxels, also known as regions of interest (ROI). This article proposes a joint Bayesian additive mixed modeling framework that simultaneously assesses brain activation and connectivity patterns from multiple subjects. In particular, fMRI measurements from each individual, obtained in the form of a multi-dimensional array/tensor at each time point, are regressed on functions of the stimuli. We impose a low-rank parallel factorization decomposition on the tensor regression coefficients corresponding to the stimuli to achieve parsimony. Multiway stick-breaking shrinkage priors are employed to infer activation patterns and associated uncertainties in each voxel. Further, the model introduces region-specific random effects which are jointly modeled with a Bayesian Gaussian graphical prior to account for the connectivity among pairs of ROIs. Empirical investigations under various simulation studies demonstrate the effectiveness of the method as a tool to simultaneously assess brain activation and connectivity. The method is then applied to a multi-subject fMRI dataset from a balloon-analog risk-taking experiment, showing the effectiveness of the model in providing interpretable joint inference on voxel-level activations and inter-regional connectivity associated with how the brain processes risk. The proposed method is also validated through simulation studies and comparisons to other methods used within the neuroscience community.
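The low-rank structure imposed on the tensor coefficients can be written compactly; the three-way case and rank $R$ below are for illustration.
\[
\mathbf{B} \;=\; \sum_{r=1}^{R} \beta_r^{(1)} \circ \beta_r^{(2)} \circ \beta_r^{(3)},
\qquad
B_{j_1 j_2 j_3} \;=\; \sum_{r=1}^{R} \beta^{(1)}_{r j_1}\, \beta^{(2)}_{r j_2}\, \beta^{(3)}_{r j_3},
\]
so the number of free parameters grows additively rather than multiplicatively in the tensor dimensions, and the shrinkage priors act on the margins $\beta_r^{(d)}$.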
The Gibbs sampler can be used to obtain samples of arbitrary size from the posterior distribution over the parameters of a structural equation model (SEM) given covariance data and a prior distribution over the parameters. Point estimates, standard deviations and interval estimates for the parameters can be computed from these samples. If the prior distribution over the parameters is uninformative, the posterior is proportional to the likelihood, and asymptotically the inferences based on the Gibbs sample are the same as those based on the maximum likelihood solution, for example, output from LISREL or EQS. In small samples, however, the likelihood surface is not Gaussian and in some cases contains local maxima. Nevertheless, the Gibbs sample comes from the correct posterior distribution over the parameters regardless of the sample size and the shape of the likelihood surface. With an informative prior distribution over the parameters, the posterior can be used to make inferences about the parameters of underidentified models, as we illustrate on a simple errors-in-variables model.
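For completeness, one sweep of the Gibbs sampler cycles through the SEM parameters, drawing each from its full conditional given the data and the current values of the others:
\[
\theta_j^{(s+1)} \;\sim\; p\!\left(\theta_j \,\middle|\, \theta_1^{(s+1)}, \ldots, \theta_{j-1}^{(s+1)}, \theta_{j+1}^{(s)}, \ldots, \theta_p^{(s)},\ \text{data}\right), \qquad j = 1, \ldots, p,
\]
and the collected draws $\theta^{(1)}, \theta^{(2)}, \ldots$ form the posterior sample from which point, interval, and standard-deviation estimates are computed.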
Two marginal one-parameter item response theory models are introduced by integrating out either the latent variable or the random item parameter. It is shown that both marginal response models are multivariate (probit) models with a compound symmetry covariance structure. Several common hypotheses concerning the underlying covariance structure are evaluated using (fractional) Bayes factor tests. The support for a unidimensional factor (i.e., the assumption of local independence) and differential item functioning are evaluated by testing the covariance components. The posterior distribution of common covariance components is obtained in closed form by transforming latent responses with an orthogonal (Helmert) matrix. This posterior distribution is defined as a shifted inverse-gamma, thereby introducing a default prior and a balanced prior distribution. Building on this, an MCMC algorithm is described to estimate all model parameters and to compute (fractional) Bayes factor tests. Simulation studies are used to show that the (fractional) Bayes factor tests have good properties for testing the underlying covariance structure of binary response data. The method is illustrated with two real data studies.
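As a sketch of the marginal structure described above, integrating a normal person (or item) effect out of the latent probit responses leaves an equicorrelated covariance; the notation is illustrative.
\[
Z_{pi} = \mu_i + \theta_p + \varepsilon_{pi}, \quad \theta_p \sim \mathcal{N}(0, \sigma_\theta^2), \quad \varepsilon_{pi} \sim \mathcal{N}(0, 1)
\;\;\Longrightarrow\;\;
\boldsymbol{\Sigma} = \sigma_\theta^2 \mathbf{J} + \mathbf{I},
\]
so every pair of items shares the common covariance component $\sigma_\theta^2$ (compound symmetry), and hypotheses about unidimensionality and differential item functioning become tests on such covariance components.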