Statistical constraints on climate model parameters using a scalable cloud-based inference framework – CORRIGENDUM

James Carzon; Bruno Abreu; Leighton Regayre; Kenneth Carslaw; Lucia Deaconu; Philip Stier; Hamish Gordon; Mikael Kuusela

doi:10.1017/eds.2025.6

Published online by Cambridge University Press: 05 March 2025

James Carzon*: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
Bruno Abreu: National Center for Supercomputing Applications, University of Illinois Urbana-Champaign, Urbana-Champaign, IL, USA; now at Pittsburgh Supercomputing Center, Pittsburgh, PA, USA
Leighton Regayre: Institute for Climate and Atmospheric Science, School of Earth and Environment, University of Leeds, Leeds, UK; Met Office Hadley Centre, Exeter, UK
Kenneth Carslaw: Institute for Climate and Atmospheric Science, School of Earth and Environment, University of Leeds, Leeds, UK
Lucia Deaconu: Atmospheric, Oceanic and Planetary Physics Department, University of Oxford, Oxford, UK; Faculty of Environmental Science and Engineering, Babes-Bolyai University, Cluj, Romania
Philip Stier: Atmospheric, Oceanic and Planetary Physics Department, University of Oxford, Oxford, UK
Hamish Gordon: Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA, USA; Center for Atmospheric Particle Studies, Carnegie Mellon University, Pittsburgh, PA, USA
Mikael Kuusela: Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA
*Corresponding author: James Carzon; Email: jcarzon@andrew.cmu.edu

Type: Corrigendum
Information: Environmental Data Science, Volume 4, 2025, e14

DOI: https://doi.org/10.1017/eds.2025.6
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press

Our original study relied on a workflow codebase in which a bug was identified during follow-up work. This bug interfered with the geospatial matching of the emulated and observed aerosol optical depth measurements and led to errors in some of the numerical results of our paper. Once we fixed the bug, the distribution of the standardized residuals between the emulated and observed measurements, computed according to the model in Eq. (2) of the original manuscript, was markedly non-Gaussian after outlier removal. Consequently, the model was found to be misspecified to the extent that all parameter configurations were deemed inconsistent with the observations by the test described in Section 2.4, necessitating an update to the model. Our updated model replaces the Gaussian noise term $ {\unicode{x025B}}_{\mathrm{other},x} $ after outlier removal with a more flexible noise model that no longer requires the outlier removal step. The implausibility test statistic remains the same, although we now use a model-based bootstrap method to obtain its null distribution because the noise model is non-Gaussian.

Writing down the model equation once again,

$$ {\displaystyle \begin{array}{c}z(x)=\zeta (x)+{\unicode{x025B}}_{\mathrm{meas},x}\\ {}={\hat{\eta}}_x\left({u}^{\ast}\right)+{\unicode{x025B}}_{\mathrm{emu},x}\left({u}^{\ast}\right)+{\unicode{x025B}}_{\mathrm{meas},x}+{\unicode{x025B}}_{\mathrm{other},x},\\ {}\hskip-4.3em {\hat{\eta}}_x\left({u}^{\ast}\right):= \unicode{x1D53C}\left[{\tilde{\eta}}_x\left({u}^{\ast}\right)|{D}_{\mathrm{train}}\right],\end{array}} $$

we relax our previous assumption that the term $ {\unicode{x025B}}_{\mathrm{other},x} $ is Gaussian. (As in the original manuscript, $ {\unicode{x025B}}_{\mathrm{emu},x}\left({u}^{\ast}\right)\sim N\left(0,{\sigma}_{\mathrm{emu},x}^2\left({u}^{\ast}\right)\right) $ and $ {\unicode{x025B}}_{\mathrm{meas},x}\sim N\left(0,{\sigma}_{\mathrm{meas},x}^2\right) $ remain well-founded assumptions, by construction and by previous literature, respectively.) Instead, we now write

$$ {\displaystyle \begin{array}{l}z(x)={\hat{\eta}}_x\left({u}^{\ast}\right)+{\sigma}_x\left({u}^{\ast },\delta \right){d}_x,\\ {}\hskip-3em {\sigma}_x\left({u}^{\ast },\delta \right)=\sqrt{\sigma_{\mathrm{emu},x}^2\left({u}^{\ast}\right)+{\sigma}_{\mathrm{meas},x}^2+{\delta}^2},\end{array}} $$

where $ \delta >0 $ is an unknown parameter and $ {d}_x\overset{\mathrm{i}.\mathrm{i}.\mathrm{d}.}{\sim }p\left(0,1\right) $ for some unknown distribution $ p $ with mean $ 0 $ and variance $ 1 $.

Based on our updated data analysis, it appears that $ p $ is symmetric, motivating us to adopt, as an intermediate step, a Gaussian ansatz $ \tilde{p}=N\left(0,1\right) $ toward calibrating a more flexible yet still tractable implausibility test. With this ansatz, we jointly estimate

$$ {\hat{\left(u,\delta \right)}}_{\mathrm{MLE}}=\underset{u\in \mathcal{U},\delta >0}{\mathrm{argmax}}\tilde{\mathrm{\mathcal{L}}}\left\{u,\delta; z\left({x}_1\right),z\left({x}_2\right),\dots, z\left({x}_M\right)\right\}. \tag{1} $$

Denote the $ u $- and $ \delta $-components of $ {\hat{\left(u,\delta \right)}}_{\mathrm{MLE}} $ by $ {\hat{u}}_{\mathrm{MLE}} $ and $ {\hat{\delta}}_{\mathrm{MLE}} $, respectively. Here $ \tilde{\mathrm{\mathcal{L}}} $ denotes the ansatz-implied form of the likelihood, which is known in closed form and is efficiently optimized with the L-BFGS-B algorithm. Note that $ \tilde{\mathrm{\mathcal{L}}} $ is exactly the assumed likelihood from the original manuscript. With the maximum likelihood estimates as plug-ins, we use model-based bootstrap sampling to obtain the approximate empirical noise distribution

$$ \hat{p}=\mathrm{Unif}\left\{\frac{z(x)-{\hat{\eta}}_x\left({\hat{u}}_{\mathrm{MLE}}\right)}{\sigma_x\left({\hat{\left(u,\delta \right)}}_{\mathrm{MLE}}\right)}:x\in \mathcal{M}\right\}. \tag{2} $$

Using an intermediate fit to obtain an empirical estimate of the noise distribution is a standard strategy in bootstrapping regression models; see, for example, Davison and Hinkley (1997, Chapter 6). The empirical distribution $ \hat{p} $ enables us to account for heavy tails present in the observations; if we simply used the ansatz $ \tilde{p} $, we would get unrealistically tight constraints due to model misspecification.
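The two-step fit described above (joint Gaussian-ansatz MLE, then standardized residuals as the empirical noise distribution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the published codebase: the scalar parameter `u`, the toy linear emulator `eta_hat`, the constant `sigma_emu` and `sigma_meas`, and the synthetic heavy-tailed data. The negative log-likelihood assumes independent Gaussian noise across locations.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical toy setup: M locations and a scalar parameter u in [0, 1].
M = 200
a = rng.normal(size=M)           # toy emulator intercepts (assumption)
b = rng.normal(size=M)           # toy emulator slopes (assumption)
sigma_emu = np.full(M, 0.1)      # emulator std; held constant in u for simplicity
sigma_meas = np.full(M, 0.2)     # measurement std

def eta_hat(u):
    """Toy stand-in for the emulator mean at every location x."""
    return a + b * u

def sigma(delta):
    """Total standard deviation sigma_x(u, delta); u-independent in this toy."""
    return np.sqrt(sigma_emu**2 + sigma_meas**2 + delta**2)

# Synthetic observations with heavy-tailed, unit-variance noise d_x
# (Student t with 5 degrees of freedom, rescaled by its std sqrt(5/3)).
u_true, delta_true = 0.3, 0.5
d = rng.standard_t(df=5, size=M) / np.sqrt(5.0 / 3.0)
z = eta_hat(u_true) + sigma(delta_true) * d

def neg_log_lik(theta):
    """Negative log-likelihood under the Gaussian ansatz p~ = N(0, 1)."""
    u, delta = theta
    s = sigma(delta)
    r = (z - eta_hat(u)) / s
    return 0.5 * np.sum(np.log(2.0 * np.pi * s**2) + r**2)

# Joint estimate of (u, delta) with box constraints, as in Eq. (1).
res = minimize(neg_log_lik, x0=[0.5, 1.0], method="L-BFGS-B",
               bounds=[(0.0, 1.0), (1e-6, None)])
u_mle, delta_mle = res.x

# Standardized residuals at the plug-in estimates; sampling uniformly from
# this array corresponds to drawing from p-hat in Eq. (2).
residuals = (z - eta_hat(u_mle)) / sigma(delta_mle)
```

In this toy setting the fitted residuals keep the heavy tails of the simulated noise, which is the feature that a bootstrap from $ \hat{p} $ retains and the Gaussian ansatz alone would discard.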

To test $ {H}_0:u={u}_0 $ versus the alternative that $ {H}_1:u\ne {u}_0 $ , we use the same implausibility metric as before as our test statistic,

$$ I\left({u}_0;z\left({x}_1\right),z\left({x}_2\right),\dots, z\left({x}_M\right)\right)=\sqrt{\sum \limits_{x\in \mathcal{M}}{\left(\frac{z(x)-{\hat{\eta}}_x\left({u}_0\right)}{\sigma_x\left({u}_0,{\hat{\delta}}_{\mathrm{MLE}}\right)}\right)}^2}. \tag{3} $$
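As a small worked example of this statistic, the sketch below evaluates it for three locations. The function names and all numbers are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math

def sigma_x(sigma_emu, sigma_meas, delta):
    """Total standard deviation at one location."""
    return math.sqrt(sigma_emu**2 + sigma_meas**2 + delta**2)

def implausibility(z, eta_hat_u0, sigma_u0):
    """Root sum of squared standardized residuals at the tested value u0."""
    return math.sqrt(sum(((zi - ei) / si) ** 2
                         for zi, ei, si in zip(z, eta_hat_u0, sigma_u0)))

# Hypothetical inputs for three locations.
z = [0.5, 0.1, 1.1]                   # observed values z(x)
eta_hat_u0 = [0.2, 0.1, 0.2]          # emulator means at u0
sig = [sigma_x(0.1, 0.2, 0.2)] * 3    # sqrt(0.01 + 0.04 + 0.04) = 0.3

# Standardized residuals are 1, 0 and 3, so I = sqrt(1 + 0 + 9) = sqrt(10).
I = implausibility(z, eta_hat_u0, sig)
```

The value of `I` is then compared with a critical value; $ {u}_0 $ is rejected when the statistic exceeds it.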

Note that this test statistic is approximately pivotal for $ {u}_0 $ under the assumed model. Hence, it suffices to calibrate the implausibility test for a single choice of $ {u}_0 $; in this work we use $ {\hat{u}}_{\mathrm{MLE}} $. The critical value for the implausibility test is computed by Algorithm 1 using the empirical noise distribution $ \hat{p} $ of Eq. (2). The code to reproduce the results from this corrigendum can be found on GitHub: https://github.com/JamesCarzon/smokeppe-constraints (last version: 21 October 2024).
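A brief sketch of why the statistic is approximately pivotal may be helpful. It is an illustrative derivation rather than part of the original analysis, and it assumes that the updated noise model holds with $ {u}^{\ast }={u}_0 $ and that $ \delta $ denotes the true value of the discrepancy parameter. Substituting the model for $ z(x) $ into each standardized residual gives

$$ \frac{z(x)-{\hat{\eta}}_x\left({u}_0\right)}{\sigma_x\left({u}_0,{\hat{\delta}}_{\mathrm{MLE}}\right)}=\frac{\sigma_x\left({u}_0,\delta \right)}{\sigma_x\left({u}_0,{\hat{\delta}}_{\mathrm{MLE}}\right)}\,{d}_x\approx {d}_x,\hskip2em \mathrm{so}\hskip1em I\left({u}_0;z\left({x}_1\right),\dots, z\left({x}_M\right)\right)\approx \sqrt{\sum \limits_{x\in \mathcal{M}}{d}_x^2}. $$

The right-hand side depends only on the i.i.d. draws $ {d}_x\sim p $, so its distribution does not involve $ {u}_0 $. The approximation comes from replacing $ \delta $ with its estimate and becomes exact when $ {\hat{\delta}}_{\mathrm{MLE}}=\delta $.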

Figure 1 (which replaces Figure 3 in the original manuscript) shows the significant univariate parametric constraints obtained with our corrected method. The number of test parameters has been increased from 5,000 to 50,000 to ensure a high-fidelity representation of these constraints. We find that the corrected pipeline provides meaningful univariate constraints on four variables: sea spray emission flux, accumulation mode aerosol dry deposition velocity, BVOC SOA, and the median diameter of primary ultrafine anthropogenic sulfate particles. Our data thus constrain the UKESM1 parameters more strongly than we previously reported. We note that our constraints on high values of sea spray emission flux and low values of primary sulfate diameter are consistent with the corresponding constraints displayed in Regayre et al. (2023); see their Figures S12–13. Our constraint on low values of BVOC SOA appears to be new. Our constraint on high values of dry deposition velocity is in contrast with their relative constraint on lower values. However, they do not entirely rule out any values, and thus there is no strict contradiction with our result. On balance, our four conclusive univariate constraints appear to agree better with Regayre et al. (2023) than those reported in our original manuscript.

Figure 1. Parameter constraints at 95% confidence level. (a–d) One-dimensional projections of the FrequentistConfSet using the revised Implausibility Test described above. The 95th percentile of the approximate null distribution under $ {H}_0 $ is indicated by the horizontal red lines. Each of the four displayed parameters is constrained on its own using only two weeks of data. (e) A two-dimensional projection of the FrequentistConfSet on the BVOC SOA and accumulation mode dry deposition velocity parameters, displayed as described in the manuscript. The dark purple pixels are ruled out as implausible.

Combinations of high accumulation mode dry deposition velocity and low BVOC SOA are jointly constrained (Figure 1e). This is in contrast with the original manuscript’s constraint on low accumulation mode dry deposition velocity and high BVOC SOA. Physically, the updated result can be understood as follows: if only a little aerosol mass is formed from vegetation emissions while the deposition rate is high for relatively large particles, then AOD is underestimated so severely that this scenario can be ruled out as implausible at the 95% confidence level. This holds even after accounting for the uncertainty of the MODIS retrieval estimates due to instrumental error, the imperfection of our surrogate model, and any other sources of model discrepancy that we estimate for our climate model.

Algorithm 1 Estimated critical values for implausibility test

Compute the critical value for the implausibility test at $ u={u}_0 $ at significance level $ \alpha \in \left(0,1\right) $.

Input: Parameter value $ {u}_0 $; significance level $ \alpha \in \left(0,1\right) $; residual distribution $ {p}^{\ast } $, e.g., $ \hat{p} $ in Eq. (2); parameter estimate $ \hat{\delta} $, e.g., Eq. (1); implausibility metric $ I $, e.g., Eq. (3); number of bootstrap replicates $ B $

Output: Estimated critical value of implausibility test at level $ \alpha $

1: Set $ \mathcal{I}\leftarrow \varnothing $

2: for $ b $ in $ \left\{1,2,\dots, B\right\} $ do

3: Set $ {\mathcal{Z}}_b\leftarrow [\hskip0.3em ] $

4: for $ x $ in $ \mathcal{M} $ do

5: Draw $ {d}^{\ast}\sim {p}^{\ast } $

6: Simulate $ {z}^{\ast }(x)\leftarrow {\hat{\eta}}_x\left({u}_0\right)+{\sigma}_x({u}_0,\hskip0.3em \hat{\delta})\cdot {d}^{\ast } $

7: Set $ {\mathcal{Z}}_b\leftarrow $ ADDITEM $ \left({\mathcal{Z}}_b,{z}^{\ast }(x)\right) $

8: end for

9: Compute $ {I}_b\leftarrow I\left({u}_0;{\mathcal{Z}}_b\right) $

10: Set $ \mathcal{I}\leftarrow \mathcal{I}\cup \left\{{I}_b\right\} $

11: end for

12: Compute quantile $ q\leftarrow {Q}_{1-\alpha}\left(\mathcal{I}\right) $

13: return $ q $
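Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation in the linked repository: the function names, the default `B=2000`, the seed handling, the order-statistic definition of the empirical quantile, and the toy inputs are all hypothetical. Drawing $ {d}^{\ast}\sim {p}^{\ast } $ is done by sampling with replacement from a list of standardized residuals, as in Eq. (2).

```python
import math
import random

def implausibility(z, eta_hat_u0, sigma_u0):
    """Root sum of squared standardized residuals, as in Eq. (3)."""
    return math.sqrt(sum(((zi - ei) / si) ** 2
                         for zi, ei, si in zip(z, eta_hat_u0, sigma_u0)))

def critical_value(eta_hat_u0, sigma_u0, residuals, alpha=0.05, B=2000, seed=0):
    """Bootstrap estimate of the level-alpha critical value at u0."""
    rng = random.Random(seed)
    stats = []
    for _ in range(B):
        # Steps 3-8: simulate one replicate data set under H0: u = u0,
        # drawing each d* uniformly from the empirical residuals.
        z_star = [e + s * rng.choice(residuals)
                  for e, s in zip(eta_hat_u0, sigma_u0)]
        # Steps 9-10: store the statistic of the replicate.
        stats.append(implausibility(z_star, eta_hat_u0, sigma_u0))
    # Step 12: empirical (1 - alpha) quantile taken as an order statistic
    # (one of several common quantile conventions).
    stats.sort()
    k = min(B - 1, max(0, math.ceil((1 - alpha) * B) - 1))
    return stats[k]

# Hypothetical toy inputs: four locations and a small residual sample.
eta_hat_u0 = [0.2, 0.1, 0.2, 0.4]
sigma_u0 = [0.3, 0.3, 0.3, 0.3]
residuals = [-2.0, -1.0, -0.5, -0.5, 0.0, 0.0, 0.5, 0.5, 1.0, 2.0]
q = critical_value(eta_hat_u0, sigma_u0, residuals)
```

A tested value $ {u}_0 $ is then ruled out when its observed implausibility exceeds `q`. Because the replicates are generated and scored at the same $ {u}_0 $, the simulated statistic depends only on the resampled residuals, which is consistent with calibrating the test at a single parameter value.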

Acknowledgments

The authors would like to thank Victor Alejandro Sanchez for his assistance in identifying the mistake in the original manuscript and implementing these corrections.

References

Carzon, J, Abreu, B, Regayre, L, et al. (2023) Statistical constraints on climate model parameters using a scalable cloud-based inference framework, Environmental Data Science 2:e24. doi:10.1017/eds.2023.12.

Davison, AC and Hinkley, DV (1997) Bootstrap Methods and Their Application. Cambridge University Press.

Regayre, LA, Deaconu, L, Grosvenor, DP, Sexton, DMH, et al. (2023) Identifying climate model structural inconsistencies allows for tight constraint of aerosol radiative forcing, Atmospheric Chemistry and Physics 23(15), 8749–8768.