Hostname: page-component-78c5997874-j824f Total loading time: 0 Render date: 2024-11-13T05:06:19.576Z Has data issue: false hasContentIssue false

Asymptotic variance for random walk Metropolis chains in high dimensions: logarithmic growth via the Poisson equation

Published online by Cambridge University Press:  15 November 2019

Aleksandar Mijatović*
Affiliation:
University of Warwick and The Alan Turing Institute
Jure Vogrinc*
Affiliation:
University of Warwick
*
* Postal address: Department of Statistics, University of Warwick, Coventry CV4 7AL, UK.
* Postal address: Department of Statistics, University of Warwick, Coventry CV4 7AL, UK.

Abstract

There are two ways of speeding up Markov chain Monte Carlo algorithms: (a) construct more complex samplers that use gradient and higher-order information about the target and (b) design a control variate to reduce the asymptotic variance. While the efficiency of (a) as a function of dimension has been studied extensively, this paper provides the first results linking the efficiency of (b) with dimension. Specifically, we construct a control variate for a d-dimensional random walk Metropolis chain with an independent, identically distributed target using the solution of the Poisson equation for the scaling limit in [30]. We prove that the asymptotic variance of the corresponding estimator is bounded above by a multiple of $\log(d)/d$ over the spectral gap of the chain. The proof hinges on large deviations theory, optimal Young’s inequality and Berry–Esseen-type bounds. Extensions of the result to non-product targets are discussed.

Type
Original Article
Copyright
© Applied Probability Trust 2019 

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

Assaraf, R. and Caffarel, M. (1999). Zero-variance principle for Monte Carlo algorithms. Phys. Rev. Lett. 83, 46824685.CrossRefGoogle Scholar
Barthe, F. (1998). Optimal Young’s inequality and its converse: a simple proof. Geom. Funct. Anal. 8, 234242.CrossRefGoogle Scholar
Bédard, M. (2007). Weak convergence of Metropolis algorithms for non-I.I.D. target distributions. Ann. Appl. Prob. 17, 12221244.CrossRefGoogle Scholar
Bédard, M. and Rosenthal, J. (2008). Optimal scaling of Metropolis algorithms: heading toward general target distributions. Canad. J. Statist. 36, 483503.CrossRefGoogle Scholar
Beskos, A., Pillai, N., Roberts, G., Sanz-Serna, J.-M. and Stuart, A. (2013). Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli 19, 15011534.CrossRefGoogle Scholar
Beskos, A., Roberts, G. and Stuart, A. (2009). Optimal scalings for local Metropolis–Hastings chains on nonproduct targets in high dimensions. Ann. Appl. Prob. 19, 863898.CrossRefGoogle Scholar
Bierkens, J. and Roberts, G. (2017). A piecewise deterministic scaling limit of lifted Metropolis–Hastings in the Curie–Weiss model. Ann. Appl. Prob. 27, 846882.CrossRefGoogle Scholar
Brooks, S., Gelman, A., Jones, G. and Meng, X.-L., eds (2011). Handbook of Markov Chain Monte Carlo (Handbooks of Modern Statistical Methods). Chapman & Hall/CRC, Boca Raton, FL.Google Scholar
Dellaportas, P. and Kontoyiannis, I. (2012). Control variates for estimation based on reversible Markov chain Monte Carlo samplers. J. R. Statist. Soc. B 74, 133161.CrossRefGoogle Scholar
Duane, S., Kennedy, A., Pendleton, B. and Roweth, D. (1987). Hybrid Monte Carlo. Phys. Lett. B 195, 216222.CrossRefGoogle Scholar
Duncan, A., Lelièvre, T. and Pavliotis, G. (2016). Variance reduction using nonreversible Langevin samplers. J. Statist. Phys. 163, 457.CrossRefGoogle ScholarPubMed
Durmus, A., Roberts, G. O., Vilmart, G. and Zygalakis, K. C. (2017). Fast Langevin based algorithm for MCMC in high dimensions. Ann. Appl. Prob. 27(4), 21952237.CrossRefGoogle Scholar
Eichelsbacher, P. and Löwe, M. (2003). Moderate deviations for i.i.d. random variables. ESAIM Prob. Statist. 7, 209218.CrossRefGoogle Scholar
Geyer, C. (1992). Practical Markov chain Monte Carlo. Statist. Sci. 473483.CrossRefGoogle Scholar
Henderson, S. (1997). Variance reduction via an approximating Markov process. PhD thesis, Department of Operations Research, Stanford University.Google Scholar
Jarner, S. and Hansen, E. (2000). Geometric ergodicity of Metropolis algorithms. Stoch. Process. Appl. 85, 341361.CrossRefGoogle Scholar
Jourdain, B., Lelièvre, T. and Miasojedow, B. (2015). Optimal scaling for the transient phase of the random walk Metropolis algorithm: the mean-field limit. Ann. Appl. Prob. 25, 22632300.CrossRefGoogle Scholar
Kipnis, C. and Varadhan, S. R. S. (1986). Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusions. Commun. Math. Phys. 104, 119.CrossRefGoogle Scholar
Kolassa, J. (2006). Series Approximation Methods in Statistics (Lecture Notes Statist. 88). Springer Science & Business Media.Google Scholar
Mattingly, J., Pillai, N. and Stuart, A. (2012). Diffusion limits of the Random walk Metropolis algorithm in high dimensions. Ann. Appl. Prob. 22, 881930.CrossRefGoogle Scholar
Meyn, S. (2008). Control Techniques for Complex Networks. Cambridge University Press, Cambridge.Google Scholar
Meyn, S. and Tweedie, R. (2009). Markov Chains and Stochastic Stability, 2nd edn. Cambridge University Press, Cambridge.CrossRefGoogle Scholar
Mijatović, A. and Vogrinc, J. (2018). On the Poisson equation for Metropolis–Hastings chains. Bernoulli 24(3), 24012428.CrossRefGoogle Scholar
Oates, C., Girolami, M. and Chopin, N. (2017). Control functionals for Monte Carlo integration. J. R. Statist. Soc. B 79, 695718.CrossRefGoogle Scholar
Papamarkou, T., Mira, A. and Girolami, M. (2014). Zero variance differential geometric Markov chain Monte Carlo algorithms. Bayesian Anal. 9, 97128.CrossRefGoogle Scholar
Pillai, N., Stuart, A. and Thiéry, A. (2012). Optimal scalings and diffusion limits for the Langevin algorithm in high dimensions. Ann. Appl. Prob. 22, 23202356.CrossRefGoogle Scholar
Roberts, G. and Rosenthal, J. (1997). Geometric ergodicity and hybrid Markov chains. Electron. Commun. Prob. 2 (2), 1325.CrossRefGoogle Scholar
Roberts, G. and Rosenthal, J. (1998). Optimal scaling of discrete approximations to Langevin diffusions. J. R. Statist. Soc. B 60, 255268.CrossRefGoogle Scholar
Roberts, G. and Rosenthal, J. (2001). Optimal scaling for various Metropolis–Hastings algorithms. Statist. Sci. 16, 351367.CrossRefGoogle Scholar
Roberts, G., Gelman, A. and Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann. Appl. Prob. 7, 110120.Google Scholar
Tierney, L. (1994). Markov chains for exploring posterior distributions. Ann. Statist. 22, 17011762.CrossRefGoogle Scholar
Williams, D. (1991). Probability with Martingales. Cambridge University Press.CrossRefGoogle Scholar