Active Inference, Bayesian Optimal Design, and Expected Utility

doi:10.1017/9781009026949.007

Chapter 6 - Active Inference, Bayesian Optimal Design, and Expected Utility

from Part II - How Do Humans Search for Information?

Published online by Cambridge University Press: 19 May 2022

Noor Sajid, Lancelot Da Costa, Thomas Parr and Karl Friston

Edited by Irene Cogliati Dezza, Eric Schulz and Charley M. Wu

Irene Cogliati Dezza: University College London
Eric Schulz: Max-Planck-Institut für biologische Kybernetik, Tübingen
Charley M. Wu: Eberhard-Karls-Universität Tübingen, Germany

Summary

Active inference, a corollary of the free energy principle, is a formal way of describing the behavior of certain kinds of random dynamical systems that have the appearance of sentience. In this chapter, we describe how active inference combines Bayesian decision theory and optimal Bayesian design principles under a single imperative: to minimize expected free energy. It is this aspect of active inference that allows information-seeking behavior to emerge naturally. When prior preferences over outcomes are removed from expected free energy, active inference reduces to optimal Bayesian design (i.e., information gain maximization). Conversely, in the absence of ambiguity and relative risk, active inference reduces to Bayesian decision theory (i.e., expected utility maximization). Using these limiting cases, we illustrate how behavior differs when agents select actions that optimize expected utility, expected information gain, or expected free energy. Our T-maze simulations show that optimizing expected free energy produces goal-directed, information-seeking behavior, whereas optimizing expected utility induces purely exploitative behavior and maximizing information gain engenders intrinsically motivated behavior.
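The three objectives contrasted in the summary can be made concrete with a small numerical sketch. The Python below is not the chapter's T-maze simulation; it is a toy, one-step, discrete model written under stated assumptions. It uses a commonly cited decomposition of expected free energy into negative expected utility (expected log preferences over outcomes) minus expected information gain (the mutual information between hidden states and outcomes). All names (`A`, `qs`, `log_c`, the two policies, and the preference values) are hypothetical and chosen only for illustration.

```python
import math

def predicted_outcomes(A, qs):
    """Q(o) = sum_s P(o|s) Q(s), where A[o][s] = P(o|s)."""
    return [sum(a * q for a, q in zip(row, qs)) for row in A]

def expected_utility(A, qs, log_c):
    """Expected log preference over predicted outcomes."""
    qo = predicted_outcomes(A, qs)
    return sum(p * lc for p, lc in zip(qo, log_c))

def expected_info_gain(A, qs):
    """Mutual information between states and outcomes (0 ln 0 treated as 0)."""
    qo = predicted_outcomes(A, qs)
    gain = 0.0
    for o, row in enumerate(A):
        for s, a in enumerate(row):
            joint = a * qs[s]
            if joint > 0.0:
                gain += joint * math.log(a / qo[o])
    return gain

def expected_free_energy(A, qs, log_c):
    """Assumed decomposition: G = -(expected utility + expected info gain)."""
    return -(expected_utility(A, qs, log_c) + expected_info_gain(A, qs))

# Hypothetical toy problem: the reward context is either "left" or "right".
qs = [0.5, 0.5]  # prior belief over the hidden context
# Outcomes: 0 = cue says left, 1 = cue says right, 2 = small safe reward.
log_c = [math.log(0.3), math.log(0.3), math.log(0.4)]  # assumed preferences

policies = {
    # Visiting the cue reveals the context exactly.
    "cue": [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    # Staying yields the safe reward whatever the context is.
    "stay": [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]],
}

best_by_utility = max(policies, key=lambda p: expected_utility(policies[p], qs, log_c))
best_by_info = max(policies, key=lambda p: expected_info_gain(policies[p], qs))
best_by_efe = min(policies, key=lambda p: expected_free_energy(policies[p], qs, log_c))

for name, A in policies.items():
    print(name,
          round(expected_utility(A, qs, log_c), 3),
          round(expected_info_gain(A, qs), 3),
          round(expected_free_energy(A, qs, log_c), 3))
```

In this toy setting, with flat preferences (`log_c` constant) the utility term is the same for every policy, so ranking by expected free energy matches ranking by information gain. When no policy can resolve uncertainty, the information gain term vanishes and the ranking matches expected utility. The particular preference values were picked so that the utility-only agent and the expected-free-energy agent disagree.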

Keywords

Active inference; Bayesian decision theory; optimal Bayesian design; free energy; information gain

Information

Type: Chapter
Information: The Drive for Knowledge: The Science of Human Information Seeking, pp. 124–146

DOI: https://doi.org/10.1017/9781009026949.007

Publisher: Cambridge University Press

Print publication year: 2022
