When will's wont wants wanting

Published online by Cambridge University Press: 26 April 2021

Peter Dayan*
Affiliation:
Max Planck Institute for Biological Cybernetics & University of Tuebingen, 72076 Tuebingen, Germany. dayan@tue.mpg.de

Abstract

We use neural reinforcement learning concepts including Pavlovian versus instrumental control, liking versus wanting, model-based versus model-free control, online versus offline learning and planning, and internal versus external actions and control to reflect on putative conflicts between short-term temptations and long-term goals.

Type
Open Peer Commentary
Creative Commons
The target article and response article are works of the U.S. Government and are not subject to copyright protection in the United States.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

