The ingredients for proxy failure are a target and an agent that optimizes for an approximation (proxy) of the target. Because the proxy is not the actual target, the behavior of the agent can become misaligned with the target (John et al.). In fact, Sohl-Dickstein (2022) points out that if proxy optimization is too efficient, it reliably becomes not only ineffective but also actively harmful. Here, we argue, from molecules to societies, that the harm of proxy failure is minimized by a diverse and dynamic population of proxies; and that periodic separation between agents forces them to both individualize and work together, leading to new solutions.
John et al. give the example of decision-making algorithms in the brain as proxies for evolutionary fitness. These proxies fail with, for example, abused drugs or excessive consumption of food. In our view, diversity in decision-making systems is a central defense against this kind of proxy failure. The hypothalamus contains a set of segregated circuits, each implementing a distinct “hard-wired” behavioral policy aimed toward one homeostatic or reproductive goal, such as feeding, drinking, or mating (Saper & Lowell, 2014; Schulkin & Sterling, 2019; Sewards & Sewards, 2003). In service of basic drives, corticostriatal circuitry also learns a more general and flexible set of goals (Balleine, Delgado, & Hikosaka, 2007; Cardinal, Parkinson, Hall, & Everitt, 2002; Frank & Claus, 2006; Saunders & Robinson, 2012). Of course, the behaviors prescribed by different goals often conflict, and the striatum can be viewed as a “parliament” dynamically arbitrating between goals (Cui et al., 2013; Da Silva, Tecuapetla, Paixão, & Costa, 2018; Graybiel & Grafton, 2015; Klaus et al., 2017; Mohebi et al., 2019). Humans in particular adopt a dizzying diversity of goals (O'Reilly, Hazy, Mollick, Mackie, & Herd, 2014; Schank & Abelson, 1977) and also synthesize new goals when existing ones are frustrated.
Each goal represents a different proxy for evolutionary fitness, and they better approximate fitness when they are in balance than when an individual goal is excessively optimized. Pathological states occur when the system gets stuck on a single goal, such as in addiction or rumination.
Diversity of beliefs protects against proxy failure in the same way as diversity of goals. Every human holds many distinct beliefs. The beliefs are “separate,” in that they are not required to be consistent with one another (Wood, Douglas, & Sutton, 2012), and when one is active, others are largely inaccessible (Hills, Todd, Lazer, Redish, & Couzin, 2015). Each belief (or perspective, or metaphor) is only a partial description of the world – a proxy for a broader truth. This proxy diversity serves us well. An individual with multiple perspectives on a problem is less likely to get stuck in a particular approach (De Bono, 1970; Duncker, 1945; Ohlsson, 1992), and a deep understanding of a topic means having many different perspectives available (Feyerabend, 1975; Lakoff & Johnson, 1980; Saffo, 2008; Wittgenstein, 1953). Conversely, if we attach to and optimize for a single perspective, our thinking is rigid and shallow: Optimizing too strongly for that single proxy leads to divergence from the broader truth. In the brain, a network centered on the hippocampus appears to support diversity and dynamism. This network separates knowledge modularly into distinct entities and narratives (McClelland, McNaughton, & O'Reilly, 1995; Yassa & Stark, 2011). Vitally, after they are separated, the entities are then also flexibly composed together in many different ways, synthesizing new knowledge and perspectives (Buckner, 2010; Kazanina & Poeppel, 2023; Kurth-Nelson et al., 2023; O'Reilly, Ranganath, & Russin, 2022).
Just as the brain holds diverse motivations and beliefs in balance, multiagent systems such as human societies contain diverse and competing forces, which can be seen as proxies for collective welfare. There is a rich tradition of studying the conditions under which this diversity of objectives is conducive to broader success (Ostrom, Gardner, & Walker, 1994). Empirically, excess communication reduces diversity and worsens performance in human groups (Lorenz, Rauhut, Schweitzer, & Helbing, 2011; Page, 2017). However, if individuals are allowed to spend time first working on a problem in isolation and then combine solutions, the group performs better (Bernstein, Shore, & Lazer, 2018). This example follows the general pattern that entities must first separate to diversify and gain individual stability. Then, interaction creates higher-order structures, leading to hierarchies and open-ended evolution.
Diversity plays a similar role in groups of artificial agents. Imagine an evolving population of game-playing agents, where the fitness of each individual is determined by playing paper-rock-scissors against each other. If the population loses diversity and collapses on a single strategy, such as “always play rock,” then a mutation that produces the strategy “always play paper” will dominate the population. These waves of dominant strategies can go in circles through the optimization landscape, never improving overall. However, if the population is diverse, agents are forced to discover truly new solutions, an effect also documented in much more complex games (Crepinšek, Liu, & Mernik, 2013; Czarnecki et al., 2020; Leibo, Hughes, Lanctot, & Graepel, 2019; Vinyals et al., 2019).
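The collapse-and-cycle dynamic described above can be illustrated with a minimal sketch. It is not a model taken from the cited studies: the discrete replicator update, the `strength` parameter, the starting shares, and all function names are assumptions chosen only for illustration. Each strategy's share of the population grows or shrinks in proportion to its average payoff against the current population.

```python
STRATEGIES = ("rock", "paper", "scissors")


def payoff(i, j):
    """Payoff to strategy i against strategy j: +1 win, -1 loss, 0 tie."""
    if (i - j) % 3 == 1:  # paper beats rock, scissors beats paper, rock beats scissors
        return 1
    if (j - i) % 3 == 1:
        return -1
    return 0


def step(shares, strength=0.5):
    """One discrete replicator update (an assumed, illustrative update rule)."""
    # Average payoff of each strategy against the current population mix.
    fitness = [sum(x * payoff(i, j) for j, x in enumerate(shares)) for i in range(3)]
    # Shares grow or shrink in proportion to fitness, then are renormalized.
    grown = [x * (1 + strength * f) for x, f in zip(shares, fitness)]
    total = sum(grown)
    return [g / total for g in grown]


def dominant(shares):
    """Name of the strategy with the largest population share."""
    return STRATEGIES[max(range(3), key=lambda i: shares[i])]


def dominance_history(shares, steps, strength=0.5):
    """Sequence of dominant strategies, recorded each time the leader changes."""
    history = [dominant(shares)]
    for _ in range(steps):
        shares = step(shares, strength)
        name = dominant(shares)
        if name != history[-1]:
            history.append(name)
    return history


if __name__ == "__main__":
    # A population collapsed on "rock" with rare "paper" and "scissors" mutants.
    print(dominance_history([0.98, 0.01, 0.01], steps=300))
    # An evenly mixed population is a fixed point of this update.
    print(step([1 / 3, 1 / 3, 1 / 3]))
```

Starting from a population that is almost entirely “rock,” the rare “paper” mutant takes over, is displaced in turn by “scissors,” and the population then returns to “rock”: the leader changes but no strategy improves against the game as a whole. The sketch covers only this cycling; it does not model the discovery of new strategies in a diverse population.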
As a final example, sexual reproduction is remarkably common (Judson & Normark, 1996; Speijer, Lukeš, & Eliáš, 2015), despite the cost of producing males and the challenge of finding mates in the vast world (Lehtonen, Jennions, & Kokko, 2012; Maynard Smith, 1978). What advantages does sex offer? A traditional view is that recombination generates diversity by exploring new combinations of genes. A fascinating extension of this theory is that recombination also forces the genes to be modular, democratizing the genome (Agren, Haig, & McCoy, 2022; Livnat, Papadimitriou, Dushoff, & Feldman, 2008; Melo, Porto, Cheverud, & Marroig, 2016; Srivastava, Hinton, Krizhevsky, Sutskever, & Salakhutdinov, 2014; Veller, 2022). A gene can't depend on the presence of another particular gene because it might disappear in the next shuffling. Instead, each gene is incentivized to function productively with any new genome it finds itself in – yielding a genetic foundation ripe for synthesis of new solutions. Although each gene is selfish and is only an imperfect proxy for the welfare of the organism, a diverse and dynamic set of genes protects against proxy failure.
In conclusion, connectedness must be balanced with periods of separation to maintain diversity and protect against proxy failures. We should be cautious about moving toward continual interconnectedness and premature exchange of information. Similarly, with rapid advances in artificial intelligence (AI), we should be cautious about concentrating intelligence in one place. Diverse AI systems should exist with different objectives and modes of operation. Troublingly, proxy failure may explain the Fermi paradox – the puzzle that we don't see other intelligent life in the universe. Through Earth's history, evolutionary experiments have had opportunities to develop separately. Archaea and prokaryotic mitochondrial ancestors specialized separately for hundreds of millions of years before achieving the distinct forms that enabled fruitful endosymbiosis, fueling the explosion of multicellular complexity (Lane & Martin, 2010; Margulis, 1970; Roger, Muñoz-Gómez, & Kamikawa, 2017). However, the trend with increased intelligence is toward immediate exchange of information between entities across the planet, reducing proxy diversity, with risk of catastrophic failure (Diamond, 2005).
Acknowledgments
We thank Zora Wessely for her comments on an earlier version of the manuscript.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
Z. K.-N. and J. Z. L. are employed by Google DeepMind. S. S. and M. G.-M. have no competing interest to declare.