R1. Introduction
When we began exploring the idea that the disparate array of phenomena referred to here as proxy failure might reflect a single core mechanism, we were a little intimidated. It seemed obvious that there are deep similarities between descriptions in widely varying fields, and indeed many previous authors have highlighted these similarities (e.g., McCoy & Haig, 2020; Smaldino & McElreath, 2016; Zhuang & Hadfield-Menell, 2021). It also seemed clear that many of the issues discussed were of current scientific interest (e.g., honest signaling) or indeed pressing social importance (e.g., scientific replicability, artificial intelligence [AI]-alignment, economic growth-driven climate breakdown; see Table 1 of the target article).
But, if proxy failure was really, at its core, a single phenomenon, would not someone else have elaborated on this? Indeed, several separate disciplines had developed entire literatures which seemed to specifically address the issue of proxy failure, for instance principal-agent theory in economics, signaling theory in ecology, or AI alignment research. Clearly, countless scholars have thought long and hard about the issue. So perhaps the similarities that were so frequently noted were superficial. Perhaps we were succumbing to an overgeneralization bias (Peters et al., 2022). Or perhaps trying to explain everything would end up explaining nothing. It also did not escape our attention that many of the issues we identified as proxy failure (e.g., honest vs. runaway signaling, revealed vs. normative preferences, teleonomy vs. teleology) remain highly contentious within disciplines. Perhaps it was sound academic intuition which had led previous scholars to avoid an attempt at theoretical unification.
Nevertheless, it occurred to us that barriers to transdisciplinary communication could also have prevented a unified account. Scholars are rightly careful when trying to interpret the concepts of a neighboring discipline. Often, these are built on a body of tacit knowledge and assumptions that is not fully understood by an outsider. Indeed, it took our team, composed of a neuroscientist, an economist, an ecologist, and a neuro/meta-scientist, over three years of almost weekly meetings to build the confidence that we are talking about the same things. Yet the litmus test remained to be conducted. Our account clearly required broad scrutiny from experts in each of the disciplines we drew on. Furthermore, if we had described a general and recurring pattern, then there would almost certainly be numerous examples of proxy failure beyond what we had considered. We are gratified to find that the wide range of commentaries provides both in-depth scrutiny and numerous additional examples. Overall, we have received 20 comments by 35 authors in more than six fields (Table R1). These have highlighted numerous interrelated trade-offs affecting proxy failure (Table R2) and in more than one instance introduced intriguing novel “proxynomic” concepts (Box R1). In the following, we discuss both criticisms raised and additions made, organized around four broad themes:
1. On goals
2. Additional examples
3. Drivers
4. Proposed solutions
Box R1. Novel “proxynomic” concepts introduced by commenters
• Proxy pruning (Higgins et al.): When less legible proxies (or aspects of a proxy) are successively pruned away, leaving only highly legible and communicable proxies which may however be poor approximations of the goal. Example given: The pruning away of construct validity controls in psychological research.
• Proxy complementarity (Watve): When more than one set of agents have an interest in hacking a proxy. Example given: Journals and authors have an aligned incentive to inflate the journal impact factor (JIF) when the latter publish in the former, and even funding agencies may benefit from the illusion of impact.
• Proxy exploitation (Watve, Kapeller et al.): When an established proxy in one system is exploited by an external system. Example given: Private academic publishers exploit the JIF to attain record profit margins of 20–40%, with little apparent contribution to the original goal (furthering science). Disambiguation: This seems closely related to our concept of proxy appropriation, in which we have however posited a nested system in which “a higher level goal” appropriates a lower level proxy.
• Predatory proxy (Watve): When a “proxy devours the goal itself.” This seems well aligned with the concept of “goal displacement” according to which the proxy eliminates the old goal and becomes a new goal in its own right (Merton, 1940; Wouters, 2020).
R2. On goals: Nested hierarchies from nature to society?
Several commenters raised potential problems which revolve around the identification or attribution of goals or “teleonomy.” For instance, some argued that particular examples do not show proxy failure when a different entity within the system is designated as the agent or the regulator; or that if we only understood the goal correctly, we would see that there is no proxy failure at all (e.g., Nonacs). Other commenters argued that we relied too much (Pernu) or too little (Bartlett) on “anthropomorphic” notions of goal-directed behavior. Further comments discussed the difficult issue of defining human goals (Ainslie and Sacco), and in particular collective social goals (Fernández-Dols). In our view, these comments serve to reiterate one of our key points: Proxy failure (or success) is always, and inevitably, defined with respect to a specific assignment of a regulator, goal, and agent. Disputes about specific examples of proxy failure often stem from the assumption that there is a “correct” level of analysis or a single “correct” goal. Instead, recognizing teleonomy across a multiplicity of biological and social levels can help us understand the overall behavior of a biological system or indeed a society.
R2.1. What is a “correct” goal?
Nonacs argues that the “true goal” of a peahen is to maximize offspring fitness, which itself depends on the proxy. This is a completely valid, and indeed important, perspective. In this Prum-ian view, beauty is itself the important goal (because it makes for a “sexy son,” Fisher, 1930; Prum, 2017). However, many biologists view honest signaling as the main driver behind beautiful traits – that is, beautiful ornaments evolve to prominently signal male quality above and beyond simply the odds of (a son) landing a mate. So yes, proxy self-referentiality (in this case across generations) implies that a proxy becomes a goal in itself. Similar runaway processes routinely happen in social systems (Frank, 2011). But we feel it is nevertheless illuminating to ask if the proxy reflects some “underlying” value or simply drives a runaway self-referential process. The interesting comment by Harris dissects the potential determinants of which of the two might apply, and seems well worth further developing. The reason is that the distinction matters. The issue is closely related to the potential conflict between the welfare of a population and that of the individual, which Frank (2011) called “Darwin's Wedge.” Consider for instance, a neighboring population of peafowl (say on a different island), which for some reason does not slip into the runaway process (say a predator catches peacocks with above average tails). All other things being equal, this second population would spend less energy on wasteful signaling, and thus have greater average fitness. In this way Nonacs' comment highlights the hierarchical nature of many nested systems of proxies and regulators. What is good at one level, enabling a proxy-trait to sweep through a population, might plunge it toward extinction (Prum, 2017, p. 131, also note Kurth-Nelson et al.'s reference to the Fermi Paradox).
A cancer cell might fulfill its goal of unmitigated growth, but at the ultimate cost of the host organism. Or as Pernu writes for the case of addiction: “from the perspective of the dopamine system there's a success.” Regarding sexual selection, Prum (2017, p. 133) calls this aesthetic decadence – every individual club-winged manakin might have won mates with its club wings but this came at the cost of a decrease in overall survivability of the species. Likewise, the individual marsupial who developed large strong forearms as a neonate might have beaten out all its competitors but the spread of this trait may have permanently restricted marsupial mammals to limited evolutionary niches compared to mammalian radiations on other continents (no marsupial whales or bats). So yes, one can always characterize a system in terms of a goal for which there is no proxy failure. And the notion of “the good of the population” as a goal is rightly contentious. But a broader perspective – asking for instance why some populations or species exist or proliferate and others collapse or go extinct, or why aesthetic decadence seems pervasive in both animals and humans – reveals the epistemic power of rejecting the notion of a correct goal. However, we certainly agree with Nonacs' perspective that beautiful signals need not signal anything other than an individual's beauty.
R2.2. Goals in nature: Invalid anthropomorphism or underestimated reality?
Two comments, those of Bartlett and Pernu, dig deeper into the underlying philosophical question of whether it is appropriate to engage in the ostensibly anthropomorphic activity of identifying goals outside of conscious human behavior. Previous discussions had made us painfully aware that this issue would lead to controversy, in particular as we write about goals of presumably non-conscious social and biological systems and rather liberally use the concept of an agent. We were gratified to find that opposing academic poles, between which we attempted to tiptoe, found articulation in the comments by Pernu and Bartlett. On the one hand, Pernu argues that we have not gone far enough in “naturalizing” the phenomenon of proxy failure, suggesting we should avoid all anthropomorphisms (even for human systems?). He argues that, for a fully general account, speaking of goals and agents would be “misleading.” On the other hand, Bartlett thinks that our “selectionist bent” does not do sufficient justice to teleonomic (i.e., clearly goal-directed) forces in evolution, which have been extensively examined under the heading of the “extended evolutionary synthesis.” In our view, both comments add valuable perspective and are compatible with our account.
First, we want to applaud Pernu for asking whether we have gone far enough in identifying a fully general model. He proposes to use the language and tools of adaptive control theory and explore the parallels between engineering research and biology, which seems to us an excellent program of research. To “view all organisms and evolutionary processes as hierarchies of control systems,” is precisely what we have attempted to do – the notion of nested hierarchies is central to sections 4 and 5 of the target article. This helps us emphasize yet again that proxy failure for one level can look like a success for another level. Interestingly, Pernu suggests that the “correct” level to assess the goal is always the “encompassing” system (would this be the species for Nonacs’ case?). While this claim intuitively holds for examples like that of addiction (the higher level being the conscious individual and the lower level the dopamine system), there are also many cases where lower-level goals seem to be more relevant. For instance, economic/corporate goals and proxies (the “encompassing” social level) can subvert individual goals (conscious individual level) by hacking our dopamine system (sub-individual level, also see Boxes 2, 4, and 8 of the target article as well as the comments by Sacco and Robertson et al.). In this case, a higher-level proxy (profit) should arguably be subordinated to the lower-level goals of individual humans, which may not include being manipulated into an addiction for profit. Note that in this case the higher-level proxy (profit) leads to social teleonomic behavior, that is, the market will behave as if to pursue the higher-level goal of, say, maximizing GDP, and that this emergent higher-level goal cannot be assumed to be identical to the individual-level goals of market participants (Box 4 of the target article).
A final point worth making is that, while Pernu sets out to extinguish anthropomorphism from our account, the very notion of “adaptive control” is in our view just another restatement of the same anthropomorphic concepts. What is the target state of an adaptive controller other than a teleonomic goal, or indeed (in engineering) an intentional human goal? In our view, a truly general account must include human goal-oriented behavior at both individual and social levels. So we wonder if Pernu is suggesting we should also avoid anthropomorphizing humans?
This leads us to Bartlett, whom we want to thank for pointing the reader to the extensive literature on teleonomy and its more recent developments in the “extended evolutionary synthesis” (Laland et al., 2015). Our thinking is so closely aligned with this literature that we struggle to see how exactly Bartlett thinks our account conflicts with it. He claims that our “selectionist bent” led to problems, but this seems to be based on several misunderstandings. Granted, our brief treatment of evolutionary biology did not do justice to the diverse forces that influence the natural world (drift, epigenetics, controlled hypermutation, etc.); we had little choice but to focus on the areas where proxy failure and its constraints were easiest to illustrate. But we do explicitly discuss several concepts that fall squarely into the extended evolutionary synthesis such as the “evolution of beauty” through runaway sexual selection, or “runaway niche construction,” both of which Bartlett himself highlights as paradigmatically teleonomic (Bartlett, Bartlett, & Jonathan, 2017). Further, it can be argued that the proxy-treadmill is a key mechanism for developing novel elaborate traits (evolutionary decadence), thereby accounting for far more than simplistic selectionist mechanisms based on honest signaling. Another place where Bartlett feels our “selectionist bent” led us astray is our statement that selection at a higher level is a “hard constraint” on proxy failure at a lower level. Yet the statement is difficult to dispute. If a peacock's tail grows too large, the peacock will not live. If a corporation maximizing KPIs remains unprofitable for too long, it will go bankrupt.
None of this contradicts what we believe Bartlett actually wants to emphasize, namely that phenomena such as the “evolution of beauty” or “directed cultural niche construction” (Braganza, 2022) – which operate within the bounds of this hard constraint – can play a pivotal role in evolution because they shape what is later available for (potentially changing) selection. Furthermore, in the case of markets, the rules of selection are almost entirely socially constructed, and in a sense thus purely intentional. Again, this does not conflict with the observation that a given system of market selection provides a “hard constraint” on which types of firms can exist, or the degree to which proxies within them can inflate. Instead, it highlights that the way we set up selection (the proxy) matters profoundly. It is the close inspection of the proxy – the determinants of selection – which can allow us to judge whether or not our individual and social goals are served by market competition.
R2.3. Nested hierarchies of goals within humans and societies?
In biology, disputes about the “correct goal” seem to mostly entail conflicting interpretations. In psychological and social areas, such disputes tend to spill over into conflicts about which goals we should pursue. The transition between biological and human systems is notoriously contentious, as for instance illustrated by Bartlett's admonition that we should not “equate” the intentional “selection of action” with non-intentional “selection of traits.” We in no way meant to “equate” the processes, but only to point out a “functional equivalence,” which seems to explain why proxy failure can occur in both domains. Such functional equivalence is fully compatible with the fundamental differences highlighted by Bartlett (though we do wonder about intermediate cases, e.g., in animal decision making or the immune system; Noble & Noble, 2018). Above all, this perhaps highlights the contentious transition between teleonomy and teleology. While we of course have no answers to this challenging philosophical issue, we do believe both the target article and the commentary highlight that it is often essential to jointly consider natural/biological and human/intentional proxy systems.
Both Ainslie and Sacco touch upon the transition between biological “as if” goals and genuine human goals. Ainslie points out that proxy-rewards in human brains must necessarily occur over much shorter timescales than evolution. Indeed, we can quickly and adaptively learn new proxies to guide our everyday behavior. In many cases, a metric that starts as a proxy for a bigger goal can become a goal in itself. Examples could include the pursuit of money – once a tool to achieve survival and reproductive objectives, now pursued by many people as a goal in itself. But the underlying point, to us, seems more profound, and closely linked to the historical distinction between teleonomy and teleology, or what one might call the emergence of genuine goal-oriented behavior in humans. Ainslie argues that “The self-selecting potential of proxies in neuroscience sets them apart from the other kinds of proxy the authors describe.” Proxies within the brain “may compete like fiat currencies,” that is through a self-referential process that frees them “from the need for predicting external rewards.” A proxy may thus “turn into a goal in its own right.” In the words of the neuroscientist Mitchell (2023, p. 68), the “open-ended ability for individuals to learn, to create new goals further and further removed from the ultimate imperatives of survival […] and ultimately to inspect their own reasons and subject them to metacognitive scrutiny” seems to underlie “the kind of sophisticated cognition and agency” which “might qualify as free will in humans.” In other words, the human tendency to create, represent, and reflect complex webs of proxies and goals may have furnished our ability to individually and socially pursue genuine goals that have become distinct from evolutionarily programmed dictates.
This ability to define and deliberate our own goals (and how to achieve them) is the foundation of the classical liberal conception of the rational agent which underlies not only economics but also democracy. In a sense, the free, rational agent of classical liberalism isolates this reflective ability and declares its biological underpinnings to be beyond inquiry – a given fixed utility function which we need only to optimize without questioning its origin or nature.
This is the starting point for Sacco who highlights that “a key source of proxy failures in economics stems from a mismatch between the evolutionary function of biological reward systems and traditional conceptions of utility maximization.” He continues to note that “the unprecedented production and marketing of super-addictive goods exploits vulnerabilities in dopaminergic reinforcement learning circuits.” Robertson et al. provide a similar view in the context of social media companies, which hack our attention allocation systems by presenting “threatening stimuli.” This aligns well with the notion of our target article, whereby market economies, which are automated optimization devices, can undermine individual goals by hacking our decision-making systems. Sacco suggests that we can overcome such problems by redesigning our economic theory around concepts from biology (allostasis), neuroscience (reward prediction errors), machine learning (multicriterion optimization), and psychology (motivational heterogeneity). We enthusiastically applaud this vision, though it (much like our article) provides little detail on how exactly proxy failures through the economic provision of super-addictive products might be mitigated.
Another of Ainslie's formulations naturally links individual goals to social goals: “A proxy that is a good story may turn into a goal in its own right (emphasis added).” “Good stories” can become “cultural attractors” (Falandays & Smaldino, 2022; Jones & Hilde-Jones, 2023) – shared narratives that can establish a cohesive societal goal across countless individual minds. This understanding seems closely related to that of Fernández-Dols, who introduces a dark twist. The commenter suggests that a shared narrative's purpose is not a seeming social goal like “justice,” but only the “illusion of accomplishment” toward this goal, which allows the proxy (the narrative that something is more or less just) to self-perpetuate. The proxies become self-fulfilling and self-sustaining, quite independent of the abstract goals they purport to further. This is a profound insight which we have already highlighted in Braganza (2022), as indeed prominent sociologists did long before us (e.g., Luhmann, 1995). In social systems, proxies very frequently take on the role of perpetuating power structures. The approximation of some abstract, unattainable goal is only required as an illusion to “legitimate” the proxy, and thereby the power structure, to the masses. But Fernández-Dols' conclusion from this observation, namely that the goals rather than the proxies are the problem, seems to go decidedly too far. Yes, the social proxies for abstract goals like “liberty, justice, wealth, hygiene, safety” will always remain imperfect and the goals themselves unattainable. We would also not dispute the author's claim that the pursuit of unattainable goals can be disheartening or even damaging. But it is key to recognize that if we do deem a goal worthy, then a proxy can be both a means of social legitimization and an approximation of an unattainable ideal.
In this reading, the task is precisely to identify where proxies exclusively provide “illusions of success” or “social legitimations” and work to make those successes more real and the proxies more legitimate. The overwhelming majority of commenters seem to agree, since they suggest ways to further the goals, rather than abandoning them (e.g., Samuelsson, Watve, Haig, Szocik, Browning, Kapeller et al., Higgins et al., Sadri et al., Kurth-Nelson et al., etc.). That said, we must of course recognize that many goals (Fernández-Dols names “recreating France in colonial Vietnam”) are ill-conceived and worth abandoning. Examining whether a goal is really worth pursuing is of course essential, and we must thank the author for emphasizing this.
R3. Additional examples: Open and closed proxy failure and non-scalar proxies
Numerous commenters suggested additional examples of proxy failure, strengthening our intuition that the phenomenon manifests far beyond what we were able to review. In order to contextualize these additional examples, we feel the need to introduce a distinction between what we will term open and closed proxy failure. Open proxy failure simply describes the existence of a poor proxy – the proxy fails in the sense that it does not allow the achievement of the goal for reasons that are external to the proxy-based optimization process. By contrast, closed proxy failure involves a proxy that becomes worse because of proxy-based optimization. Our initial definition was restricted to closed proxy failure – we will borrow the words of Pernu who concisely recapitulates this: “proxy failure [is not] a mere signal failure: central to the analysis is that the target system hacks the proxy.” However, as Kapeller et al. rightly state, this is “not the only way in which proxies can fail.” Some examples from Szocik, Watve, Kapeller et al., and Higgins et al. appear to reflect instances of open proxy failure, that is, cases that go beyond our original definition. Given that so many authors intuitively apply the term to both open and closed cases, it seems more appropriate to introduce the distinction than to insist on our original definition. Nevertheless, the distinction is key because it can lead to diametrically opposing implications: In open proxy failure, a greater reliance on the proxy may help to achieve the goal, while for closed proxy failure, it will tend to make the problems worse.
Finally, Burns, Ullman & Bridgers, and Nonacs suggest extensions or additional examples, of which we remain unsure if they fit well within the present framework. This is not to say that they are not related to it or interesting in their own right. But expanding the scope too wide comes at the cost of decreasing conceptual precision and clarity.
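The opposing implications of open and closed proxy failure can be made concrete with a deliberately simple toy model. The sketch below is an illustration under stated assumptions, not a model taken from the target article or the commentaries: a hypothetical agent splits one unit of effort between the goal and off-goal activity, values both intrinsically with diminishing returns, and additionally receives a reward proportional to a proxy. The names `chosen_goal_effort`, `proxy_value`, `fidelity`, `hack_gain`, and `pressure`, as well as all functional forms and parameter values, are assumptions made for illustration. With `hack_gain = 0` the proxy is merely weak (open case); with `hack_gain` larger than `fidelity`, off-goal effort inflates the proxy (closed case).

```python
import math


def chosen_goal_effort(pressure, fidelity, hack_gain, steps=1000):
    """Share of effort w in [0, 1] that a hypothetical agent devotes to the goal.

    All functional forms are illustrative assumptions:
    - the agent intrinsically values both goal effort and off-goal effort
      (diminishing returns, hence the square roots);
    - the regulator adds a reward of pressure * proxy;
    - the proxy registers goal effort with weight `fidelity` and off-goal
      effort with weight `hack_gain` (0 means the proxy cannot be hacked).
    """
    def utility(w):
        off_goal = 1.0 - w
        proxy = fidelity * w + hack_gain * off_goal
        return math.sqrt(w) + math.sqrt(off_goal) + pressure * proxy

    # Brute-force search over an evenly spaced grid of effort shares.
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=utility)


def proxy_value(w, fidelity, hack_gain):
    """Proxy reading produced by goal effort w under the same assumptions."""
    return fidelity * w + hack_gain * (1.0 - w)


FIDELITY = 0.3  # assumed: the proxy only weakly registers real goal effort

for pressure in (0.0, 1.0, 2.0, 4.0):
    w_open = chosen_goal_effort(pressure, FIDELITY, hack_gain=0.0)
    w_closed = chosen_goal_effort(pressure, FIDELITY, hack_gain=1.0)
    print(
        f"pressure={pressure:.1f}  "
        f"open: goal={w_open:.2f} proxy={proxy_value(w_open, FIDELITY, 0.0):.2f}  "
        f"closed: goal={w_closed:.2f} proxy={proxy_value(w_closed, FIDELITY, 1.0):.2f}"
    )
```

In this toy setting, raising the pressure on the proxy moves goal effort up in the open case and down in the closed case, even though the proxy reading itself rises in both. The qualitative pattern, rather than any particular number, is the point of the sketch.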
R3.1. Open proxy failures
Both Kapeller et al. and Watve explicitly highlight the distinction between open and closed proxy failure. They refer to “the political economy of scientific publishing” (Kapeller et al.) as a key illustrative example of what Watve calls “proxy exploitation.” Specifically, they note that the proxy of one system (journal profitability in the economic system) is simply not geared toward the goal of another system (advancement of knowledge in academia). In the words of Watve: “Editorial boards appear to strive more for journal prestige than the soundness and transparency of science (Abbott, 2023). More prestigious journals often have higher author charges (Triggle, MacDonald, Triggle, & Grierson, 2022) and make larger profits with little contribution to the original goals.” This is best construed as an instance of open proxy failure because it is not the pursuit of the proxy within the academic system that leads to failure. Indeed, the notion that a well-established proxy in one system could be exploited by another system or party with distinct goals seems well worth the additional label of “proxy exploitation” supplied by Watve. For instance, the phenomenon of mimicry among prey species, supplied by Nonacs, seems to fit the label proxy exploitation well. In it, a dangerous or poisonous prey species advertises this danger to a predator through bright colors or high visual contrasts (a proxy signal). This allows a second prey species (which is not dangerous) to mimic the proxy, and exploit the proxy-system, in order to repel predators. In contrast to our concept of proxy appropriation, in which a lower-level proxy is appropriated by a higher-level goal, proxy exploitation could refer to any case in which the systems are not nested.
Another example provided by Kapeller et al. is the academic “third mission,” which can be summarized as unfolding societal impact. Typical proxies for this, such as the number of public appearances or so-called alt-metrics, may simply fail to capture broader societal goals. This may be because the proxies are poor, or it may reflect an instance of “proxy exploitation,” in which the proxies of one system (such as social media engagement) fail to reflect the goals of another system (dissemination of accurate knowledge by academia).
The tension between open and closed proxy failure is also apparent in Szocik, who criticizes the use of quotas as proxies for “antiracism” and “antisexism.” We should begin by applauding Szocik for drawing attention to this important topic. Szocik explains why quotas are not sufficient to eliminate all the subtle drivers and determinants of racism or sexism. He highlights, for example, that quotas in some western power structures will do nothing to alleviate sexism against millions of women around the world. Furthermore, limited quotas tend to help a small non-representative subgroup of a minority rather than those most discriminated against. For instance, antiracist quotas in US universities tend to lead to the recruitment of highly educated first-generation immigrants rather than the descendants of former slaves. However, we feel it is important to highlight that these arguments imply only “open proxy failure” and thus do not help to predict if more or stricter quotas would improve the situation. For instance, the observation that proxies do not help where they are not applied seems trivial, and clearly suggests wider application rather than abandonment as a solution.
That said, we would acknowledge several ways in which Szocik suggests quotas could indeed also drive closed proxy failure. First, quotas could create sexist or racist backlashes, for instance when individuals are dismissed as having gained their positions not via merit but via quotas. Second, quotas can create an “illusion of success,” as highlighted by both Fernández-Dols and Szocik. Both aspects are clearly worth sustained attention. Nevertheless, it seems to us that the overall evidence strongly suggests quotas do help (Bratton & Ray, 2002; Chattopadhyay & Duflo, 2004). Representation does not, alone, solve racism and sexism, but it is one important component of a solution.
Another example of open proxy failure (in our reading) is provided by Higgins et al. The commenters note that “constructs” in psychological research are typically assessed via forms of standardized tests, which are then used to create a score which stands as proxy for the presumed psychological trait (consider for instance the controversial IQ score; Gould, 1996). Whether or not such a score is a good approximation of an underlying construct is then expressed under the heading of “construct validity,” which psychologists have developed sophisticated methods to assess. However, Higgins et al. note that “there is a growing body of evidence demonstrating that studies across psychological science routinely accept measurements as valid without sufficient validity evidence […], including measurements used for important clinical applications.” In other words, psychologists routinely use poor proxies, even though they should know better. The reason this could be classified as open proxy failure is that the limitations of the proxy seem to derive primarily from limits to the legibility of the underlying construct, and it is not the psychologist's pursuit of an accurate understanding of this construct that leads to failure.
R3.2. Closed proxy failure
Notably, a subtle shift of frame recasts the example described by Higgins et al. as closed proxy failure. Specifically, if the goal is defined as the scientific communication of valid constructs, then this may cause a “prioritisation of legibility over fidelity.” In this framing, it is ultimately academic competition for publication space (which hinges on highly legible test-scores) that undermines the validity of those scores in capturing the purported underlying constructs. Higgins et al. introduce the intriguing concept of “proxy pruning” to describe how this might occur in practice: Difficult qualitative questions about construct validity may be successively pruned away, to the benefit of simple legible metrics (such as test scores). Unproven metrics are often considered more valid if they correlate with other, also unproven but older, metrics. Similarly, Sadri & Paknezhad show that drug discovery operates by researchers trying to find molecules that bind to a target protein, rather than drugs which affect the human health phenotype. Huge biotech companies are built on this “target-based drug discovery” method, which is a reductionist method that is highly legible but of questionable validity. Browning et al. describe how animal welfare is typically measured through biochemical proxies of stress, such as cortisol – which is in fact not even an accurate metric of emotional state (stress can be negative fear, positive arousal, etc.). Instead, they argue for more complicated, but accurate, measurements of actual emotional state. We have already mentioned Szocik, who argues that many highly legible antisexist and antiracist policies fail to achieve the actual goal, equality. In each of these cases, proxy pruning may translate good legibility to proxy failure.
Several other comments presented examples of closed proxy failure in academia. Indeed, Watve admonishes that we did not “adequately cover proxy failure in academia” where “proxy failure has reached unprecedented and unparalleled levels.” This is perhaps the right place to disclose that proxy failure in academia was in fact one of the main areas of research that led up to the current target article (Braganza, Reference Braganza2020, Reference Braganza2022; Peters et al., Reference Peters, Krauss and Braganza2022). So our main justifications for neglecting the topic in the present article are that (i) we have written about it previously and (ii) we felt sure it would be raised by others. Indeed, not only Watve, but also Kapeller et al., Higgins et al., Sadri & Paknezhad, and Browning et al. addressed proxy failures in academia. Watve's point that studying academia has particular potential to propel proxynomics forward is well taken. In addition to the above-mentioned concept of proxy exploitation, he proposes the concept of proxy complementarity, which again seems highly generalizable. Proxy complementarity suggests that proxy failures can accelerate (or become entrenched) when several parties stand to mutually gain from hacking (e.g., researchers and journals have a joint incentive to inflate Impact Factors).
We have already mentioned two examples of open proxy failure introduced by Kapeller et al. In addition, they describe an example of closed proxy failure in academia. Specifically, they highlight that large and established fields (or institutions) are more attractive for new entrants, or more impressive for evaluation agencies. This can lead to a preferential attachment dynamic – the proxies of size and prestige lead to a lack of diverse and disruptive science, which many evaluation agencies or researchers may claim as their actual goal.
In summary, mechanisms like proxy pruning or proxy complementarity likely foster closed proxy failure in many domains. Proxy-usage favors more legible over more accurate proxies. The intensive use of the proxy then subsequently causes failure along all the dimensions in which fidelity was sacrificed for legibility. Having more (rich white) women CEOs, (sad) cows with low cortisol, (useless) papers on candidate drug therapies and diabetes, and (invalid) metrics of psychological capabilities may be counterproductive to the real goals of equality, animal welfare, successful medical therapies, and rich understandings of our minds because we feel complacent, having checked the box. More sinister cases, as outlined for example by Watve, involve direct intentional gaming and are thus also clear cases of closed proxy failure.
R3.3. Unclear cases
Finally, several authors have suggested extensions of our framework about which we remain unsure.
Nonacs proposes the lack of a reliable paternity signal among animals as an instance of proxy failure. Such a proxy would be a “greenbeard” signal (Haig, Reference Haig2013) that could prove advantageous to both father and offspring by allowing fathers to allocate parental efforts only to their own offspring. Nonacs thus raises the question of why such a signal is not common, suggesting that it could reflect a risk trade-off between being nurtured by one's own father and being killed by the father of another. In our view, since the lack of a genetic proxy here would not be due to any failure, but due to the advantage of not signaling, this example does not qualify as proxy failure.
Burns as well as Ullman & Bridgers suggest we could extend the notion of proxy failure to non-scalar proxies, namely heuristics and utterances, respectively. Although both contributions are highly interesting in their own right, we feel that this expansion in scope might come at too great a cost. Defining proxies as scalars allows a researcher to map proxy failure to mathematical approaches to optimization within several disciplines – something we view as highly desirable. Moreover, the scalar nature of proxies appears to be central to the amplifying effect of regulatory feedback – a point we discuss further below. But this in no way diminishes the insightfulness of the comments, which do highlight clear similarities.
R4. Additional drivers of proxy failure
Several commenters outlined additional drivers of proxy failure, which are likely to be of some generality.
R4.1. Sophisticated agents
Moldoveanu proposes that proxy failure occurs because agents are generally more sophisticated than regulators concerning the relevant task. This is likely true in many cases, and for these the “explanation-generating engine” outlined by Moldoveanu should be very valuable. In particular, the notion that agents typically will (i) see more, (ii) spend more time, and (iii) think more, concerning the particular task at hand, and thus exceed the regulatory model's sophistication for it, seems plausible. However, we feel it is worth reemphasizing that agents do not need to be sophisticated at all for proxy failure to occur. This is illustrated by the examples in our target article in which agents are passive recipients of selection, such as in many ecological cases, in the neuroscientific case, and in machine learning. Even in social and economic cases it may often be more accurate to think of proxy failure as a passive selection phenomenon, rather than a consequence of devious hacking by agents (Braganza, Reference Braganza2022; Smaldino & McElreath, Reference Smaldino and McElreath2016). We would also question Moldoveanu's concluding notion that “the desire of agents to ‘one day’ themselves become principals and regulators” can help mitigate proxy failure. He seems to follow a similar notion as Nonacs, who argues that the fact that marsupials who win the race to the teat will some day later themselves become mothers, would somehow constrain proxy failure. Instead, we would argue that the rational agent (or the rational marsupial) must concentrate on maximally gaming the system while young in order to become a regulator. This may reduce overall fitness and may well sometimes lead to catastrophic failure, but if higher-level selection is sufficiently slack, both firms and species will persist despite wasteful proxy-games. 
As Frank (Reference Frank2011) has argued, all marsupials would be better off if they could form a contract to halve their front paw size – because proxy value is relative to that of competitors, this would leave the outcome of competition completely unaffected, while saving everyone lots of energy. Of course, marsupials (or in Frank's original example, deer) are not known for their contracting skills. But more sophisticated agents can be better at this, as in the case of athletes' organizations that regulate the terms of their competition by banning doping. Notwithstanding such observations, which indicate that agent sophistication can both amplify and reduce proxy failure depending on specific factors, Moldoveanu's general insight, that there may be a sophistication gradient between agent and regulator, strikes us as worth exploring further, particularly in economic contexts.
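The point that proxy value is relative can be made concrete with a minimal sketch. The paw sizes and the linear cost model below are purely hypothetical and are not taken from Frank's argument; the sketch only illustrates that scaling every competitor's proxy by the same factor leaves the ranking untouched while reducing the total investment.

```python
def rank_agents(proxy_values):
    """Return agent names ordered from highest to lowest proxy value."""
    return sorted(proxy_values, key=proxy_values.get, reverse=True)


def total_cost(proxy_values, cost_per_unit=1.0):
    """Assumed cost model: energy spent grows linearly with proxy size."""
    return cost_per_unit * sum(proxy_values.values())


# Hypothetical front-paw sizes (arbitrary units) for three competitors.
paw_size = {"a": 4.0, "b": 6.0, "c": 5.0}

# The hypothetical "contract": everyone halves their paw size.
halved = {name: size / 2 for name, size in paw_size.items()}

# The outcome of competition depends only on relative proxy values.
same_ranking = rank_agents(paw_size) == rank_agents(halved)

# Under the assumed cost model, the contract halves the total investment.
savings = total_cost(paw_size) - total_cost(halved)

print(rank_agents(paw_size), rank_agents(halved), same_ranking, savings)
```

Any strictly increasing rescaling of the proxy would preserve the ranking in the same way; halving is simply the case discussed in the text.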
R4.2. Proxy legibility
Several commenters have outlined how proxy legibility can drive proxy failure. This is an intriguing and perhaps ironic insight in a world that increasingly relies on objective, quantitative, and legible measures of accountability. In this context, it may be worth highlighting that the most famous and pithy formulation of the phenomenon of proxy failure – “when a measure becomes a target, it ceases to be a good measure” – received its name in an essay by Hoskin (Reference Hoskin, Munroe and Mouritsen1996) entitled “The awful idea of accountability,” and received its final phrasing by Strathern (Reference Strathern1997) in a critique of the “audit explosion” in higher education.
We have already introduced Higgins et al.'s proposed mechanism of proxy pruning. It could happen passively, simply because more legible proxies are easier to communicate. But legibility could also distract the attention of regulators from the task of assessing the agreement between goal and proxy. Similarly, Haig has highlighted in the context of education that more legible, or more “objective” evaluation criteria are also far easier for students to hack than “subjective” evaluations. He notes that “there may be more striving for excellence when an agent does not know how their work will be judged.” And indeed, both “revealed preference in economics, and fitness in evolutionary biology” have “low legibility before judgment is pronounced.” Resilience toward proxy failure may therefore derive from limited legibility.
R4.3. Proxy complementarity
Watve highlights a key property that can drive proxy failure, which he calls “proxy complementarity.” It applies when more than one set of agents have an interest in hacking a proxy. A prominent example, one that incidentally was also highlighted by Kapeller et al., is the following: Journals and authors have an aligned incentive to inflate a journal's impact factor when the latter publish in the former. Indeed, even funding agencies may benefit from the illusion of impact. It appears that most of the relevant players have an overt proxy incentive to allow inflation (or to aid other parties in hacking the proxy). This could be argued to have led to a situation that Merton (Reference Merton1940) called “goal displacement” and Watve calls “predatory proxy”: The proxy has become so dominant as an evaluation tool that it has begun to replace the underlying goal. Nevertheless, the very fact that countless voices are discussing the issue arguably proves that goal displacement has not been completed in academia. Regardless, the key point here is that proxy complementarity may be a powerful driver of proxy failure.
R5. Proposed solutions
Perhaps the most natural question to ask when confronted with the issue of proxy failure is: “How do we solve or at least mitigate it?” This question already featured prominently in our article, where, for instance, we suggested preemptive action by managers, or higher level selection, as mechanisms that can mitigate proxy failure. However, we were thrilled by the depth and quality of additional mitigation proposals offered by, among others, Kurth-Nelson et al., Samuelsson, Burns, Sacco, and Ullman & Bridgers.
R5.1. Emphasizing intrinsic incentives and play
In human systems, a group of mitigation strategies for proxy failure revolves around the intrinsic incentives of the agents.
Samuelsson highlights the under-studied role of “play” in human and animal education, offering it as a paradigmatic solution to avoid proxy failure: Play is an intrinsically joyful activity, which greatly enhances education, seemingly as a by-product. Proxy failure explains why, “paradoxically, relying solely on educational goals can lead to worse learning.” Students seem to succeed best when they are driven by free and intrinsic motivation. The same notion is captured by Haig when he writes, “Monet and Picasso did not paint by numbers.” Such a view highlights how counterproductive it might be for schools in the USA to cut recess in order to free up more time to teach to the test! Samuelsson's assessment, that play is an antidote to proxy-failure-laden socio-educational-economic pursuits, seems wholly plausible to us. He describes play as “loosening” educational proxies and argues for a world where we embrace looseness to avoid proxy failure. A further key insight is that “as children play with and socialize around the tools and technologies in their environment (Samuelsson, Price, & Jewitt, Reference Samuelsson, Price and Jewitt2022),” play allows highly flexible or “culturally appropriate learning” even in rapidly changing cultural environments. This hints at another way in which play might confer resilience toward proxy failure. That play has no measurable proxy for educational success is, in fact, a strength; we suggest that practitioners look for analogs of play elsewhere. For example, we have noticed among ourselves and our students that side projects – those pursued for joy rather than for credit, grants, or a degree – can be richer, more interesting, and more quickly completed than “main” projects. Of course, abandoning educational proxies altogether will likely prove counterproductive in a world where children need to meet basic standards concerning literacy or numeracy.
Here, both sequential and parallel approaches to proxy diversity, discussed in more detail below, may come in handy. A sequential approach might involve the regulator interleaving proxy-oriented epochs with time dedicated to activities that do not count toward any proxy. A parallel approach might involve dividing up classrooms into teams based on specific projects, student preferences, and/or randomly assigned proxies. A regulator employing such approaches would need to be a playful “fox” rather than a focused “hedgehog” when it comes to the use – or avoidance – of proxies.
Several commenters more directly suggested that proxy failure can be avoided by aligning regulation with intrinsic agent goals and motivations. Ullman & Bridgers, in their charmingly written comment, suggest that “highlighting common-ground” can mitigate proxy failure where it is wholly or primarily caused by an agent intentionally misunderstanding the regulator's goal. For example, the rat-breeders in Hanoi knew what they were doing and so too does a “smart-ass” child following the letter, but not the spirit, of a parental diktat. Fernández-Dols seems to push toward a similar conclusion when he highlights an experiment in which gaming “was a consequence of the participants’ perception of the [regulator's] instructions as an arbitrary imposition” and disappeared when “instructions were clear and procedurally fair.” Burns argues that medical heuristics may be superior to proxies largely because they are often used in cases where the goals of the regulator and agent are aligned. Heuristics then simply turn hard questions into easy ones: “is this patient having a heart attack” becomes a list of three simple questions for doctors to triage patients. Hospital regulators want to regulate their doctors, but doctors already want to accurately diagnose heart attacks. We fully agree with these notions: Where common-ground can be found, it should be emphasized. Fostering the “intrinsic motivation” of agents – for instance by encouraging “peer norms and professionalism” (Muller, Reference Muller2018, p. 112) – is preferable to extrinsic proxy incentives for another reason. It allows the agent to actually be a free agent, rather than an entity to be controlled.
As Haig states, “a subject's power to make decisions that others must accept on trust is a measure of the subject's sovereign agency.” Social regulators rarely perceive the insult that is implied when they impinge on this “sovereign agency.” The insult lies not only in not trusting professional decisions, but in the tacit assertion that, for example, doctors do not intrinsically want to help their patients or scientists are not intrinsically motivated to do good science (Muller, Reference Muller2018). The upshot is that using proxies can do harm beyond the harm to the goal, which we referred to as (closed) proxy failure: It can also demotivate the agents.
A more subtle instance of relying on intrinsic incentives is provided by Robertson et al.: Social media users' intrinsic incentives, almost by definition, reflect their goals. However, social media companies appear to have hacked users' decision-making apparatus to undermine these incentives by showing threatening stimuli. Humans instinctively attend to such stimuli, such that company proxies of user goals, namely watch time or click-rate, can be increased by presenting threatening stimuli. The case mirrors that of addictive substances – users engage in behaviors which they claim not to want and which demonstrably harm their wellbeing. To better align user goals with behavior, Robertson et al. propose that companies allow users to explicitly state their respective preferences about what they would like to see. For example, companies could implement an easy to access button labeled “I don't want to see content like this.” Of course, companies may not do this because it will not make them money (here we have shifted frames, arguing that the corporate goal is not user-satisfaction but profit). In the terminology of our target article “a higher-level proxy of profit may constrain their ability or desire to further their customers goals.” But this in no way invalidates Robertson et al.'s proposal as a step to solving the proxy failure they highlight: Take steps to align the regulator's goal with the agent's goal; give consumers the chance to opt out of certain content types; let email users actually unsubscribe from spam easily and permanently. If such actions do not currently align with a company's (profitability) goals, then an altered regulatory framework or better consumer education may be necessary.
A final point worth making in this context is of course that there may often be no common ground to seek, or no intrinsic incentive that aligns with the regulator's goals. In many ways this is the standard assumption of most economists, as perfectly illustrated by the comment of Moldoveanu (though there are of course exceptions; Baker, Reference Baker1992; Holmstrom & Milgrom, Reference Holmstrom and Milgrom1991). Here, the notion that agents could in any way care about the regulator's goal simply does not feature in analysis. While we certainly feel Ullman & Bridgers, Fernández-Dols, Burns or Kapeller et al. are correct when they highlight the importance of intrinsic agent incentives (particularly in areas like parenting, medicine, or science), Moldoveanu's approach is clearly also appropriate in many situations. Some jobs will not be done because of intrinsic motivation, and here we will have to deal with extrinsic proxy incentives, and the ensuing failures. It is perhaps also worth reemphasizing that this best reflects most non-human instances of proxy failure, in which it often makes no sense to speak of intrinsic incentives toward the regulator's goal.
R5.2. Multiple proxies and “dynamic diversity”
Another fundamental mitigation strategy for proxy failure, outlined by Kurth-Nelson et al. and Sacco, seems to clearly apply in natural systems (Ågren, Haig, & McCoy, Reference Ågren, Haig and McCoy2022), is increasingly explored in social systems (Bryar & Carr, Reference Bryar and Carr2021; Coscieme et al., Reference Coscieme, Mortensen, Anderson, Ward, Donohue and Sutton2020), and is now being formalized at a fundamental level in AI research. It entails the use of an array of heterogeneous proxies, together with dynamic evaluation schedules and continuous reappraisal – Kurth-Nelson et al. use the term “dynamic diversity.” Some strands of reinforcement learning research suggest that multi-objective approaches can mitigate overfitting to a single objective (Dulberg, Dubey, Berwian, & Cohen, Reference Dulberg, Dubey, Berwian and Cohen2023; Hayes et al., Reference Hayes, Rădulescu, Bargiacchi, Källström, Macfarlane, Reymond and Roijers2022). Kurth-Nelson et al. also point to the “rich tradition” studying how diversity can mitigate proxy failure in the social sciences literature (Ostrom & Walker, Reference Ostrom and Walker1994). All this clearly resonates with the comments by Samuelsson and Sacco who highlight motivational diversity and the absence of a single scalar to optimize.
Before we proceed, we must address a key issue about the nature of proxies. In the target article, we claimed that a proxy which a regulator uses to rank competing agents can typically be described as a one-dimensional quantity, that is, a single scalar. This requirement is inherent to consistent ranking: Scalar values map to unambiguous orderings. Such ordering and scalarity is implicit in abstractions such as fitness, on which natural selection is based, as well as utility, on which most of economics is built. Similarly, in reinforcement learning, there is a reward which is apportioned in a scalar fashion, necessitating a scalar proxy to guide credit-assignment. We have argued that in the brain, a course of action that produces higher amounts of a scalar proxy (such as dopamine bursts) receives higher amounts of regulator feedback (such as synaptic weight enhancements). In this view, regulatory feedback, whether it takes the form of rank-based selection or reward-like feedback, depends on the use of a single scalar proxy. The fundamental underlying claim is that, to the degree that a system produces consistent decisions, even a multi-proxy optimization can be redescribed as a function producing one final scalar. The upshot is that, even if our final proxy is a complex aggregation of many input proxies and factors (take, e.g., the Journal Impact Factor, where this is undeniably the case), this does not eliminate the risk of proxy failure. We stand by this claim.
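The scalarity claim can be illustrated with a minimal sketch. All agents, input proxies, values, and weights below are hypothetical; the weighted sum is just one assumed aggregation rule. The sketch shows that a consistent aggregation of several input proxies yields one scalar per agent and hence an unambiguous ranking, and that inflating a single input can still reorder that ranking.

```python
# Hypothetical input proxies for three agents (all values are made up).
agents = {
    "a": {"citations": 10, "downloads": 200},
    "b": {"citations": 12, "downloads": 100},
    "c": {"citations": 8, "downloads": 300},
}

# Hypothetical integer weights chosen by the regulator.
weights = {"citations": 100, "downloads": 1}


def meta_proxy(scores):
    """Collapse several input proxies into one scalar (weighted sum)."""
    return sum(weights[name] * value for name, value in scores.items())


def ranking(population):
    """Order agents from highest to lowest meta-proxy value."""
    return sorted(population, key=lambda n: meta_proxy(population[n]), reverse=True)


before = ranking(agents)

# Agent "c" inflates only the cheapest input proxy (a stand-in for hacking).
gamed = {name: dict(scores) for name, scores in agents.items()}
gamed["c"]["downloads"] = 600

after = ranking(gamed)

print(before, after)
```

Because the aggregate is still a single scalar, an agent that finds the cheapest input to inflate can climb the ranking without any change in the other inputs.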
But Kurth-Nelson et al. and Sacco's point nevertheless stands. Firstly, even if diversity (say of the type we see in multi-objective optimization) does not fully “solve” proxy failure, it can clearly mitigate the risk or slow its time course. Integrating more proxies, even if they are formally equivalent to a single meta-proxy, increases the probability that weaknesses of individual inputs are counterbalanced. Secondly and more fundamentally, Kurth-Nelson et al. and Sacco reject a key premise on which we based our claim of scalarity – they drop the requirement of consistency. Distinct proxies, which are simultaneously used to guide behavior, may act at cross-purposes to each other and Sacco emphasizes that this closely resembles biological allostatic principles. Indeed, in an underappreciated study, Kapeller et al. (Reference Kapeller, Schütz, Steinerberger, Kapeller, Schütz and Steinerberger2013), building on Arrow (Reference Arrow1950), derive the “impossibility of rational consumer choice” from the premise that consumers use multiple incommensurable proxies. “Impossibility” in this context means that, if incommensurable proxies lead to distinct rankings, then they cannot be integrated into a single scalar. For instance, if humans have distinct proxy systems for optimizing uptake of fluids and carbohydrates, and these proxies lead to distinct rankings of products, then no single scalar utility function exists over all products. Cutting edge research from reinforcement learning (e.g., Dulberg et al., Reference Dulberg, Dubey, Berwian and Cohen2023) shows that using multiple parallel proxies, which are not required to be consistent, can indeed improve performance particularly in changing environments. Another advantage was that it led to a natural exploratory tendency, reminiscent of biological organisms. 
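The impossibility argument above can be illustrated with a classic preference cycle. This is a minimal sketch under stated assumptions, not a reproduction of the derivation by Kapeller et al.: the three proxies, the three products, and the use of pairwise majority as the aggregation rule are all hypothetical choices made for illustration.

```python
from itertools import permutations

products = ["water", "juice", "bread"]

# Hypothetical rankings (best first) produced by three incommensurable proxies.
proxy_rankings = {
    "thirst": ["water", "juice", "bread"],
    "energy": ["juice", "bread", "water"],
    "satiety": ["bread", "water", "juice"],
}


def majority_prefers(x, y):
    """True if more than half of the proxies rank x above y."""
    votes = sum(r.index(x) < r.index(y) for r in proxy_rankings.values())
    return votes > len(proxy_rankings) / 2


# All pairwise preferences implied by the assumed majority rule.
pairwise = {
    (x, y)
    for x in products
    for y in products
    if x != y and majority_prefers(x, y)
}


def consistent(order):
    """True if a best-first ordering respects every pairwise preference."""
    return all(order.index(x) < order.index(y) for x, y in pairwise)


# A scalar utility would induce one of these orderings; none fits the cycle.
consistent_orders = [o for o in permutations(products) if consistent(o)]

print(sorted(pairwise), consistent_orders)
```

In this toy case water beats juice, juice beats bread, and bread beats water, so no assignment of a single number to each product can reproduce the pairwise preferences.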
Further, the alteration of proxies on a timescale much slower than that of individual regulatory decisions might create epochs that are long enough for agents to boost the production of each proxy, but not hack it. The system remains in the sweet-spot below over-optimization or “overfitting.” Kurth-Nelson et al. bring up another interesting possibility related to proxy diversity: A subagent- or team-based “division of proxy labor.” Agents could be divided into teams or sub-groups each incentivized to pursue a distinct proxy in parallel. In both the sequential and parallel approaches to proxy diversity, regulatory feedback must be more active. Some isolation of each team may be required to prevent unwanted combining of proxies. These and other forms of diversity may allow systems to avoid recapitulating proxy failure in new “meta-proxies.” The usefulness of isolated subagents or teams raises additional questions that warrant further study: How does information-sharing influence how agents pursue proxies? Natural systems are never perfectly modular, which implies that information cannot be fully hidden from agents. How might this affect how agents balance the pursuit of proxies with the pursuit of their own goals, which may be distinct from those of the regulators? Even with diverse teams or partially inconsistent proxies, we suspect that no globally optimal strategy will be possible: Context-specific tradeoffs will inevitably be required between interpersonal legibility and motivational heterogeneity. These considerations point to the subtlety and complexity that successful mitigation strategies will likely require.
On a more general note, it seems almost certain that a wealth of research from across the disciplines touched upon will already have examined questions like those raised above. But integrating this knowledge remains a formidable task, which seems to require an entirely new level of trans-disciplinary scholarship. The remarkable recurrence of proxy failure across all manner of natural and social systems does raise the question of whether it is something akin to a natural law of goal-oriented systems, which leaves its traces even where it does not fully materialize. It also raises deep philosophical questions about the nature of goals and agency and whether the traditional academic segregation between the humanities (including economics) and the natural sciences (in particular biology) can be sustained.
Rather ominously, Kurth-Nelson et al. close by suggesting that global catastrophic proxy failure (due to an excessive integration of “information societies”) may be the reason for the Fermi Paradox. Fermi wondered why, given the infinitude of planets, we have not yet been contacted by extraterrestrial life forms. The commenters seem to suggest that advanced civilizations may tend to self-annihilate due to planetary proxy-failure. Kurth-Nelson et al.'s speculation is reminiscent of other dire warnings, such as Bostrom's (Reference Bostrom2014) famous paperclip factory, or indeed less theoretically, the notion of GDP-growth-driven planetary ecological collapse (Diamond, Reference Diamond2004; Kemp et al., Reference Kemp, Xu, Depledge, Ebi, Gibbins, Kohler and Lenton2022; Pueyo, Reference Pueyo2018). We cannot comment further on such issues here, but we do think they serve to emphasize the potential scope and relevance of a trans-disciplinary study of proxynomics.