Like many aspects of cognition, learning can be analyzed at multiple levels. At a high level (Marr's [1982] “computational” level) we can model learning by providing an abstract characterization of the learner's inductive biases: The preferences that the learner has for some types of generalizations over others (Mitchell, 1997). At a lower level, learning can be modeled by specifying the particular algorithms and representations that the learner uses to realize its inductive biases. For each of these levels, there are modeling traditions that have been successful: Rational analysis and Bayesian models are defined at the computational level, while neural networks are defined at the level of algorithm and representation. But how can we connect these different traditions? How can we work toward unified theories that bridge the divide between levels? In this piece, we agree with, and extend, Binz et al.'s point that meta-learning is a powerful tool for studying inductive biases in a way that spans levels of analysis.
Binz et al. describe how an agent can use meta-learning to derive inductive biases from its environment. This makes meta-learning well-suited for modeling situations where human inductive biases align with some problem that humans face – the situations that are well-covered by the paradigm of rational analysis (Anderson, 1990). As Binz et al. discuss, meta-learning can therefore be used to enable an algorithmically defined model (such as a neural network) to find the solution predicted by rational analysis, a procedure that bridges the divide between abstract rational solutions and specific algorithmic instantiations.
This direction laid out by Binz et al. is exciting. We argue that it can in fact be viewed as one special case within a broader space of possible lines of inquiry about inductive biases that meta-learning opens up. In the more general case, the Bayesian perspective allows us to define an inductive bias as a probability distribution over hypotheses. A neural network can meta-learn from data sampled from this distribution, giving it the inductive bias in question. The distribution that is used could be drawn from (an approximation of) a human's experience, in which case this framing matches the extension of rational analysis that Binz et al. advocate for. But it is also possible to use other approaches for defining this distribution, which can correspond to any probabilistic model. Since we can control probabilistic models, using a probabilistic model to define the distribution makes it possible to control the inductive biases that the meta-learned model ends up with (Lake, 2019; Lake & Baroni, 2023; McCoy, Grant, Smolensky, Griffiths, & Linzen, 2020). This allows us to take an inductive bias defined at Marr's computational level and distill it into a neural network defined at the level of algorithm and representation.
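To make the idea of an inductive bias as a distribution over hypotheses concrete, the following is a minimal sketch of how meta-learning tasks could be sampled from such a distribution. Everything in it is an illustrative assumption rather than a model from the work cited above: the four toy hypotheses, the prior weights, and the function names `sample_task` and `sample_task_distribution` are hypothetical.

```python
import random

# Hypothetical toy hypothesis space: each hypothesis is a rule over integers.
HYPOTHESES = {
    "even": lambda n: n % 2 == 0,
    "odd": lambda n: n % 2 == 1,
    "multiple_of_3": lambda n: n % 3 == 0,
    "greater_than_10": lambda n: n > 10,
}

# Assumed prior over hypotheses. This distribution is the inductive bias
# that a meta-learner trained on the sampled tasks would be exposed to.
PRIOR = {"even": 0.4, "odd": 0.4, "multiple_of_3": 0.15, "greater_than_10": 0.05}


def sample_task(rng, n_examples=5, domain=range(1, 21)):
    """Sample one hypothesis from the prior, then labelled examples from it."""
    names = list(PRIOR)
    name = rng.choices(names, weights=[PRIOR[h] for h in names], k=1)[0]
    rule = HYPOTHESES[name]
    xs = [rng.choice(domain) for _ in range(n_examples)]
    return name, [(x, rule(x)) for x in xs]


def sample_task_distribution(n_tasks, seed=0):
    """Build a set of meta-learning tasks whose mix reflects the prior."""
    rng = random.Random(seed)
    return [sample_task(rng) for _ in range(n_tasks)]


tasks = sample_task_distribution(1000)
```

Changing the weights in `PRIOR` changes which generalizations appear most often across tasks, which is the sense in which controlling the probabilistic model controls the bias that a meta-learner would acquire.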
Traditionally, certain types of inductive biases have been associated with certain types of algorithms and representations: The strong inductive biases of Bayesian models have generally been based on discrete, symbolic representations (e.g., Goodman, Tenenbaum, Feldman, & Griffiths, 2008), while neural networks use continuous vector representations (Hinton, McClelland, & Rumelhart, 1986) and have weak inductive biases. However, meta-learning enables us to separately manipulate inductive biases and representations, making it possible to model previously inaccessible combinations of representations and inductive biases. One noteworthy example is that we can use meta-learning to give symbolic inductive biases to a neural network, allowing us to study whether and how structured hypothesis spaces (of the sort often used in Bayesian models) can be realized in a system with continuous vector representations (the type of representation that is central in both biological and artificial neural networks). Thus, while Binz et al. note that meta-learning can be used as an alternative to Bayesian models, another use of meta-learning is in fact to expand the applicability of Bayesian approaches by reconciling them with connectionist models – thereby bringing together two successful research traditions that have often been framed as antagonistic (e.g., Griffiths, Chater, Kemp, Perfors, & Tenenbaum, 2010; McClelland et al., 2010).
In our prior work, we have demonstrated the efficacy of this approach in the domain of language (McCoy & Griffiths, 2023). We started with a Bayesian model created by Yang and Piantadosi (2022), whose inductive bias is defined using a symbolic grammar. We then used meta-learning (specifically, MAML: Finn, Abbeel, & Levine, 2017; Grant, Finn, Levine, Darrell, & Griffiths, 2018) to distill this Bayesian model's prior into a neural network. The resulting system had strong inductive biases of the sort traditionally found only in symbolic models, enabling this system to learn formal linguistic patterns from small numbers of examples despite being a neural network, a class of systems that normally requires far more examples to learn such patterns. Additionally, the flexible neural implementation of this system made it possible to train it on naturalistic textual data, something that is intractable with the Bayesian model that we built on. Thus, meta-learning enabled the creation of a model that combined the complementary strengths of Bayesian and connectionist models of language learning.
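The general logic of distilling a prior with MAML can be illustrated with a deliberately tiny sketch. It is not the language model described above: the "network" is a single scalar weight `w` in the model y = w * x, and the prior is an assumed Gaussian over slopes. The names `loss_grad_hess`, `sample_regression_task`, and `maml`, along with all hyperparameters, are hypothetical choices made for illustration. Because the model is a scalar, the second-order term of the MAML gradient can be written out by hand.

```python
import random


def loss_grad_hess(w, data):
    """Mean squared error of y = w * x, with its first and second derivative in w."""
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2 * x * (w * x - y) for x, y in data) / n
    hess = sum(2 * x * x for x, _ in data) / n
    return loss, grad, hess


def sample_regression_task(rng, prior_mean=3.0, prior_sd=0.5, n=5):
    """Draw a slope from the assumed prior, then noise-free support and query sets."""
    slope = rng.gauss(prior_mean, prior_sd)

    def draw():
        return [(x, slope * x) for x in (rng.uniform(-1, 1) for _ in range(n))]

    return draw(), draw()


def maml(n_steps=2000, inner_lr=0.1, outer_lr=0.02, seed=0):
    """Meta-learn an initial weight w0 that adapts well after one gradient step."""
    rng = random.Random(seed)
    w0 = 0.0
    for _ in range(n_steps):
        support, query = sample_regression_task(rng)
        # Inner loop: one gradient step on the support set.
        _, g_s, h_s = loss_grad_hess(w0, support)
        w_adapted = w0 - inner_lr * g_s
        # Outer loop: differentiate the query loss through the inner step.
        # d(w_adapted)/d(w0) = 1 - inner_lr * h_s (the second-order term).
        _, g_q, _ = loss_grad_hess(w_adapted, query)
        meta_grad = g_q * (1.0 - inner_lr * h_s)
        w0 -= outer_lr * meta_grad
    return w0


meta_init = maml()
```

In this toy setting the meta-learned initialization moves toward the mean of the prior over slopes, so that a single gradient step from it lands close to any slope the prior is likely to generate. That is a small-scale analogue of an initialization that encodes a prior, with the caveat that a real neural network would rely on automatic differentiation rather than a hand-derived gradient.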
These results show that inductive biases traditionally defined using symbolic Bayesian models can instead be realized inside a neural network. Therefore, symbolic inductive biases do not necessarily require inherently symbolic representations or algorithms. This demonstration provides one already-realized example of how meta-learning can advance our understanding of foundational questions about how different levels of cognition relate to each other, in ways that go beyond the realm of rational analysis.
Financial support
This material is based upon work supported by the National Science Foundation SBE Postdoctoral Research Fellowship under Grant No. 2204152 and the Office of Naval Research under Grant No. N00014-18-1-2873.
Competing interests
None.