Quilty-Dunn et al. survey empirical evidence consistent with Bayesian models of cognition to advertise the explanatory breadth of language-of-thought (LoT)-based cognitive architectures. They show that Bayesian models that treat concepts as stochastic programmes and thinking as approximate Bayesian inference can fit various sets of experimental data. But they do not say much about the origin of the representational system constituting a probabilistic LoT (PLoT), the nature of learning supported by LoT-based architectures implementing probabilistic inference, or why such architectures enjoy greater explanatory breadth than architectures that do not posit any LoT.
(In)famously, Jerry Fodor developed a circularity objection to LoT. If concept learning is a process of inductive inference aimed at testing hypotheses concerning the identity conditions for a given concept, then this process must recruit the very concept it seeks to learn. But if that is the case, then no new concept can be learned through inductive inference aimed at hypothesis testing (2008, p. 139). Furthermore, if the very representational system constituting an LoT cannot be learned through inductive inference – because that would also generate a vicious circularity – then all representations constituting an LoT must be innate (1975, p. 65).
Similar objections apply to PLoT-based architectures too, where the problem of concept learning can be understood as the problem of performing Bayesian inference to compute the posterior probability of hypotheses consisting in stochastic programmes formulated in a PLoT, given, say, a set of observed example objects and observed labels for those objects (Goodman, Tenenbaum, Feldman, & Griffiths, 2008; Goodman, Tenenbaum, & Gerstenberg, 2015). Because the hypothesis space of stochastic programmes in PLoT-based architectures is prespecified, these architectures do not seem to support genuine learning of any new concept. The probabilistic inferences that they implement are aimed at updating and comparing the posterior probabilities of hypotheses that they possess from the outset, which implies an implausibly strong nativist picture of cognitive development (see, e.g., Elman et al., 1996; Putnam, 1988, Ch. 1).
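To make the worry concrete, consider the following minimal, purely illustrative Python sketch of concept learning as Bayesian inference over a prespecified hypothesis space. The feature-based rules, uniform prior, and noise parameter are toy assumptions introduced here for illustration, not the actual model of Goodman et al. (2008).

```python
# Illustrative only: Bayesian concept learning over a prespecified
# hypothesis space. The hypotheses, prior, and noise level are toy
# assumptions, not the actual model of Goodman et al. (2008).

# Objects are binary feature vectors; labels say whether each object
# falls under the target concept.
objects = [(1, 0), (1, 1), (0, 1), (0, 0)]
labels = [True, True, False, False]

# The entire hypothesis space is fixed in advance: four simple rules
# composed from two primitive features.
hypotheses = {
    "f0": lambda o: o[0] == 1,
    "f1": lambda o: o[1] == 1,
    "f0 and f1": lambda o: o[0] == 1 and o[1] == 1,
    "f0 or f1": lambda o: o[0] == 1 or o[1] == 1,
}

prior = {name: 1 / len(hypotheses) for name in hypotheses}
noise = 0.05  # probability that any single label is corrupted

def likelihood(rule, objs, labs):
    """P(labels | hypothesis): each label matches the rule unless noisy."""
    p = 1.0
    for obj, lab in zip(objs, labs):
        p *= (1 - noise) if rule(obj) == lab else noise
    return p

# "Learning" here is nothing but reweighting what was already there.
unnorm = {name: prior[name] * likelihood(rule, objects, labels)
          for name, rule in hypotheses.items()}
total = sum(unnorm.values())
posterior = {name: p / total for name, p in unnorm.items()}

for name, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P({name} | data) = {p:.3f}")
```

Because the posterior is defined only over the four built-in rules, nothing genuinely new can emerge from the computation; this is precisely the sense in which the circularity objection is supposed to bite.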
To address this circularity objection, one approach is to distinguish between learning and acquiring a concept, and point out that thinkers acquire concepts without learning them – where learning consists in a rationally evaluable process, whereas acquiring a concept is a nonrational, purely causal process, driven, for example, by associative processes implemented in a connectionist architecture (Fodor, 1975, 2008). But if nonrational, noncognitive, associative processes provide us with the best explanation of concept acquisition in many domains, then PLoT-based architectures would enjoy significantly less explanatory breadth than Quilty-Dunn et al. suggest.
A different approach is to insist that thinkers learn concepts in a way that is rationally evaluable and is based on probabilistic inferences in PLoT-based architectures, and to also emphasize that this learning process does not need to generate any vicious circularity, or be committed to an implausible nativism.
Carey (2009), for example, argues that children learn concepts based on Quinian-bootstrapping processes operating on innate, core systems of knowledge. Although the notions of "bootstrapping" and "core knowledge" have been helpful for developing empirically adequate PLoT-based models (e.g., Piantadosi, Tenenbaum, & Goodman, 2012), one objection is that bootstrapping in a PLoT-based architecture cannot really explain concept learning, because its built-in knowledge would already include the very concepts that it purports to explain (cf. Beck, 2017, for a critical assessment of this objection). In this conceptualization, learning would amount to combining and recombining built-in representations based on built-in rules of composition. But then, one may wonder how combining and recombining a stock of built-in representations constituting one's core knowledge qualifies as genuine learning, and, more substantially, why building domain-specific, core knowledge into a PLoT-based architecture is not a mere exercise in ad hoc modelling.
To resolve these issues, we should notice that, first, any conception of learning without any built-in hypothesis space is incoherent; second, a learner's hypothesis space is hierarchically organized, and includes stacks of latent and explicit hypothesis spaces (Perfors, 2012); third, in PLoT-based architectures, there is ample latitude for the choice of built-in hypotheses/representations and learning rules (Colombo, 2019). But this choice – though it is often left unconstrained by evolutionary, neurobiological, and psychological evidence – is typically transparent and empirically evaluable, which facilitates clearer understanding of the nativist (or empiricist) character of any given PLoT-based architecture compared to connectionist ones (Colombo, 2018).
Considering these three points, it is easier to appreciate why we should reject the worries that PLoT-based architectures must presuppose an unacceptable amount of innate structure and cannot support genuine learning. If a learner's latent hypothesis space defines the learner's representational capacity – that is, the range and kinds of possible thoughts that the learner can entertain over a lifetime – then some latent hypothesis space defined by abstract primitives (akin to Kantian categories) is built into any PLoT-based architecture. Manipulating these abstract primitives can generate a learner's explicit hypothesis space, which defines the learner's actual thoughts at a given time. Such thoughts can play various causal roles in perception, action, and other cognitive functions, but are not built into the learner: They are generated from the latent hypothesis space. As Perfors (2012, pp. 131–132) helpfully puts it, a latent hypothesis space is like a typewriter with an infinite amount of paper, which can generate certain kinds of documents like Paradise Lost, but not others like La Madonna della Pietà; the set of actual documents that have been typed out and can enter various causal relationships (e.g., Paradise Lost can be read or burnt) is like an explicit hypothesis space. Given this distinction, concept learning consists in an extended, hierarchically organized process of hypothesis generation and hypothesis testing tapping abstract primitives defining the learner's overall representational power.
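The latent/explicit distinction can likewise be made concrete. In the toy Python sketch below – under illustrative assumptions of my own about the primitives and composition rules, not Perfors's actual formalism – no hypothesis space is stored anywhere: only a tiny grammar is built in, the latent space is defined implicitly by whatever that grammar could generate, and the explicit space is just the finite set of expressions actually produced so far.

```python
# Illustrative only: the latent/explicit hypothesis-space distinction
# (Perfors, 2012), on toy assumptions. Nothing is built in below except
# primitives and rules of composition.
import random

random.seed(0)  # for reproducibility of the example

PRIMITIVES = ["red", "round", "small"]  # abstract built-in predicates
CONNECTIVES = ["and", "or"]             # built-in rules of composition

def generate(depth=0, max_depth=2):
    """Sample one hypothesis from the latent space defined by the grammar."""
    if depth >= max_depth or random.random() < 0.5:
        return random.choice(PRIMITIVES)
    connective = random.choice(CONNECTIVES)
    return f"({generate(depth + 1)} {connective} {generate(depth + 1)})"

# The explicit hypothesis space: expressions actually generated so far.
# Unlike the latent space, these can enter causal relations – be stored,
# tested against data, compared, or discarded.
explicit_space = {generate() for _ in range(10)}
print(explicit_space)
```

On this picture, concept learning grows and prunes the explicit set against data, while the learner's overall representational power is fixed by the grammar alone – which is all the innate structure the architecture needs to posit.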
But although this distinction helps us to successfully address a traditional theoretical objection to LoT, showing that PLoT-based architectures can support genuine concept learning without necessarily positing an implausible amount of innate structure, it remains an open empirical question whether PLoT-based architectures enjoy greater explanatory breadth with respect to concept learning than architectures that do not posit any LoT.
Competing interest
None.