Introduction
As infants develop from knowing little about language to comprehending dozens of words during their first year (Frank et al., 2017), they learn aspects of word usage from interactions with caregivers in social contexts (Tamis-LeMonda et al., 2014). To understand this process, previous research has mostly focused on words with relatively concrete referents: object-labeling nouns. However, little is known about the contexts that predict instances of words with less concrete meanings in infant-directed speech (IDS; Maguire et al., 2006). For example, how do infants acquire verbs, which cannot be mapped onto physical objects or often even concrete actions? How might caregiver-infant interactions facilitate this? In particular, what contextual factors covary with verb usage in naturalistic infant-parent interactions?
Verb acquisition
Verb learning appears to be harder than noun learning (e.g., Golinkoff & Hirsh-Pasek, 2008). Infants show evidence of understanding their first nouns around 6.0 to 7.5 months of age (Bortfeld et al., 2005), but do not recognize their first verbs until 11.0 to 13.5 months (e.g., Nazzi et al., 2005). Similarly, although infants typically produce their first words around the end of the first year, first verbs are not produced until months later (Fenson et al., 1994). This gap in acquisition has been argued to reflect differential difficulty in concept learning: children induce object categories more easily than relational or state/change categories (Rattermann & Gentner, 1998). Thus, they may attend to, and learn, regularities corresponding to noun meanings prior to regularities corresponding to verb meanings (Gentner, 2006).
There is indirect evidence, however, that input factors moderate verb as well as noun learning. Differential word usage in language input and events in caregiver-infant interactions might contribute to the early prevalence of nouns vs. verbs (Chan & Nicoladis, 2010). For example, cross-linguistic differences in the prominence of verbs and nouns in IDS might lead to different proportions of nouns and verbs in early vocabularies (Waxman et al., 2013). Mandarin-speaking mothers end sentences with verbs (which facilitates encoding) more often than English-speaking mothers (Tardif et al., 1997). Conversely, when reading picture books, English-speaking mothers of 20-month-olds produced more nouns than Mandarin-speaking mothers (Tardif et al., 1999). Such differences might contribute to the higher proportion of verbs in Mandarin-speaking toddlers’ vocabulary, relative to English-speaking peers (Tardif et al., 1999). Interestingly, English-speaking adults guess verbs from muted videos of caregiver-infant interactions more accurately if the caregiver is Mandarin-speaking than if she is English-speaking (Snedeker et al., 2003), suggesting that non-verbal cues to verb meaning also differ cross-culturally. Evidence therefore suggests that both linguistic and non-linguistic input during caregiver-infant interactions support infants’ verb acquisition. Thus, it is important to document how and when verbs are used in naturalistic caregiver-infant interactions.
Distributional contexts
The distributional contextual patterns of natural speech are far from random (Redington et al., 1998), and evidence suggests that infants are sensitive to these patterns. Words strongly and successively constrain the types and positions of other words in an utterance. For example, “I eat…” is much more likely to be followed by “apples” than by “walk,” “exuberant,” or “tigers.” Such patterns of lexical co-occurrence could help infants infer word meaning and assign words to syntactic categories. Willits et al. (2014) suggested that the more frequent and consistent distributional contexts of nouns compared to verbs might contribute to the noun precedence in infant vocabularies. They also confirmed that 7.5- and 9.5-month-old infants learn distributional statistics of verbs, because infants looked longer in response to verbs appearing in infrequent linguistic contexts.
With respect to lexical co-occurrences, recent studies showed that verbs co-occur non-randomly with pronouns (Babineau et al., 2020; Laakso & Smith, 2007), object nouns (Yuan et al., 2011), and adverbs (Syrett et al., 2014). However, these studies examined a limited number of specific verbs to establish the possible importance of distributional statistics. A broader survey of the degree to which many verbs from varied verb categories co-occur with multiple lexical and non-lexical contextual factors would more clearly show how naturalistic input patterns might support verb learning.
One type of contextual co-occurrence information is pronouns. Pronouns are closed-class markers for subjects or objects, with limited semantic information (e.g., number or gender), that are typically disambiguated by syntactic and pragmatic context. They typically refer to established or “given” constituents (e.g., “I like it”; Messer, 1978). Moreover, although limited in informativeness and concreteness, pronouns are among the most frequent words in English IDS (Laakso & Smith, 2007). Importantly, because pronouns carry both semantic and syntactic information (e.g., “She eats them” vs. “They eat her”), they should have discernible distributional statistics, and these distributions might differ among verbs and verb categories. Thus, we investigated whether pronouns alone could predict verb semantics within naturalistic American-English IDS. This extends previous findings on verb-pronoun co-occurrences in natural language (Babineau et al., 2020; Laakso & Smith, 2007), and their potential to support infants’ verb learning.
Embodied contexts
Caregiver-infant interactions are multimodal and dynamic (Suarez-Rivera et al., 2022). A sequence of actions accompanying speech, like a caregiver showing their infant a toy while talking, then the infant looking at and grabbing the toy, is common in everyday interactions, and might scaffold word learning. For example, infants learn object nouns more readily during joint attention with caregivers (Tomasello & Farrar, 1986), and when the referent is visually dominant (Yu & Smith, 2012). Caregivers’ object naming utterances are also predicted by infants’ and mothers’ gaze and object-directed manual actions, and by shared attentional focus during the interaction (Chang et al., 2016; Custode & Tamis-LeMonda, 2020; West & Iverson, 2017). Similarly, embodied contexts also facilitate verb learning. A recent home-recording study showed that caregivers often used movement verbs when their 13-month-old infants locomoted, and manual verbs when the infants manipulated objects (West et al., 2022). Also, an eye-tracking study showed that infants (15 to 25 months) paid more attention when caregivers produced verbs corresponding with their actions (Liu et al., 2019). This suggests that joint attention to actions (see Deák et al., 2014) might help infants learn verb as well as noun meanings. However, because actions are more transient than objects, action verbs might co-occur less reliably than object-nouns with their respective referents during bouts of shared attention.
In investigating the role of verb-action co-occurrences in verb acquisition, previous studies largely focused on object-related actions and on locomotion (West et al., 2022). Here we consider a wider range of verb categories (e.g., cognition/perception and volition verbs) and examine how caregivers’ use of them co-occurs with object handling, gaze target, and locomotion.
The current study
To understand how verbs are distributionally represented in pronominal and embodied contexts during caregiver-infant play, we video-recorded mother-infant free-play sessions at 12 months of age, and transcribed mothers’ utterances. We annotated mothers’ and infants’ gaze and manual actions, as well as infant locomotion. We classified the most frequent verbs and pronouns in mothers’ speech based on common semantic (e.g., mental vs. action verbs) and syntactic (e.g., transitive vs. intransitive) features of interest in previous research. We analyzed co-occurrences of each verb category with pronoun categories as well as embodied contextual factors (i.e., gaze, hands, locomotion). We tested the strength of co-occurrence frequencies using linear mixed effects models. We hypothesized that different verb categories co-occur with distinct combinations of linguistic and embodied factors.
Specifically, our first prediction was that, like object-naming nouns (Tomasello & Farrar, 1986), object-action verbs may also co-occur with episodes of joint attention and object handling. Second, we predicted that movement and mental verbs would be differentiated by correlated pronominal and embodied variables during play. Previous research indicated that motion verbs tend to precede the word “it”, whereas psychological attitude verbs tend to precede a clause (Laakso & Smith, 2007). Mental verbs are learned later than movement verbs (Bloom et al., 1975; Shatz et al., 1983), so any differential context of usage might be especially crucial for learning the former. Third, we further predicted that among mental verbs, cognition verbs would be differentiated from volition verbs, similar to Laakso and Smith (2007), who found that epistemic verbs (e.g., “think”) were more likely to co-occur with “I”, whereas deontic verbs (e.g., “like”) co-occurred with “you”.
Method
Participants
Forty-two mothers with their infants (20 female) were recruited in San Diego County for a longitudinal study of infant social development (Deák et al., 2013). An experimenter visited the participants’ homes every month while infants were between 3 and 9 months of age, and again at 12 months of age. Upon recruitment, mothers’ mean age was 32.1 years (range = 21-42), with a mean of 16.1 years of formal education (range = 12-21). Twenty-nine infants were White, two were Asian, and five were “other” or multiracial. Four infants were of Hispanic origin. Two parents did not provide information about ethnicity. The current study reports data from the 12-month home session, when infants averaged 371 days old (range: 356-450). This age was chosen based on previous related studies of early verb learning and caregiver verb use (see, e.g., Liu et al., 2019; West et al., 2022).
Procedure
Infants were seated across from their mother in a room of their home where the dyad typically played. Dyads were recorded by two cameras while they played with three sets of infant objects (see Figure 1 for details). Mothers were instructed to “play as they normally would” with their infants for about 15 minutes (M = 14.12 min, SD = 1.73). Intervals when infants were fussy or locomoted outside of view, or when an experimenter was present (e.g., to deliver toy sets), were excluded from coding and analyses. On average, 12.19 min of play per session were coded and analyzed (SD = 2.22).
Coding
Videos were digitized and synchronized for coding. Coders (blind to specific hypotheses) annotated the videos using ELAN (2023) for maternal speech, and Datavyu (Datavyu Team, 2014) for nonverbal actions. All utterances were transcribed, and actions classified, with their start and stop times specified (frame-wise, 10 Hz precision), using coding protocols developed within our lab (available at https://osf.io/bnyhk). Utterances were defined as bouts of meaningful speech separated by pauses longer than 200 ms (e.g., Chang et al., 2016). Different coders independently coded behaviors including: infants’ gaze target (object or mother’s face), infants’ object-touches (defined as any deliberate contact of infant’s hand, arm, or mouth to an object), infant locomotion (by self or mother), and mothers’ manual actions (to infant-visually-attended or -unattended object; see Table 1). To assess reliability, a second coder independently annotated 20% of files (randomly selected). Cohen’s kappas (Cohen, 1968) were .76 for infant gaze, .81 for infant touches, .81 for infant locomotion, and .88 for mother manual actions.
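The pause-based utterance definition can be illustrated with a short sketch. This is only an illustration of the stated 200 ms rule, not the lab's coding protocol: the function name, the millisecond integer timestamps, and the toy speech bouts are assumptions introduced here.

```python
PAUSE_THRESHOLD_MS = 200  # pauses longer than this split utterances (from the text)

def segment_utterances(speech_bouts, pause_threshold=PAUSE_THRESHOLD_MS):
    """Merge (start_ms, end_ms) speech bouts into utterances.

    Consecutive bouts separated by a pause of at most `pause_threshold`
    are joined; a longer pause starts a new utterance.
    """
    utterances = []
    for start, end in sorted(speech_bouts):
        if utterances and start - utterances[-1][1] <= pause_threshold:
            # Short pause: extend the current utterance.
            utterances[-1][1] = max(utterances[-1][1], end)
        else:
            # First bout, or a pause longer than the threshold.
            utterances.append([start, end])
    return [tuple(u) for u in utterances]

# Hypothetical bouts: a 100 ms pause (merged) and a 500 ms pause (split).
bouts = [(0, 800), (900, 1500), (2000, 2600)]
print(segment_utterances(bouts))
```

A purely temporal rule like this is simple to apply consistently, but it can split or join speech in ways that do not match grammatical units.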
Note. Mothers and infants’ behaviors were coded with Datavyu (http://datavyu.org), a free open-source software application.
We selected the 67 most frequent verbs and 17 most frequent pronouns from the dataset. These occurred at least eight times total and were used by at least five mothers. Frequencies of specific verbs and pronouns are shown in Supplementary Tables S1 and S2. Verbs were classified semantically as Movement (e.g., swim, go; Laakso & Smith, 2007; West et al., 2022), Object-action (e.g., squeeze, hold; West et al., 2022), Cognition/Perception (e.g., think, see; Davis & Landau, 2021; Laakso & Smith, 2007), or Volition (e.g., want, like; Laakso & Smith, 2007) verbs. In addition, verbs were tagged for the syntactic categories Transitive (e.g., want, eat; Kline et al., 2017), Intransitive (e.g., swim, look) and Auxiliary (e.g., do, can; Tincoff et al., 2000). Pronouns were categorized as First person (e.g., I, me), Second person (e.g., you, your), Third person (e.g., she, his), or Deictic (e.g., that, this; Strauss, 2002). Verbs could belong to multiple categories, whereas pronouns only belonged to one.
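The two frequency criteria (at least eight tokens overall, produced by at least five mothers) amount to a simple filter over the transcripts. The sketch below assumes a hypothetical input format of (mother_id, word) pairs and uses invented toy data; it is not the authors' selection script.

```python
from collections import defaultdict

MIN_TOKENS = 8   # "at least eight times total" (from the text)
MIN_MOTHERS = 5  # "used by at least five mothers" (from the text)

def select_frequent_words(tokens, min_tokens=MIN_TOKENS, min_mothers=MIN_MOTHERS):
    """Return words meeting both criteria, sorted alphabetically.

    tokens: iterable of (mother_id, word) pairs, one per word instance.
    """
    counts = defaultdict(int)    # total occurrences per word
    speakers = defaultdict(set)  # distinct mothers who used each word
    for mother_id, word in tokens:
        counts[word] += 1
        speakers[word].add(mother_id)
    return sorted(
        word for word in counts
        if counts[word] >= min_tokens and len(speakers[word]) >= min_mothers
    )

# Toy data: "go" passes both criteria; "spin" is frequent but used by one
# mother; "see" is used by five mothers but occurs only five times.
toy_tokens = (
    [(m, "go") for m in range(5) for _ in range(2)]
    + [(0, "spin")] * 9
    + [(m, "see") for m in range(5)]
)
print(select_frequent_words(toy_tokens))
```

Requiring a minimum number of speakers as well as a minimum token count keeps a single talkative mother from determining which words enter the analysis.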
Data analysis
To test whether verb categories were differentiated by linguistic and embodied contexts in mothers’ naturalistic IDS, we constructed a binomial Generalized Linear Mixed Model (GLMM) for each verb category (glmer function in R; Bates et al., 2015). Data were separated by utterance, with binary columns for each verb category and for each contextual variable. Co-occurrence with a pronominal context was defined as the verb and pronoun occurring in the same utterance. Co-occurrence with any embodied context was defined as the verb utterance overlapping temporally with the embodied context.
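The utterance-level coding described above can be sketched as follows. The dictionary layout, column prefixes, event labels, and example utterance are all hypothetical; only the two co-occurrence definitions (same utterance for pronouns, temporal overlap for embodied contexts) come from the text.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two time intervals share any stretch of time."""
    return a_start < b_end and b_start < a_end

def code_utterance(utterance, events, verb_categories, pronoun_categories):
    """Return the binary indicators that equal 1 for one utterance.

    utterance: dict with 'start', 'end' (ms) and 'words' (list of str)
    events: list of (label, start_ms, end_ms) embodied events
    verb_categories: dict mapping verb -> set of categories
    pronoun_categories: dict mapping pronoun -> single category
    Columns absent from the returned dict are treated as 0.
    """
    row = {}
    for word in utterance["words"]:
        # A verb may belong to several categories, a pronoun to only one.
        for category in verb_categories.get(word, ()):
            row["verb_" + category] = 1
        if word in pronoun_categories:
            row["pron_" + pronoun_categories[word]] = 1
    for label, start, end in events:
        # Embodied context counts if it overlaps the utterance in time.
        if overlaps(utterance["start"], utterance["end"], start, end):
            row["ctx_" + label] = 1
    return row

utterance = {"start": 1000, "end": 1800, "words": ["you", "want", "that"]}
events = [("infant_gaze_object", 500, 1200), ("infant_locomotion", 2000, 2500)]
verbs = {"want": {"Volition", "Transitive"}}
pronouns = {"you": "second_person", "that": "deictic"}
print(code_utterance(utterance, events, verbs, pronouns))
```

Rows of this kind, one per utterance, are the shape of data a binomial mixed model would take as input, with the verb-category column as the outcome.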
Each verb category was entered as the outcome variable, and pronominal and embodied contextual variables were entered as predictors. Dyad was included as a random effect. Predictors were considered significant if p < 0.05. We calculated predictor effect sizes as Odds Ratios (OR), which are independent of variable base rates (Spitznagel & Helzer, 1985). A predictor more likely than chance to co-occur with a verb type has OR > 1.0; a predictor less likely than chance to co-occur has OR < 1.0.
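To make the odds-ratio scale concrete, the sketch below computes an unadjusted OR from a 2x2 table of utterance counts and converts a logistic-regression coefficient to an OR. The counts are invented, and this simple calculation ignores the other predictors and the random effect of dyad, so it would not reproduce the model-based ORs reported in this paper.

```python
import math

def odds_ratio(both, verb_only, context_only, neither):
    """Unadjusted odds ratio from a 2x2 table of utterance counts.

    both:         utterances with the verb type and the context
    verb_only:    utterances with the verb type, without the context
    context_only: utterances with the context, without the verb type
    neither:      utterances with neither
    Raises ZeroDivisionError if verb_only or context_only is 0.
    """
    return (both * neither) / (verb_only * context_only)

def odds_ratio_from_coefficient(beta):
    """In a logistic model, a coefficient on the log-odds scale maps to OR = exp(beta)."""
    return math.exp(beta)

# Hypothetical counts: odds of the verb type are 20/100 with the context
# and 80/800 without it, so the odds ratio is 2.0 (above chance).
print(odds_ratio(both=20, verb_only=80, context_only=100, neither=800))
```

An OR of 1.0 corresponds to a coefficient of 0, which is why values above and below 1.0 are read as above-chance and below-chance co-occurrence.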
Results
Mothers on average produced 17.06 utterances/min (SD = 7.94), each containing a mean of 3.23 words (SD = 1.63). Mothers held objects on average 10.90 times/min (SD = 5.03). Infants on average locomoted 0.50 times (SD = 0.38), were moved by mothers 0.62 times (SD = 0.41), produced 13.10 gaze fixations to an object or mother’s face (SD = 4.64), and touched objects 7.56 times (SD = 3.48), all per min. There were no significant gender differences in any of these rates (two-tailed, all ps > 0.05).
GLMMs revealed that verbs were differentiated by their linguistic and embodied contexts. As we predicted, co-occurrence patterns differentiated Object-action verb use (Figure 2; Table 2), differentiated Movement vs. Cognition/Perception verbs (Figure 3; Table 3), and differentiated Cognition/Perception vs. Volition-based mental verbs (Figure 4; Table 4). Our analyses focused on the three predictions above. However, the full model of every verb category in relation to all contextual factors is provided in Table S4.
Note. Numbers reported are odds ratios indicating how many times more or less likely than chance for a verb type and a context to co-occur. Significance levels were obtained from binomial linear mixed effects models. * p < 0.05, ** p < 0.01, *** p < 0.001. Examples of utterances for each significant co-occurrence pattern are provided. The bolded verb was the most frequent Object-action verb for each co-occurrence scenario.
Note. Numbers reported are odds ratios indicating how many times more or less likely than chance for a verb type and a context to co-occur. Significance levels were obtained from binomial linear mixed effects models. * p < 0.05, ** p < 0.01, *** p < 0.001. Examples of utterances for each significant co-occurrence pattern are provided. The bolded verb was the most frequent Movement or Cognition/Perception verb for each co-occurrence scenario.
Note. Numbers reported are odds ratios indicating how many times more or less likely than chance for a verb type and a context to co-occur. Significance levels were obtained from binomial linear mixed effects models. * p < 0.05, ** p < 0.01, *** p < 0.001. Examples of utterances for each significant co-occurrence pattern are provided. The bolded verb was the most frequent Volition or Cognition/Perception verb under each co-occurrence scenario.
Co-occurrence of object-action verbs with joint attention
Mothers used Object-action verbs more often than chance when focusing on the same object as their infants (i.e., joint-attention; Table 2, Figure 2). Object-action verbs also co-occurred above chance when infants touched one object (OR = 1.23, p < 0.01) or gazed at one object (OR = 1.26, p < 0.05). However, they occurred below chance when mothers moved the infant (OR = 0.68, p < 0.05). Using object-action verbs during joint-attention to objects might be optimal input for infants to learn associations between actions and related verbs (e.g., rolling a ball and “roll”).
Differentiation of movement and cognition/perception verbs
Movement and Cognition/Perception verbs co-occurred with distinctly different embodied and linguistic contexts (Table 3, Figure 3). Regarding embodied factors, Movement verbs co-occurred above chance with infant crawling (OR = 3.49, p < 0.001), infant walking (OR = 2.28, p < 0.05), and mother moving the infant (OR = 3.44, p < 0.001), whereas Cognition/Perception verbs co-occurred near chance levels with these contexts. Regarding linguistic context, Movement verbs co-occurred above chance with 3rd-person pronouns (OR = 2.20, p < 0.001) and near chance with deictic pronouns, whereas Cognition/Perception verbs co-occurred above chance with deictic pronouns (OR = 1.98, p < 0.001), and near chance with 3rd-person pronouns.
Differentiation of cognition/perception and volition verbs
Within the class of mental verbs, Volition and Cognition/Perception verbs shared some contextual distribution patterns but differed in others (Table 4, Figure 4). Both verb types co-occurred above chance with 1st-person (Volition: OR = 1.44, p < 0.05; Cognition/Perception: OR = 2.04, p < 0.001), 2nd-person (Volition: OR = 6.27, p < 0.001; Cognition/Perception: OR = 1.29, p < 0.01), and deictic (Volition: OR = 2.42, p < 0.001; Cognition/Perception: OR = 2.04, p < 0.001) pronouns. However, Volition verbs co-occurred above chance with 3rd-person pronouns (OR = 1.55, p < 0.001) and near chance level with infant gazing at an object. In contrast, Cognition/Perception verbs co-occurred near chance with 3rd-person pronouns, and below chance with infant gazing at an object (OR = 0.81, p < 0.05). Note that Volition verbs were more likely to co-occur with 2nd- than 1st-person pronouns, whereas Cognition/Perception verbs showed the opposite pattern. Laakso and Smith (2007) reported similar pronominal co-occurrence differences between Volition and Cognition/Perception verbs.
In addition to analyzing the co-occurrences between verb types and various contextual variables, we clustered the verbs hierarchically (hclust function in R; R Core Team, 2021); the resulting dendrogram is shown in Figure 5. The input to hclust was the probability of each verb co-occurring with each of 25 contextual factors (17 pronouns and 8 embodied variables). This visualization shows how the similarity of verbs’ contexts predicts their shared semantic or syntactic categories.
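The clustering step can be illustrated with a small agglomerative procedure over verb-by-context probability profiles. The text does not state the distance measure or linkage method, so this sketch assumes Euclidean distance and complete linkage (the defaults of R's dist and hclust); the verbs' two-dimensional profiles are invented, whereas the study used 25 contextual factors.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping verb -> list of co-occurrence probabilities
    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of verbs. Ties go to the first pair found.
    """
    clusters = [(verb,) for verb in sorted(profiles)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical profiles: P(co-occurs with locomotion), P(co-occurs with deictic pronoun)
toy_profiles = {
    "go": [0.6, 0.1],
    "come": [0.5, 0.1],
    "see": [0.1, 0.6],
    "think": [0.1, 0.8],
}
for a, b, d in complete_linkage(toy_profiles):
    print(a, "+", b, "at distance", round(d, 2))
```

In this toy example the two movement-like verbs merge first and the two mental-like verbs second, which is the kind of grouping a dendrogram makes visible.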
Discussion
Infants acquire word knowledge as observers and participants in social interactions. Historically, investigations of infants’ word learning have largely focused on object nouns, although researchers have argued that verbs are harder for infants and children to learn (e.g., Golinkoff & Hirsh-Pasek, 2008). Little research has examined how infants eventually learn verbs from naturalistic interactions, and in particular what verbal and nonverbal information is available for infants to disambiguate verb usage and meanings. The current study provides new evidence that pronouns, infant manual actions, gaze, and locomotion, as well as parent object actions, differ across verb types.
Object-action verbs
Infants prefer looking at caregivers manipulating objects more than caregivers’ faces or isolated objects (Deák et al., 2014). Additionally, caregiver/infant object attention-sharing is positively correlated with caregivers’ utterance rates and infants’ object-noun vocabularies (Tomasello & Farrar, 1986). During free-play, caregivers tend to use object-action verbs when either they or their infants manipulate objects (West et al., 2022). Moreover, infants attend to caregivers’ actions around times when the caregiver uses an action verb (Liu et al., 2019). Our results complement these findings: mothers produced object-action verbs (e.g., spin, turn) while they manipulated an object and infants watched, or when infants touched or looked at an object.
Like object-naming nouns, then, object-action verbs were frequently used when caregivers and infants jointly attended to an object. However, despite this contextual similarity, object-action verbs are learned and produced by English-learning infants later than object nouns (Fenson et al., 1994). Can usage co-occurrence statistics explain this gap in learning object-related verbs? One possible explanation is that caregivers more regularly produce object nouns than action verbs while the infant is watching the action (or, relatedly, that verb forms are more variable than object labels during actions). Another possibility is that actions are more transient, so infants are less likely to attend to both a particular action and its co-occurring action verb (Liu et al., 2019). To test these hypotheses, finer-grained, higher-power future studies should explore verb-noun, verb-action, and noun-action co-occurrences within naturalistic infant-caregiver interactions, ideally across diverse languages and cultures. Such studies could reveal generalized distributional patterns that support infants’ verb learning.
Mental versus movement verbs
Mental verbs are learned late. English-learning children produce and understand some action or movement verbs in the second year (Bloom et al., 1975), but do not regularly produce or comprehend multiple mental verbs until the third year (Shatz et al., 1983). However, children’s production of mental verbs can be facilitated by syntactic and observational cues in a scene description task (Papafragou et al., 2007). Thus, it is natural to wonder how mental verbs are used by caregivers, and with what contextual cues. Our results indicate that some mental verbs are used frequently by English-speaking caregivers of 12-month-olds, and co-occur with specific lexical and nonverbal contextual elements. For example, cognition/perception verbs were used significantly more when caregivers handled the object infants gazed at, and used deictic pronouns. Cognition/Perception verbs also co-occurred with first person pronouns, whereas movement verbs co-occurred with second and third person pronouns, replicating and expanding findings from Laakso and Smith (2007). Movement verbs were used often when infants locomoted or when caregivers moved infants (replicating West et al., 2022), whereas cognition/perception verbs were used near chance frequency during those times. Co-occurrences between movement verbs and infant locomotion might build infants’ semantic associations between specific verbs and actions. In addition, caregivers often used the same movement verbs with first, second or third person pronouns in different utterances, to describe the movement of the dyad (e.g., “Let’s go and check that out!”), the infant (e.g., “where are you going mister?”) or objects (e.g., “where did the boat go?”).
These cross-situational usages of the verb “go” might bootstrap infant learning of how “go” references the movements of different entities. Plausibly, movement verbs were deliberately chosen by mothers to narrate salient events when the infant locomoted or was relocated by the mother. By contrast, mothers used mental verbs to comment on infants’ mental states while they were looking at and handling objects. However, these events also co-occurred with other types of verbs, notably object-action verbs. This non-specificity of verbs to context, and of context to verbs, might partly explain toddlers’ late acquisition of mental verbs (Bloom et al., 1975; Shatz et al., 1983).
Cognition/perception versus volition verbs
Toddlers’ acquisition of mental verbs might be facilitated by co-occurrence statistics that differ among semantic sub-types. Although both cognition/perception and volition verbs co-occurred weakly (i.e., small effect) with 1st-person pronouns, the co-occurrence with 2nd-person pronouns was much greater (i.e., medium effect) for volition than cognition/perception verbs. These effect size differences could hypothetically contribute to infants’ differentiation of pronouns that describe their own and others’ mental states. For example, when infants indicate a desire for one specific toy, they might hear caregivers say, “you like that one?” Their affective state, and prior learning that a caregiver can satisfy their desire for objects, might increase their attentiveness to the caregiver’s utterance. Such co-occurrences, if regular, could bootstrap infants’ mapping of the words “you” and “like” to their own volitional states. Comparatively, caregiver use of cognition verbs to describe their own mental activities (e.g., “I think that’s good”) might be less salient to infants, due to a less intense and/or focused co-occurring affective state. This would predict that infants comprehend volition verbs earlier than cognition verbs, because volition verbs are more often associated with their own experience of salient emotional states. Future studies should compare the age of acquisition of volition and cognition verbs, relative to pronoun context as well as social-emotional context, and investigate whether these co-occurrences might better explain acquisition differences.
Summary
This study examined how different verb categories co-occurred with distributional and embodied contexts during caregiver-infant free-play. We found that caregiver linguistic input and play contexts provided statistical regularities that could facilitate infant verb acquisition. However, several limitations should be addressed by future studies. First, we segmented caregiver utterances with a temporal cut-off, which might have split or merged units that infants would segment differently, and thus altered the complexity of utterance content. Future studies could consider additional utterance boundary cues like terminal pitch contours or grammatical units. Second, our linguistic context only considered pronouns, but nouns and adverbs also co-occur non-randomly with verbs. We focused on pronouns because they are frequent, and their co-occurrence patterns have seldom been examined as a distributional cue for infants. Future studies with larger datasets should investigate co-occurrences between verbs and nouns, adverbs, and other closed-class elements as well as pronouns. Lastly, our data sampled only one language and subculture, and the results are therefore limited in generalizability.
The present study is among several that document the co-occurrence patterns of both language and embodied contextual variables across a range of verb categories within naturalistic caregiver-infant interactions. The results suggest that infant acquisition of verbs could be supported by learning mechanisms and environment statistics parallel to those that support acquisition of object nouns – that is, a capacity to learn contextual regularities, rather than a specific “verb-learning module”. Our results suggest potential explanations for the later acquisition of verbs, and specifically mental verbs: notably, contextual factors co-occurring with mental verbs were least specific. The approach exemplified here should be applied to datasets from diverse populations, for comparisons that will broaden our understanding of how infant-caregiver contextual statistics support verb learning.
Supplementary material
The supplementary material for this article can be found at http://doi.org/10.1017/S0305000923000636.
Acknowledgements
This research was supported by grants from the National Science Foundation (SES-0527756) and from the UC - San Diego Academic Senate. We thank student members of the Cognitive Development Lab for their assistance with data collection and coding, and we thank the families who participated in this research.
Competing interest
The authors declare none.