
When Dogs Talk: Technologically Mediated Human-Dog Interactions as Semiotic Assemblages

Published online by Cambridge University Press:  01 January 2025

Miriam Lind*
Affiliation:
European University Viadrina, Germany
*
Contact Miriam Lind at Europa-Universität Viadrina, Große Scharrnstraße 59, 15230 Frankfurt (Oder), Germany (lind@europa-uni.de).

Abstract

Pets using “talking buttons” to ostensibly tell their owners about their thoughts and needs have become a huge success on social media. With buttons that play a prerecorded message upon activation, these devices are marketed as tools for teaching human language to animals in order to allow them to “speak their minds.” This article investigates these practices of technologically mediated human-dog interaction through the analysis of social media videos and examines the claim that these button-based interactions illustrate animals’ language acquisition. It concludes that “talking buttons” in human-dog communication should rather be understood as semiotic assemblages in which meaning is collaboratively constructed through the dynamic, situated interaction of bodies, linguistic resources, objects, and touch.

Type
Articles
Copyright
Copyright © 2024 Semiosis Research Center at Hankuk University of Foreign Studies. All rights reserved.

The question of whether animals “have” language is not new—ethology, linguistics, and philosophy have been trying to provide a definitive answer for more than a century. The interest in animal language is not limited to whether there are language-like communication systems within specific animal species but stretches further to inquiries into animals’ ability to understand and potentially even produce human language. This interest is furthered by contemporary technological advancements that offer new tools for language-based human-animal communication, which play an important role in the emerging field of animal-computer interaction. One such technology will be at the center of this article: so-called talking buttons, large plastic buttons that play a prerecorded message when pressed. These buttons are based on augmentative and alternative communication devices used in speech therapy for children and people with speech-related disabilities and have become highly popular for pet-human communication, particularly on social media.

Looking at the development of these devices within the larger frame of dog-human interaction alongside social media videos of their actual use in interspecies households, this article investigates technologically mediated human-dog interactions and asks whether they should be understood as language-based communication, as the marketing of these buttons suggests, or whether different approaches to and interpretations of these interactions might be more accurate. Drawing on scholarship in linguistics, language philosophy, and animal ethics, I consider the implications of these talking buttons for our concepts of animals, of language, and of interspecies interaction. I will suggest that instead of attempting to understand dogs’ use of these buttons as them (humanly) speaking their minds, these forms of communication should rather be considered as semiotic assemblages (Pennycook 2017) in which bodies, language, and objects come together to create meaning in interaction.

Before introducing my empirical data, I will situate technologically mediated dog-human communication in two contexts: first, I will consider perspectives from linguistics, philosophy of language, and human-animal ethics on the question of language in the distinction between humans and other animals. Second, a brief history of “talking” animals and of human-dog relationships will be given in order to provide a better understanding of the cultural history that informs contemporary human-dog interaction. This will then serve as the basis for discussing the development of talking buttons in human-dog interactions and for analyzing social media videos of dog-human interactions that use talking buttons. It will be argued that these interactions are best understood as practices of multimodal collaborative semiosis rather than as an exercise in language acquisition, for which I will draw on the theoretical concept of assemblages (Deleuze and Guattari [1987] 2005) and Pennycook’s (2017) suggestion to understand the situatedness of multisensory and multimodal interaction as semiotic assemblages.

Humans, Animals, and the Question of Language

The question of whether animals have language and whether there can be meaningful communication between humans and animals has been widely discussed in linguistics, language philosophy, and human-animal studies, particularly from an animal ethics point of view (see, e.g., Cate and Okanoya 2012; de Waal 2016; Kulick 2017).

From a traditional linguistic perspective, a variety of criteria have been used to distinguish human language from animal communication: in his book The Language Instinct, Pinker (1995, 347) lists “reference, use of symbols displaced in time and space from their referents, creativity, categorical speech perception, consistent ordering, hierarchical structure, infinity, recursion” as distinct features of human language that are not present in animal communication. Recursion in particular has consistently been singled out as the central feature that distinguishes human language from other forms of communication (e.g., Hauser et al. [2002]; see further the exchange on the evolution of the language faculty between Jackendoff and Pinker [2005] and Fitch et al. [2005]). While claims that symbolism and creativity are distinctly human have been challenged by studies on animal communication (e.g., Savage-Rumbaugh et al. [1978] on symbolic communication between chimpanzees), the claim to human uniqueness when it comes to the syntactic features of language, that is, recursion, hierarchical structures, and consistent ordering, continues to be upheld (e.g., Corballis 2007). In this attempt to distinguish human language from animal communication, the multimodality and situatedness through which human language-based interaction gains its meaning are negated for the sake of human exceptionalism; the “multimodal turn” in linguistics (e.g., Jewitt 2009) aims to correct this fixation on language and speech as a context-independent abstract system.

From the perspective of language philosophy, it is frequently Wittgenstein who is taken as a starting point in discussions of animals’ capacity for language. In his posthumously published Philosophical Investigations, Wittgenstein states that “to imagine a language means to imagine a form of life” ([1958] 1986, 8). Wolfe (2003) reads this quote together with another thought from Wittgenstein’s Philosophical Investigations—“If a lion could talk we could not understand him” (Wittgenstein [1958] 1986, 225)—and wonders “what it can mean to imagine a language we cannot understand, spoken by a being who cannot speak” (Wolfe 2003, 1). Following Wolfe, it can then be asked what it means for our understanding of language, of animals, and of being if humans are incapable of understanding animal language (if this were to exist). Does the human attempt to make animals participate in human language inevitably mean imagining them as human and thus anthropomorphizing them? This question is discussed at length by Kari Weil in her book Thinking Animals (2012). She is interested in the relationship between language and the self and in the (alleged) contrast between “real” language learning and “mere” imitation, asking what it is that we do when we try to teach animals human language. Referring to studies performed throughout the second half of the twentieth century that tried to teach sign language to primates, Weil asks whether “language [will] enable them to speak of their animal lives or simply bring them to mimic (or ape) human values and viewpoints? Indeed, if they learn our language, will they still be animals?” (2012, 6).
She links these questions to Spivak’s seminal essay “Can the Subaltern Speak?” (1988), asking what kind of language animals would speak if humans taught them to do so, and what it is that they could ever say (Weil 2012, 5). Using Kafka’s “A Report to an Academy” as an example, Weil suggests that the assimilation process that teaching human language to animals inherently entails “gives voice only by destroying the self that would speak” (6). A speaking animal would thus cease to be animal, and the human attempt to come closer to and gain knowledge of the animal self through language would only ever render this very self impenetrable by destroying it (9).

In this article, I address Weil’s question of “Must animals mean what humans say?” (2012, 7) by investigating technologically mediated dog-human communication. My central argument will be that the focus on humans’ and dogs’ use of talking buttons as a form of language-based expression is a hindrance to understanding what animals mean, and that the use of these buttons in human-dog interactions should rather be acknowledged as a form of collaborative semiosis in which the multimodality of embodied communication is central to recognizing it as meaningful interaction. This perspective is further informed by literature that consistently finds that affective communication, in which the relationship between interactants is negotiated, primarily takes place nonverbally, both between humans and between animals. Bateson (1972) prominently concludes that mammalian nonlinguistic communication is fundamentally communication about the relationship. This point is reinforced by Haraway (2008, 26): “An embodied communication is more like a dance than a word. The flow of entangled meaningful bodies in time … is communication about relationship, the relationship itself, and the means of reshaping relationship and so its enacters.” The practical use of talking buttons in human-dog interactions will thus be understood as an embodied practice of relating and sense-making in which questions of animals’ capacity to learn and use human language are secondary. Before the empirical material is presented, the next section provides a brief introduction to the idea of “talking” animals as well as a short history of human-dog relations in order to situate contemporary relationships between humans and their “best friend” in their sociohistorical context.

A Brief History of “Talking” Animals

One of the most famous cases of a “talking” animal was undoubtedly the so-called Clever Hans, a horse in early twentieth-century Germany who became famous for being able to answer simple mathematical tasks. By stamping his hooves the correct number of times, Hans could ostensibly count and solve additions, much to the appreciation of his growing audiences. In controlled tests, however, where the horse was spatially separated from the person asking the questions so that Hans could not see their body language, he failed to accomplish the tasks he was previously claimed to have mastered with ease (Wilson 2021; Rafferty 2023). Instead of having mathematical and linguistic skills that allowed Hans to understand questions and solve them correctly, it was in fact his social and communicative intelligence that let him recognize and react to the unconscious cues of the questioner, correctly ascertaining when to stop stamping. Instead of being remembered as an example of the impressive observational and social capacities of horses, however, the story of Clever Hans is most often seen as an illustration of animals’ failure to demonstrate human-like intellectual ability.

Another famous example of humans’ attempts to establish language in human-animal relationships is the case of Koko, a western lowland gorilla, who in the 1970s was taught a modified version of American Sign Language by animal psychologist Francine Patterson. Koko was claimed to have acquired a vocabulary of more than 1,000 signs along with a passive understanding of roughly 2,000 English words, in addition to being able to use syntax, express emotions, and display self-awareness. Already at the time, doubts were raised concerning Koko’s actual linguistic capacities, and many critics suggested that it was Patterson’s close relationship to Koko that allowed her to interpret Koko’s signs as meaningful utterances, rather than the gorilla’s actual intelligence and linguistic capacity (Wilson 2021; Rafferty 2023). Internationally prominent linguists and cognitive scientists such as Noam Chomsky and Steven Pinker have deemed primates’ demonstrations of linguistic skills a complex form of mimicry of their trainers’ signing rather than “actual language,” as they allegedly fail to exhibit the features discussed above that make human language “human language” (e.g., Pinker 1995). Similarly, Terrace and colleagues concluded as early as the late 1970s that while apes, dogs, horses, and other animal species can learn isolated symbols, “they show no unequivocal evidence of mastering the conversational, semantic, or syntactic organization of language” (Terrace et al. 1979, 901).

The renowned “talking parrot” Alex, an African grey parrot with whom scientist Irene Pepperberg worked for 30 years, received similar criticism. Even though parrots, and particularly grey parrots, are known to be highly intelligent birds that can show an impressive talent for mimicking the sound of the human voice, their ability to actually acquire language is disputed. Pepperberg showed in multiple studies that Alex had learned a variety of categories such as shape and color (Pepperberg 1987) and numbers (Pepperberg 1994) and had acquired a spoken vocabulary of several dozen words (Pepperberg 1981). While she discusses Alex’s cognitive abilities at length, Pepperberg avoids the question of whether his sound-based articulation is to be considered language, speaking instead of “functional vocalizations” and “verbal communication” (1981). The case of Alex the grey parrot further fueled debates on animals’ capacity to acquire human language, particularly because parrots do not use tools or signs to communicate a message but can vocally communicate with clearly distinguishable words. The question of whether animals’ learning of human language is actual language acquisition or “mere” imitation leads Weil to wonder “how recognition and response (or intention) are ever clearly distinct from imitation. When it comes to language, are not all of us dependent on a field of signification that precedes us, making it difficult to say that language itself is ever not imitative?” (2012, 9).
Whereas Weil focuses here on the imitative element of language, de Waal (2016) instead considers symbolization and flexibility as central to the distinction between human language use and animal communication, seeing humans as “the only linguistic species,” not because animals lack the capacity to communicate inner processes or to coordinate actions and plans but because their modes of communication are “neither symbolized nor endlessly flexible like language” (106).

In his literature review on human-animal communication, Kulick (2017) points out that the interest in communication between human and nonhuman animals stretches far beyond these primarily scientifically oriented studies: animal communicators, dog trainers, and many others are concerned with the diverse interactions between humans and animals and through this provide a larger sociocultural frame in which we understand human-animal communication. In addition, a large array of cultural productions, from books to films, equips us with templates for understanding the interactions of humans and animals—most often pets—as communication based in human language and as translation between animal and human languages (Kulick 2021). Just as science fiction imaginings of talking machines have paved the way for human interaction with voice user interfaces (Hoy 2018), books and films such as Dr. Dolittle have provided us with imaginaries of translating animal communication into human language for decades.

Man’s Best Friend—Dog-Human Relationships

While language in interspecies interactions has been studied and discussed with respect to human relationships with a variety of species, there is undoubtedly no other species with whom humans have interacted and communicated more than with dogs. The social bond between humans and dogs has existed for at least 15,000 years and is largely understood as a form of coevolution (see, e.g., Chambers et al. 2020). The oldest evidence of a social relationship between humans and dogs that went beyond mere utility as a working animal can be found in the so-called Bonn-Oberkassel dog: the remains of a dog were found buried together with two humans, and its skeleton shows that it must have been cared for during a severe illness when it was 19–23 weeks old. Before and during this time of illness, the dog was of no functional use to humans, which has led researchers to conclude that it must have been kept alive for its emotional value, thus proving a social bond between the dog and its human owners (Janssens et al. 2018).

While dog training dates back to antiquity, the beginning of contemporary dog training is attributed to Konrad Most, whose Die Abrichtung des Hundes (The training of dogs) in the early twentieth century provided the first how-to manual for training dogs by means of forced obedience through punishment (Pręgowski 2015). In this training guide, as well as in the broader public understanding of human-dog interactions at the time, the relationship was understood as that of obedient servant (the dog) and uncontested master (the human; Pręgowski 2015). Together with a changing human-dog relationship throughout the twentieth century and an increasing consideration of ethical questions in human-animal interactions, new understandings of learning and teaching methods have led to drastic changes in dog training, away from punitive methods and toward reward-based training and mutual understanding (Greenebaum 2010; Pręgowski 2015). Studies on different approaches to dog training have also shown that positive reinforcement leads to better results in dog-owner relationships and dog behavior than punitive methods do (Rooney and Cowan 2011). In contemporary approaches to dog training, humans and dogs are often considered equals, and mutual respect and understanding are the central goals. In line with this, it is assumed that it is mostly the human who should adapt to the dog’s communicative behavior; that is, the human is trained just as much as the dog (Greenebaum 2010). This perspective, in which training is a mutual process of learning and adapting to each other’s mode of communication, also resonates with Weil’s sentiment that “training cannot give me your world or give you mine—although it may allow us to find a place of intersection between our worlds” (2012, 11).
This “place of intersection” will be central in the analysis of technologically mediated dog-human communication.

The changed approach to training dogs reflects the overall change in human-dog relationships, in which dogs are increasingly seen as family members and quasi-human companions who deserve the same love and care as the human members of a family (Irvine and Cilia 2017; Owens and Grauerholz 2018). In the following, it will be shown that the inclusion of pets in human families and their understanding as family members goes hand in hand with attempts to give them human language. It must, however, be asked whether teaching dogs to use buttons in order to produce utterances in human language contradicts approaches in modern dog training that focus on humans learning to understand canine modes of communication rather than making dogs adapt to human communication.

From Augmentative and Alternative Communication Devices to Dog Buttons

The adaptation of so-called augmentative and alternative communication (AAC) devices to dog-human communication was first experimented with by speech-language pathologist Christina Hunger when she and her partner got a puppy, an experience she describes in How Stella Learned to Talk (Hunger 2021). The book details Hunger’s professional background in speech therapy, which led her to recognize “the glaring similarities between dog and human communication skills” (236), primarily with respect to the prelinguistic capabilities of infants as compared to those of canines. Augmentative and alternative communication methods used for people with communication impairments include “signing, use of symbols and voice output devices” (Baxter et al. 2012, 115). Technology for AAC is advancing rapidly and offers both high- and low-technology speech-generating solutions: low-technology options include single-message or static multimessage devices that provide a prerecorded spoken output when activated (usually by a push); high-technology devices usually involve more complex computer software tools (Baxter et al. 2012). Common low-technology AACs are voice-output switches such as BIGmack or Step-by-Step—large plastic buttons that house a recording device and speaker so that users can prerecord a message that is then played when the button is pushed. Hunger adapted this AAC technology for use with her puppy Stella, first using a single button that played the word outside, training Stella to use it to communicate when she needed to go out to relieve herself.Footnote 1 From there, more and more buttons were added to increase Stella’s “vocabulary.” Based on the success of Stella’s button use, Hunger started her own business, Hunger for Words, which sells these talking buttons, floor mats on which the buttons can be installed to keep them in place, and more.
The company’s website greets the reader with the statement “Dedicated to the belief that everyone has something to say” next to a picture of Hunger and her dog Stella, illustrating that “everyone” is by no means a human-exclusive pronoun but that dogs are someones too.Footnote 2 A second company that has started to sell talking buttons for pets is California-based FluentPet, whose buttons have become particularly popular due to Alexis Devine’s social media videos showing her sheepadoodle Bunny using FluentPet’s buttons. The videos of Bunny have gained tremendous attention on social media, with 8.3 million followers on TikTok and 1.3 million on Instagram as of late 2023.Footnote 3 FluentPet collaborates with the Comparative Cognition Lab at the University of California San Diego, where the project TheyCanTalk seeks to determine “whether, and if so to what degree, non-humans are able to express themselves in language-like ways.”Footnote 4 FluentPet sells its buttons in sets installed on hexagonal tiles (“HexTiles”) that can be attached to one another, with each tile having space for six buttons. FluentPet suggests that each HexTile should be organized around one word category, although its understanding of a “word category” seems to combine syntactic and semantic aspects and does not follow a linguistic understanding of word classes.Footnote 5 The arrangement of multiple tiles is based on the Fitzgerald Key, a system developed in the 1920s to teach deaf students grammatically correct sentence structure (see Paul 2009; Franco et al. 2018).
FluentPet’s adaptation of the Fitzgerald Key suggests starting with a tile for “sentence subjects,” which seems to mean exclusively animate beings, followed by a tile to the right for “action words” and then one for “sentence objects.” To the right of the objects tile, FluentPet suggests placing a tile with buttons for “places.” In a second row below the first, additional tiles for “social words” and “descriptors” can be placed. FluentPet writes on its homepage that they “can’t yet be certain that the FluentPet approach to organizing button boards will be successful, but our expertise in cognitive science and years spent designing teaching tools for dogs leads us to believe that this organization is likely to be significantly easier to remember than sound buttons organized in a plain cartesian grid.”Footnote 6

This suggested word order carries a number of cognitive and linguistic assumptions. Following a structure used for teaching grammar and syntax to deaf children implies both that animal cognition is comparable to human cognition in terms of learning and that animals, or at least dogs, can develop an understanding of grammar in general and syntax in particular—precisely the feature of language that is consistently claimed to be exclusively human. Additionally, basing the layout suggestion on the standard English word order of subject-verb-object, which is neither the only word order nor typologically the most common one, either betrays a complete lack of awareness of the typological diversity of word order, implies that the company expects all its customers to use English, or another SVO language, with their pets, or rests on the rather bizarre assumption of a “universal dog grammar” that somehow favors this type of word order.

Talking Dogs?

This section includes screenshots and transcripts from two videos of human-dog interactions mediated by buttons—one video of an interaction between Christina Hunger and Stella and one of an interaction between Alexis Devine and Bunny. While there are many videos on social media of pets using talking buttons, Bunny is by far the most popular example. A video of Christina Hunger and her dog Stella was selected for comparison, as she is the inventor of these talking buttons and the individual who popularized their use through social media. Videos from the platform YouTube were chosen, as it can be argued that on such a video-sharing platform “maximum visibility can be expected to be either the users’ explicit goal or an accepted fact” (Legewie and Nassauer 2018, 10); using videos shared on this site is thus arguably the most justifiable option from a research ethics perspective. The videos were selected based on two criteria: first and foremost, they should show the use of the soundboard in a multiturn interaction between the dog and the human and not just the dog using the soundboard; second, the video with the most views that fulfilled the first criterion was chosen, based on the assumption that it would be particularly exemplary of the interactions between the respective dog and human. The videos chosen are “Stella Wants to Know ‘When’” from the YouTube account @hungerforwords, which has 52,000 views (as of spring 2023), and “What an Amazing Conversation!!” from the YouTube account @whataboutbunny, which has 2.6 million views. This disparity in the number of views reflects the vast difference in reach between Alexis Devine’s videos of Bunny and Christina Hunger’s of Stella. Transcripts of the videos were created based on GAT2 and Mondada’s (2018) conventions for multimodal transcription, noting verbal utterances, bodily actions, and gaze.
The “primacy of the verbal” (Mondémé 2019, 81) inherent in standard transcription conventions for conversation analysis proves problematic for the transcription of interspecies interactions, in which speech is not necessarily primary to bodily action. In order to represent this in the transcripts, it was decided that bodily actions would not be transcribed as part of the numbered segment of preceding speech but as separate occurrences with their own line numbering. Their temporal interrelatedness with other events during the interaction is represented by the placement of the symbols for the respective bodily action or gaze. During the transcription process, the additional problem arose of whether the soundboard, which plays prerecorded messages when its buttons are pressed, should be represented as an interlocutor in its own right or whether its activation should rather be described as embodied action involving a sound-output device. I decided to represent the soundboard as an interlocutor based on two considerations: first, it is the messages played after activation to which the human interlocutors react, rather than the pressing of the buttons itself; second, in conversation analyses of human interactions with voice user interfaces such as Amazon’s Alexa or the Google Assistant, these entities are represented as interlocutors (e.g., Porcheron et al. 2018; Habscheid 2022), thus providing a precedent for treating “talking” technology as an interactant. This additionally makes sense from an actor-network-theory perspective that takes into account the agency of objects (Latour 2005) and considers their role in networks of interacting actors.

The first video to discuss here was published on April 30, 2021, by Christina Hunger on her account @hungerforwords and is 35 seconds long. It shows Christina Hunger and her dog Stella in an indoor room, presumably the living room, where Stella’s soundboard is located (fig. 1). The soundboard is a wooden plank with four rows of 12 buttons. The video begins with Christina sitting on a rug in the middle of the room and Stella standing in front of her.Footnote 7

Transcript 1.

It becomes evident in the transcript that using the talking buttons on the soundboard is a joint interactive practice that both human and dog engage in, and through which they create a kind of conversation. The scene starts with an interaction between human and dog in which Christina affectionately speaks to Stella while stroking and caressing her, which the dog reciprocates by coming closer and stretching her head toward Christina’s face. Even though Stella then walks toward the soundboard, it is Christina who first uses it for conversational purposes by pressing the “love you” button, using the buttons to repeat her previous utterance. As this is the third utterance of the same content, it can be assumed that this use of the talking buttons is an attempt to model their use for the dog in order to motivate her to use them as well. It is only after Christina’s initiation that Stella engages with the soundboard, pressing the buttons “outside” and “come.” Christina does not immediately respond to this beyond laughing, until Stella walks around the soundboard and pushes a button that plays back the interrogative pronoun when. Laughing, Christina repeats the played-back messages outside, come, and when and prosodically turns them into a meaningful question, albeit one with ungrammatical syntax. Christina answers this question that Stella, the soundboard, and she herself have cocreated by pressing the buttons “outside,” “come,” and “now” while simultaneously uttering the same words. She then repeats the words played back by the soundboard and again uses prosodic structures to turn these single words into a statement, which can be understood as a response to the previous question, modeling the same ungrammatical syntax but replacing the interrogative pronoun with the temporal adverb now. This verbalization is accompanied by the physical action of getting up and walking out of the video frame, presumably toward a door in order to indeed go outside.

Figure 1. Christina Hunger and Stella using the soundboard

Throughout this interaction, Stella uses the buttons only twice, once to create the utterance outside come and once to press the button “when,” whereas Christina uses the soundboard repeatedly, either to make the buttons mimic what she has previously said or to sound along to what she is verbally saying. Her participation in the interaction with Stella is thus constantly multimodal and multimedial in the sense that her contributions are always both verbal and technologically sounded out. As she adapts her speech to the possibilities the soundboard offers, Christina’s language use in the interaction with Stella becomes heavily reduced and lacks syntactic structure. It can instead be described as a combination of single lexemes and phrases that derive their meaningfulness from context and situational framing. A very similar pattern appears in the YouTube video “What an Amazing Conversation” that Alexis Devine uploaded to her account @whataboutbunny on September 9, 2020. The video shows Bunny in a room in which the FluentPet soundboard, with about 50 buttons, is set up (fig. 2). In the middle of the room is a blue rug on which lies a piece of rope. In the background are a door leading to a balcony or terrace, a bookshelf, and some plants. Alexis Devine is filming with a handheld camera, probably a smartphone, and is therefore not visible (footnote 8).

Transcript 2.

Here, the video starts with Bunny using the soundboard and pressing the button “good.” Alexis Devine takes this as an invitation to start a conversation and verbally asks who Bunny is referring to. When Bunny presses the button “Bunny,” Alexis understands this as a response to her question and confirms this by first saying, then sounding, and then saying again yes Bunny good (lines 12–18). As we have seen in the previous transcript of Christina Hunger’s interaction with Stella (transcript 1, lines 27 and 41), Alexis carries over the ungrammatical syntax from the buttons into her verbalizations.

Figure 2. Bunny using the soundboard

Seemingly unrelated to repeatedly being told Bunny good, Bunny then goes on to sound dad and play, which Alexis verbally repeats before verbally adding dad bye, presumably to tell Bunny that her “dad,” that is, Alexis’s partner, is not currently at home. Using the buttons, Alexis then sounds dad play later and repeats those words orally. She then asks Bunny verbally do you want mom play—noticeable here is the do-support in her speech, which is absent in the soundboard-based communication—which is followed by a longer pause (7.2 seconds). When Bunny does not react, Alexis rephrases her question as Bunny want mom play and repeats these words through the soundboard (lines 37–39). After another pause of six seconds, Bunny uses the buttons “come” and “tug.” Alexis verbally repeats the two words and confirms them with okay before saying let’s play tug. While saying this, she holds out the piece of rope that was previously lying on the floor, showing that she has understood Bunny’s come tug as a request to play tug and not, for example, as a question concerning where the rope for playing tug is. Only after Alexis repeats let’s play tug twice more and moves the rope toward Bunny’s face does the dog finally take the rope into her mouth and start playing.

Semiotic Assemblages

Rather than showing actual language acquisition by dogs, these videos show that training dogs to use talking buttons allows both humans and dogs to find a communicative medium, or, in Weil’s words, “to find a place of intersection between [their] worlds” (2012, 11). Christina Hunger echoes this phrasing when she notes that “it felt like the two of us entered our own bubble of communication together” (2021, 165). This “place of intersection” or “bubble of communication” seems to be located somewhere between “mere” communication and “real” language: while the dogs appear to form language-like utterances by activating the buttons to play back lexical items and short phrases, the humans in these interactions model the button use to the dogs but at the same time verbally imitate the button-based form of language use, reducing their verbal language to combinations of lexemes without syntactic structure or grammatical features like inflection or the use of auxiliaries. This might indeed be well suited to dogs’ passive capacity for human language, which studies show to be centered on lexical processing (e.g., Kaminski et al. 2004; Andics et al. 2016). The imitative quality of (potential) language learning is here displayed not just by the animals, who learn to use the buttons because humans model their use to them, but also by the humans themselves when they adapt their linguistic potential to the limited options the talking buttons offer. This is particularly interesting because human language is so often described as distinct from animal communication specifically because of its abstract features, that is, grammar and syntax (e.g., Zuberbühler 2019), which is exactly what seems to be given up first in the button-based communication.
That Alexis Devine’s dad bye (transcript 2, line 24) is supposed to mean “dad is currently not at home and can therefore not play with you” is not understandable through the linguistic features of this utterance alone and can only be deduced from the interaction’s situational frame. Here the reductionist understandings of human language that are prevalent in attempts to distinguish human language from animal communication also become apparent: human-human language-based interaction is just as dependent on its nonverbal and multimodal aspects as is the interaction of humans and animals. The inferences and interpretations necessary to establish the meaningfulness of button-based communication further resemble parents’ interactions with infants and infant-directed speech, in which paraverbal aspects as well as the simple co-occurrence of words are central resources of meaning-making (e.g., Bryant and Barrett 2007; You et al. 2021). The overall similarities between pet-directed and infant-directed speech have been pointed out numerous times (e.g., Burnham et al. 1998; Ben-Aderet et al. 2017).

The “place of intersection” that Weil (2012, 11) speaks of is not only located between language and communication but also placed somewhere between the human and the dog: their communication around the talking buttons is simultaneously an “animalization” of the human code and a “humanization” of the animal code. It involves anthropomorphizing assumptions about dogs’ cognition in that their use of the buttons is seen as them using “language” to express their thoughts and desires, while at the same time the humans in these interactions reduce their linguistic capacity to a string of lexical signs, which might be easier for the animal interlocutor to decipher than a linguistically well-formed sentence and which obtains its meaningfulness through its nonlinguistic situatedness, that is, the assemblage of objects, bodies, places, and the relations between them. Assemblage here is understood to comprise “two segments, one of content, the other of expression” (Deleuze and Guattari [1987] 2005, 88). Even though it might be overly simplistic to assume, following the Saussurean idea that any sign consists of a signifier and a signified, that content and expression are clearly distinguishable and without interaction between them, the concept of assemblages can nevertheless be fruitful in that they comprise both a “machinic assemblage of bodies, of actions and passions, an intermingling of bodies reacting to one another” and a “collective assemblage of enunciation, of acts and statements, of incorporeal transformations attributed to bodies” (Deleuze and Guattari [1987] 2005, 88).
It is in this coming together of the “machinic” assemblage, made up of the human and canine bodies, the soundboard and its buttons, and the spatiality of the room with its toys and doors, and the collective assemblage of enunciation, consisting of the soundboard’s played-back words and phrases and the verbalizations and vocalizations of human and dog, that meaningful interaction arises. The assemblage of button-based human-dog interactions thus “necessarily acts on semiotic flows, material flows, and social flows simultaneously” (Deleuze and Guattari [1987] 2005, 22–23).

I would therefore argue that it is rather reductive to see these videos as evidence for canine language learning, given that it is questionable to what extent the human communication in them can be characterized as “real” language use, and given that so much more evidently contributes to the cooperative meaning-making than just “language.” Instead, I would like to emphasize what actually takes place in these interactions: humans and dogs engaging in a highly multimodal, multisensorial, situated cocreation of meaning. This communication is constructed through touch, gaze, movement, and gestures, as well as through verbal utterances and the use of the talking buttons. It is this assemblage of lexical meaning, situated embodied practices, and objects that constitutes meaningful interaction based on the familiarity between the interactants in these videos. Stella’s outside come (transcript 1, lines 17–20) could, on a mere lexical level, be interpreted just as much as her hearing someone approach the residence from the outside as expressing a desire to go outside. The situatedness of these two words and the related world knowledge—it is a dog who “says” these words, and we know that dogs go for walks and relieve themselves outside—favors the interpretation of those words as expressing a desire rather than commenting on an ongoing event. Stella then pressing the button “when”—even though it is impossible to know whether she understands the meaning of this interrogative pronoun or whether to her this button is rather a way to express urgency—supports this interpretation. Whether Stella actually understands Christina’s response outside come now (transcript 1, lines 30–41) as stating that they can in fact go outside now or whether she merely reacts to Christina subsequently getting up and walking out of the video frame cannot be deduced from the video alone.

Similarly, it is impossible to know whether Bunny pressing the buttons “dad” and “play” (transcript 2, lines 19–22) actually expresses a desire to play together with her male owner, as Alexis Devine interprets it based on situatedness and previous experience. It is equally unclear whether Bunny actually means that she wants to play tug when she presses the buttons “come” and “tug” (transcript 2, lines 41–42), given that it takes several repetitions of Alexis saying let’s play tug and moving the piece of rope in front of Bunny’s face (lines 44–51) before the dog starts playing. The social meaning of these situations is thus assembled through their multimodality and multisensoriality, in which bodies, language, and objects, as well as sound and touch, come together to create meaning in interaction. These interactions might therefore be better described as semiotic assemblages (Pennycook 2017) than as language-based conversations—which might be just as true for human-human interactions, equally informed by their multimodality and situatedness. Building on Deleuze and Guattari’s ([1987] 2005) concept of assemblage, Pennycook’s emphasis on semiotic assemblages aims to expand beyond multimodality “to bring in the multisensorial nature of our worlds, the vibrancy of objects and the ways these come together in particular and momentary constellations” (2017, 11). In this perspective, it is not merely the static, spatial situatedness of interaction that makes up a semiotic assemblage but rather “the dynamic relations among objects, places and linguistic resources, an emergent property deriving from the interactions between people, artefacts and space” (11).
As such, it is not just the humans, the dogs, the soundboard, and “the language” that create the semiotic assemblage of interaction through talking buttons but also the bodies’ movements around the soundboard, their touching the buttons, the sound and utterances from humans and the soundboard, the touch between human and dog, and the piece of rope that semiotically come together to assemble meaningful communication.

Conclusion

Stella and Bunny are neither social media versions of Clever Hans nor proof of dogs’ ability to actively use human language facilitated by speech technology. Were we to take FluentPet and Hunger for Words seriously in their claims that talking buttons are a way to teach dogs human language and a tool for pets “to tell us what they’re thinking” (footnote 9), it would indeed be relevant to follow Weil’s (2012) application of Spivak’s question “Can the subaltern speak?” to the case of animals learning human language and to ask whether such a practice would always lead to animals saying what humans want to hear. The answer could only ever be yes, as the technology of talking buttons quite literally involves humans providing the specific words they deem relevant for the dog to say.

Instead, and more interestingly, the videos have revealed a “place of intersection” between language and communication as well as between human and animal. In the button-based interaction, the human interactants give up precisely the feature consistently deemed to be unique to human language—syntactic structure—and instead use strings of lexical items without consideration of word order, inflection, and other syntactic features. This reduced “quasi-language” is semiotically enriched and assembled through its multimodal and multisensorial situatedness and embodied forms of interaction. Even though Stella and Bunny do use the talking buttons in their interactions, these soundboard utterances become meaningful only through the social relationships between humans and dogs, through their situational framing, and through movement and touch. I therefore posit that these practices of communication through talking buttons might best be understood as semiotic assemblages, deriving meaning from “the dynamic relations among objects, places and linguistic resources” and from “the interactions between people, artefacts and space” (Pennycook 2017, 11).

To conclude, we might ask whether theorizing on animals’ capacity for language has rather missed the point. In trying to establish thresholds that communication must reach in order to be considered “language,” such approaches have only ever constituted a debate among humans about human exceptionalism, for which the uniqueness of human language is paramount. Rather than appreciating similarities in the nonverbal, embodied communication of human and nonhuman animals, or asking what humans might be able to learn from other animals when it comes to the corporealities of languaging, linguistics has always taken as its starting point the superiority of the human and its mode of communication over other forms of being and communicating. Contemporary dog training, however, teaches us that it is the human just as much as the dog who needs to be trained in successful communication. Talking buttons thus function as a curious “place of intersection” (Weil 2012, 11) that simultaneously appeals to human exceptionalism through anthropomorphization of the animal other (i.e., giving dogs “real language”) while ostensibly necessitating an “animalization” on the part of humans by “delanguaging” their language in terms of syntactic reduction and by reminding them that human languaging always is embodied practice too.

Footnotes

1. In the following, the verbs sound and sound out will be used to mean “pressing the buttons to play back a word.”

6. Ibid.

7. In transcript 1, C = Christina Hunger, S = Stella, and SB = soundboard; * * indicates embodied action Christina, ± ± indicates gaze Christina, + + indicates embodied action Stella, and & & indicates gaze Stella.

8. In transcript 2, A = Alexis Devine, B = Bunny, SB = soundboard; § § indicates embodied action Alexis, # # indicates embodied action Bunny, and ∞ ∞ indicates gaze Bunny.

References

Andics, Attila, Anna Gábor, Márta Gácsi, Tamás Faragó, Dóra Szabó, and Ádám Miklósi. 2016. “Neural Mechanisms for Lexical Processing in Dogs.” Science 353 (6303): 1030–32. https://doi.org/10.1126/science.aaf3777.
Bateson, Gregory. 1972. Steps to an Ecology of Mind: Collected Essays in Anthropology, Psychiatry, Evolution, and Epistemology. Chicago: University of Chicago Press.
Baxter, Susan, Pam Enderby, Philippa Evans, and Simon Judge. 2012. “Barriers and Facilitators to the Use of High-Technology Augmentative and Alternative Communication Devices: A Systematic Review and Qualitative Synthesis.” International Journal of Language and Communication Disorders 47 (2): 115–29. https://doi.org/10.1111/j.1460-6984.2011.00090.x.
Ben-Aderet, Tobey, Mario Gallego-Abenza, David Reby, and Nicolas Mathevon. 2017. “Dog-Directed Speech: Why Do We Use It and Do Dogs Pay Attention to It?” Proceedings of the Royal Society B: Biological Sciences 284.
Bryant, Gregory, and H. Clark Barrett. 2007. “Recognizing Intentions in Infant-Directed Speech: Evidence for Universals.” Psychological Science 18 (8): 746–51. https://doi.org/10.1111/j.1467-9280.2007.01970.x.
Burnham, Denis, Elizabeth Francis, Ute Vollmer-Conna, Christine Kitamura, Vicky Averkiou, Amanda Olley, Mary Nguyen, and Cal Paterson. 1998. “Are You My Little Pussy-Cat? Acoustic, Phonetic and Affective Qualities of Infant- and Pet-Directed Speech.” In Proceedings of the Fifth International Conference on Spoken Language Processing (ICSLP 1998). https://doi.org/10.21437/ICSLP.1998-374.
Cate, Carel ten, and Kazuo Okanoya. 2012. “Revisiting the Syntactic Abilities of Non-human Animals: Natural Vocalizations and Artificial Grammar Learning.” Philosophical Transactions of the Royal Society B: Biological Sciences 367 (1598): 1984–94. https://doi.org/10.1098/rstb.2012.0055.
Chambers, Jaime, Marsha B. Quinlan, Alexis Evans, and Robert J. Quinlan. 2020. “Dog-Human Coevolution: Cross-Cultural Analysis of Multiple Hypotheses.” Journal of Ethnobiology 40 (4): 414–33. https://doi.org/10.2993/0278-0771-40.4.414.
Corballis, Michael C. 2007. “Recursion, Language, and Starlings.” Cognitive Science 31:697–704. https://doi.org/10.1080/15326900701399947.
Deleuze, Gilles, and Félix Guattari. (1987) 2005. A Thousand Plateaus: Capitalism and Schizophrenia. Reprint. Minneapolis: University of Minnesota Press.
Fitch, W. Tecumseh, Marc D. Hauser, and Noam Chomsky. 2005. “The Evolution of the Language Faculty: Clarifications and Implications.” Cognition 97 (2): 179–210. https://doi.org/10.1016/j.cognition.2005.02.005.
Franco, Natália, Edson Silva, Rinaldo Lima, and Robson Fidalgo. 2018. “Towards a Reference Architecture for Augmentative and Alternative Communication Systems.” Brazilian Symposium on Computers in Education 29:1073–82.
Greenebaum, Jessica. 2010. “Training Dogs and Training Humans: Symbolic Interaction and Dog Training.” Anthrozoös 23 (2): 129–41. https://doi.org/10.2752/175303710X12682332909936.
Habscheid, Stephan. 2022. “Socio-Technical Dialogue and Linguistic Interaction: Intelligent Personal Assistants (IPA) in the Private Home.” Sprache und Literatur 51 (126): 167–96. https://doi.org/10.30965/25890859-05002020.
Haraway, Donna. 2008. When Species Meet. Minneapolis: University of Minnesota Press.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch. 2002. “The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?” Science 298 (5598): 1569–79. https://doi.org/10.1126/science.298.5598.1569.
Hoy, Matthew B. 2018. “Alexa, Siri, Cortana, and More: An Introduction to Voice Assistants.” Medical Reference Services Quarterly 37 (1): 81–88. https://doi.org/10.1080/02763869.2018.1404391.
Hunger, Christina. 2021. How Stella Learned to Talk. New York: William Morrow.
Irvine, Leslie, and Laurent Cilia. 2017. “More-Than-Human Families: Pets, People, and Practices in Multispecies Households.” Sociology Compass 11 (2): e12455.
Jackendoff, Ray, and Steven Pinker. 2005. “The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky).” Cognition 97 (2): 211–25. https://doi.org/10.1016/j.cognition.2005.04.006.
Janssens, Luc, Liane Giemsch, Ralf Schmitz, Martin Street, Stefan Van Dongen, and Philippe Crombé. 2018. “A New Look at an Old Dog: Bonn-Oberkassel Reconsidered.” Journal of Archaeological Science 92:126–38. https://doi.org/10.1016/j.jas.2018.01.004.
Jewitt, Carey, ed. 2009. The Routledge Handbook of Multimodal Analysis. New York: Routledge.
Kaminski, Juliane, Josep Call, and Julia Fischer. 2004. “Word Learning in a Domestic Dog: Evidence for ‘Fast Mapping.’” Science 304 (5677): 1682–83. https://doi.org/10.1126/science.1097859.
Kulick, Don. 2017. “Human-Animal Communication.” Annual Review of Anthropology 46:357–78. https://doi.org/10.1146/annurev-anthro-102116-041723.
Kulick, Don. 2021. “When Animals Talk Back.” Anthropology Now 13 (2): 1–15. https://doi.org/10.1080/19428200.2021.1971481.
Latour, Bruno. 2005. Reassembling the Social: An Introduction to Actor-Network-Theory. Oxford: Oxford University Press.
Legewie, Nicolas, and Anne Nassauer. 2018. “YouTube, Google, Facebook: 21st Century Online Video Research and Research Ethics.” Forum: Qualitative Social Research 19 (3): 1–21.
Mondada, Lorenza. 2018. “Multiple Temporalities of Language and Body in Interaction: Challenges of Transcribing Multimodality.” Research on Language and Social Interaction 51 (1): 85–106. https://doi.org/10.1080/08351813.2018.1413878.
Mondémé, Chloé. 2019. La socialité interspécifique: Une analyse multimodale des interactions homme-chien. Limoges: Lambert-Lucas.
Owens, Nicole, and Liz Grauerholz. 2018. “Interspecies Parenting: How Pet Parents Construct Their Roles.” Humanity & Society 43 (2): 96–119. https://doi.org/10.1177/0160597617748166.
Paul, Peter V. 2009. Language and Deafness. Sudbury, MA: Jones & Bartlett.
Pennycook, Alastair. 2017. “Translanguaging and Semiotic Assemblages.” International Journal of Multilingualism 14 (3): 269–82. https://doi.org/10.1080/14790718.2017.1315810.
Pepperberg, Irene M. 1981. “Functional Vocalizations by an African Grey Parrot (Psittacus erithacus).” Zeitschrift für Tierpsychologie 55:139–60. https://doi.org/10.1111/j.1439-0310.1981.tb01265.x.
Pepperberg, Irene M. 1987. “Acquisition of the Same/Different Concept by an African Grey Parrot (Psittacus erithacus): Learning with Respect to Categories of Color, Shape, and Material.” Animal Learning & Behavior 15 (4): 423–32. https://doi.org/10.3758/BF03205051.
Pepperberg, Irene M. 1994. “Numerical Competence in an African Gray Parrot (Psittacus erithacus).” Journal of Comparative Psychology 108 (1): 36–44. https://doi.org/10.1037/0735-7036.108.1.36.
Pinker, Steven. 1995. The Language Instinct: How the Mind Creates Language. New York: Harper Perennial.
Porcheron, Martin, Joel E. Fischer, Stuart Reeves, and Sarah Sharples. 2018. “Voice Interfaces in Everyday Life.” In CHI ’18: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3173574.3174214.
Pręgowski, Michał P. 2015. “Your Dog Is Your Teacher: Contemporary Dog Training beyond Radical Behaviorism.” Society & Animals 23:525–43. https://doi.org/10.1163/15685306-12341383.
Rafferty, Sean M. 2023. Misanthropology: Science, Pseudoscience, and the Study of Humanity. New York: Routledge.
Rooney, Nicola J., and Sarah Cowan. 2011. “Training Methods and Owner-Dog Interactions: Links with Dog Behaviour and Learning Ability.” Applied Animal Behaviour Science 132 (3–4): 169–77. https://doi.org/10.1016/j.applanim.2011.03.007.
Savage-Rumbaugh, E. Sue, Duane M. Rumbaugh, and Sally Boysen. 1978. “Symbolic Communication between Two Chimpanzees (Pan troglodytes).” Science 201 (4356): 641–44. https://doi.org/10.1126/science.675251.
Spivak, Gayatri Chakravorty. 1988. “Can the Subaltern Speak?” In Marxism and the Interpretation of Culture, edited by Cary Nelson and Lawrence Grossberg, 271–313. Urbana: University of Illinois Press.
Terrace, Herbert S., Laura A. Petitto, R. J. Sanders, and Thomas G. Bever. 1979. “Can an Ape Create a Sentence?” Science 206 (4421): 891–902. https://doi.org/10.1126/science.504995.
Waal, Frans de. 2016. Are We Smart Enough to Know How Smart Animals Are? New York: Norton.
Weil, Kari. 2012. Thinking Animals: Why Animal Studies Now? New York: Columbia University Press.
Wilson, Lindsay. 2021. “From Clever Hans to Bunny the TikTok Dog: An Exploration into Animal-to-Human Communication.” Macksey Journal 2: art. 148.
Wittgenstein, Ludwig. 1986. Philosophical Investigations. Translated by G. E. M. Anscombe. 2nd ed. Reprint. Oxford: Blackwell.
Wolfe, Cary, ed. 2003. Zoontologies: The Question of the Animal. Minneapolis: University of Minnesota Press.
You, Guanghao, Balthasar Bickel, Moritz M. Daum, and Sabine Stoll. 2021. “Child-Directed Speech Is Optimized for Syntax-Free Semantic Inference.” Scientific Reports 11:16527. https://doi.org/10.1038/s41598-021-95392-x.
Zuberbühler, Klaus. 2019. “Evolutionary Roads to Syntax.” Animal Behaviour 151:259–65. https://doi.org/10.1016/j.anbehav.2019.03.006.