
Figure Composition

Published online by Cambridge University Press:  01 January 2025

Charles H. P. Zuckerman*
Affiliation:
University of Sydney, Australia
*Contact Charles H. P. Zuckerman at Department of Linguistics, A20–John Woolley Building, University of Sydney, Australia (charles.zuckerman@sydney.edu.au).

Abstract

Linguistic anthropologists have shown that the way a person reports speech or represents discourse—for example, whether they ‘put on an accent’ or merely repeat attributed words—is crucial for understanding what social action that person is undertaking. And yet, our tools for talking about the form of represented discourse are still crude. This paper offers a new tool in the notion of figure composition, defined as the formal semiotic elements that comprise a given voice. Reflecting on figure composition, alongside Agha’s (2005) notion of figure transparency, invites us to shift from asking what kind of represented discourse any given stretch of represented discourse is to asking (1) What elements of represented discourse appear to be coming from the quoted figure(s)? and (2) How are these elements used to produce interactional effects?

Type
Articles
Copyright
Copyright © 2021 Semiosis Research Center at Hankuk University of Foreign Studies. All rights reserved.

In 2010, I was researching the history of ethnic relations in Laos before the 1975 socialist revolution. During an interview, I asked Noy,[1] a Lao-American woman in her mid-60s, about inter-ethnic tensions in Laos. She began to tell me a story, popular when she was a child in the capital city of Vientiane. Its central conceit was that ethnic Hmong people don’t bargain like Lao people. Noy told the story through represented discourse and, in the process, gave voice to an imagined Hmong man. As the story unfolded, Noy’s figure of the Hmong man gradually emerged in higher resolution and became more vivid. When she first quoted him, the voice was barely differentiated from its narrative surround. Later, Noy used altered speech, her arms, and her head to enact a caricatured figure that stood out starkly against her own voice.

Over the past several decades, represented discourse like this has been a major topic in linguistic anthropology (see Lempert 2014). The superficially niche topic has borne large fruit, informing how we understand the relation between speakers and those whose words they ‘take on’ (Tannen 2009), and allowing us to see how and why people perform figures of identity and alterity when they do (Hastings and Manning 2004).[2] This work has shown that the way in which a person represents discourse (for example, whether they ‘put on an accent’ or merely repeat attributed words) is crucial for understanding what sort of social action that person is undertaking (Bakhtin 1981, 259; Hill 1995, and many others). We know, for instance, to look toward Donald Trump’s shaking hands and gape-mouthed vowels to see that his enactment of the reporter Serge F. Kovaleski was not informative but mocking (see Hall, Goldstein, and Ingram 2016, 86–88; Figure 1).[3] This work has also shown that represented discourse does not merely make use of ideologies of linguistic form, but circulates and (re)produces them, and thereby helps establish ties between kinds of language and kinds of persons, events, and interactional effects (Agha 2005, 48).

Figure 1. Donald J. Trump mocks Serge F. Kovaleski

And yet, while much work has documented that the form of represented discourse is significant, our tools for talking about that form are still crude. What Goffman (1974, 530) wrote in Frame Analysis remains true today: our competence in recognizing the importance of form in represented discourse “is far ahead of our capacity to explicate the practices involved.” Analytically, we are mostly left to sieve such form through the binary of ‘direct’ and ‘indirect speech,’ a typology that is so basic and so susceptible to complication that, in the end, it leaves the majority of what is interesting in represented discourse out of its scope. The result is a mismatch between our sense of the social utility of discursive form and our lack of analytic resources for exploring how people put various formal elements to work.

In this paper, I offer a tool for talking about the form of represented discourse in the notion of figure composition. Simply put, a figure’s composition is defined as the formal semiotic elements that comprise that figure. These elements are united by a shared, transposed origo.[4] When one person quotes another, she employs these formal elements to compose a figure.[5]

The notion of figure composition brings diverse formal elements of represented discourse, many of which have been analyzed separately in the literature, under one conceptual umbrella. As such, it offers a clearer language for describing how speakers compose figures and clarifies the relationship between this question and the problem of what makes represented discourse recognizable as represented discourse, what Agha (2005, 43) has called the ‘transparency’ of figures. Together, figure composition and figure transparency provide a comparative framework for describing how speakers use represented discourse to social effect, one which clarifies many of the core issues which, since the work of Bakhtin, Voloshinov, and Goffman, among others, have made represented discourse such an alluring subject for linguistic anthropologists. Broadly put, the notion of figure composition redirects discussion about the form of represented discourse. It invites one to shift from asking what kind of represented discourse any given stretch of represented discourse is to asking (1) What elements of that represented discourse appear to be coming from the quoted figure(s)? and (2) How are these elements used to produce interactional effects?

In what follows, I begin by introducing the notions of figure composition and figure transparency, sketching their advantages over and against other accounts of represented discourse. I next offer a range of formal elements that frequently occur in figure compositions, a heuristic catalogue of what previous research has found. I then return to Noy’s story to demonstrate how the notion of figure composition can enrich analysis, as I also shift from discussing what compositions are comprised of to what compositions can achieve by way of social action. Captioning these achievements ‘composition effects,’ I show that the gradient resolution of Noy’s Hmong figure diagrammed a supposed Hmong irrationality, as it also further tied certain phonetic forms to Hmong people generally. Finally, I conclude that tracing similar ‘composition effects’—and related ‘transparency effects’—is the key task for those trying to understand the roles of represented discourse in social action.

Figure composition and figure transparency

For represented discourse to be construed as represented discourse, it must, by definition, present some element of itself as coming from a distinct time, place, person, “footing” (Goffman 1979; Holt 2007, 57), “vantage point” (Clark and Gerrig 1990, 786), et cetera. Deictics, which appear to mutate in the move from ‘indirect’ to ‘direct’ reported speech, make this transposition especially obvious,[6] but such transposition characterizes represented discourse more generally (Hanks 1990, 205; 212; 215; 222). This suggests a simple definition for figure composition: a figure’s composition is defined as those elements of a voicing event that share a transposed origo or indexical ground (Hanks 1990, 222; Haviland 1993, 40; Agha 2005, 42; Agha 2007, 52; see also Nakassis 2020).

The notion of figure composition complements what Agha (2005, 43) has called the transparency of voicing contrasts. A figure’s transparency and its composition are related but distinct ideas (compare Hanks 1990, 222). Figure transparency characterizes the extent to which a represented figure is recognizable as such, and thus construable as distinct from the figure representing it. It is the obviousness or clarity of a figure’s transposed indexical ground, construed with reference to inference, metapragmatic signs, and the formal properties of the figure’s composition itself. The most recognizable example of a sign that makes a figure more transparent is the quotation mark (Klewitz and Couper-Kuhlen 1999). A figure’s composition, in contrast, consists in the transposed elements that comprise it: the words written within such quotation marks, for example. A figure’s transparency is a function of how thickly drawn the boundaries showing where represented discourse begins and ends are; its composition is what lies ‘within’ those boundaries.

As many have noted in a different technical language, a figure’s transparency and a figure’s composition are often formally entangled with one another and functionally related. On the one hand, for a given figure composition to be recognized as represented discourse, it must be transparently transposed.[7] Sometimes this transposition is a function of signs distinct from the represented discourse. Verba dicendi are prototypical examples. In prefacing a quotation with a matrix clause that includes such a verbum dicendi, a speaker marks off what is to follow (see i in Table 1). Sometimes speakers achieve a similar transparency with prosodic shifts, such as intonation contours or pauses, that signal or ‘flag’ that reported speech may be coming or ending (Kvavik 1986, 356; Klewitz and Couper-Kuhlen 1999, 476; 479). Some of these transparency-increasing signs even co-occur with the elements of the figure but in a different modality (see ii). Sidnell (2006, 390), for example, found that an interactant might signal a reenactment by simultaneously gazing away from coparticipants. Here, gaze works alongside the bodily machinery of reenacting to draw the boundaries between what is representing and what is represented.

Table 1. Figure Transparency and Figure Composition

On the other hand, a figure’s composition itself is often construable as a reflexive sign of that figure’s transparency (iii).[8] From noticeable shifts in prosody (Klewitz and Couper-Kuhlen 1999, 482; Couper-Kuhlen 1999; Bolden 2004), to changes in facial positioning, posture, or gaze, to choices of vocabulary, utterance-initial response cries, or linguistic code (Clark and Gerrig 1990, 774), actors and analysts can look toward “the likeness or unlikeness of co-occurring chunks of text” as a sign of the “sameness or difference of speaker” (Agha 2005, 40) and thus reconstruct transposed origos left unstated (Hanks 1990, 216). This capacity in part accounts for the possibility of “unintroduced dialog” (Tannen 1986, 318–319; cf. Mathis and Yule 1994), that is, quotation without anything of type (i). Think of the puppeteer who moves his wooden puppet’s mouth as his own lips remain shut: those same moving wooden lips are both a part of the represented discourse’s composition and a clue that makes it more transparent. Alongside these other mechanisms, construals of transparency can also depend upon inferences regarding patterns of turn taking, adjacency pairs, et cetera (iv; Mathis and Yule 1994, 66; Klewitz and Couper-Kuhlen 1999, 483) and inferences regarding the nature of the narrated event, the narrating event, and the figures within them, i.e., type (v).

Beyond Typologies that Mix Figure Composition and Figure Transparency

In linguistic anthropology, several scholars studying represented discourse have created typologies of it that combine dimensions of transparency and composition (see, for example, Urban 1989, 43; Hickmann 1993; and Agha 2005, 43). These typologies tend to be somewhat ambivalently overlaid on the distinction between direct and indirect discourse.[9] Scholars are ambivalent about this overlaying because most regard the direct and indirect discourse distinction as, in and of itself, essentially inadequate (Günthner 1997; Holt and Clift 2006, 11; Good 2015, 573). Even classic statements in the study of reported speech, such as Jespersen’s (2007, 290) discussion of two kinds of ‘indirect speech’ or Voloshinov’s (1973, 125–140) comments on ‘modifications,’ have pointed out the distinction’s limits, suggesting that the “boundary between direct and indirect discourse is fuzzy” (Tannen 2007, 103). Most contemporary linguistic anthropologists would probably agree that between the two poles of these two kinds of report lies “a range of blended alternatives” (Lucy 1993, 95). Many have aimed to illuminate this range with finer grained distinctions in types, such as ‘free indirect discourse,’ and lists of the features that distinguish one type from another.

Notions of figure composition and transparency allow us to set aside the categories of direct and indirect speech entirely. This is worth doing for the obvious reason that has been noted repeatedly: the distinction does not capture how represented discourse works in the world. Instead, it forces us to reckon with evidence that betrays its inadequacy: innumerable hybrids and intermediaries (pace Partee 1973, 411). When one looks closely, it becomes clear that all utterances that include ‘represented discourse’ can appear ‘direct’ in one respect, but ‘indirect’ in other respects. The notion of ‘direct’ with which we are then left becomes merely a shorthand for referring to transposition of an origo regarding one element or another.[10]

Unlike the notion of composition, the notions of direct and indirect discourse have been used to describe utterances, not elements. Taking the utterance as a unit is fundamentally misleading, and a growing body of work makes the heterogeneity of represented discourse plain. A given stretch of represented discourse can have multiple elements that shift somewhat independently (Evans 2012). For example, a speaker might be ‘directly reporting’ some dimension of another’s speech, while laughing through the quotation and thus inserting a commentary ‘in her own voice’ (e.g., Goodwin 2007, 20). An element’s role in a composition can furthermore shift across and within events of represented discourse. For example, Klewitz and Couper-Kuhlen (1999, 473–474) found that sometimes the “prosodic formatting of a voice may ‘evolve’ during the stretch of speech being reported,” say, beginning with a sudden shift to a high prosodic register, but dropping as it continues.

The notions of figure composition and transparency invite us to explore this heterogeneity, to trace how the elements of represented discourse combine and transform within and across interactions. From the perspective of these notions, resolving, categorizing, or purifying the innumerable hybrids of indirect and direct speech that linguistic anthropologists have uncovered appears pointless, only a consequence of having the terms in the first place (cf. Latour 1993, 10–11). From their vantage, direct and indirect report reveal themselves as not categorical devices for represented discourse in isolation, but as fundamentally relational ideas, compelling because they capture the sense that some figure compositions involve not just transposition, but more transposition.

Common Elements of Compositions

Speakers can divide a quotation any way they are able to so long as they can get their addressees to recognize what they are doing.

– Herbert Clark and Richard Gerrig (1990, 779)

With this in mind, in this section I trace some of the multi-modal elements that have been shown to play roles in figure compositions cross-linguistically (Clark and Gerrig 1990, 775; Stec, Huiskes, and Redeker 2016, 5).[11] As I do so, I emphasize these elements’ relative independence from one another, because it is precisely that independence that has been obscured in descriptions of ‘direct’ versus ‘indirect’ discourse.

I offer these elements as heuristics meant to highlight a range of possibilities. That is, this list is not a limit on what may play a role in figure compositions but a suggestion as to what is likely to play such a role. When a researcher encounters a stretch of represented discourse, she might thus use this list as a guide for asking herself: how is the represented discourse in front of me composed? What are its significant elements?

A few further qualifications are in order. First, the elements that follow are not always capable of being cleanly distinguished one from another. Second, the medium of communication matters. People speaking in the Lao language while they stand face-to-face can represent discourse differently than people texting on phones or signing ASL through a glass window (see Jones and Schieffelin 2009; Klewitz and Couper-Kuhlen 1999, 471). Different media afford different figure compositions. Third, and relatedly, distinct languages have distinct grammatical forms, lexical items, and conventions of transposition that affect what figure compositions are possible (Coulmas 1986, 14; Rumsey 1990, 347; Evans 2012).[12] For example, some verbs of speaking or acting, which are primarily resources for increasing transparency, nevertheless signal or constrain the compositions they control: some require deictic transpositions (Coulmas 1986, 19); some ensure de dicto readings; and some prefigure certain kinds of spoken, gestural, or corporeal performances (e.g., Streeck 2002, 594–595). Fourth and finally, recognizing a certain element as part of a figure’s composition requires that element to meet a threshold of transparency, and this threshold is not always met. Researchers will thus inevitably find cases in which it is unclear to which figure a formal element belongs, or cases in which an element plausibly belongs to more than one figure (see Woolard 1998). These cases are not a problem for the notions of composition or figure transparency; rather, they are a particularly interesting part of social life that these notions help us identify and better characterize.

Deictics

Unlike many of the elements of represented discourse,[13] when deictic lexemes are transposed this tends to be obvious (I refer to deictic lexemes here, but some deictics, of course, are affixal, gestural, ocular, et cetera). This obviousness clearly relates to their role in reference (see Silverstein 1981). That is, since deictic expressions such as this or I are used to “identify referential objects relative to indexical grounds” (Hanks 1990, 197), represented utterances that use them refer correctly only if those grounds are understood as transposed.[14] If people misconstrue these transpositions, the utterances can become unintelligible; such errors in transposition thus tend to make themselves known in failures to refer.

Perhaps it is because of the salience of deictic transpositions that scholars have treated deictics as something of a litmus test for whether a stretch of speech is ‘direct’ or ‘indirect.’ And yet, while many (e.g., Silverstein 2014, 141) describe deictics as indexical anchors of the whole utterances in which they occur, utterances are not always cohesive ships to be anchored, and deictic transpositions can happen (or not happen) independently of other transpositions. Put simply, the deictics in a stretch of represented discourse (i) need not always be transposed at the same time (or in the same way), nor (ii) need they necessarily be transposed in order for other elements in the same stretch of represented discourse to be transposed. On the first point, sometimes “shifts in deictic origo do not map neatly onto clause boundaries” (Agha 2005, 43 on ‘free indirect speech’) and some languages have conventions of split deixis. In Russian, for example, “in indirect speech, the pronoun deixis [is] adjusted to the report situation while temporal deixis by means of tense forms keeps its pivot in the reported situation” (Coulmas 1986, 19).[15] On the second point, it is easy to find examples where speakers, for instance, take on or simulate the prosody of another without transposed deictics. As Klewitz and Couper-Kuhlen (1999, 470) put it (citing Günthner 1997, among other works), “expressive prosodic marking cuts across the canonical, grammatically based distinction between ‘direct’ and ‘indirect’ speech… .” Contrary to how many scholars discuss deictics in relation to quotation, there does not seem to be an easy way to articulate a universal, implicational hierarchy in which deictics hold a privileged place in the transposition of represented discourse.

Suprasegmentals

As the last example from Klewitz and Couper-Kuhlen indicates, people can also transpose the suprasegmentals of their speech, shifting the indexical ground of a whole host of different formal features including loudness, duration, pitch, speech rate, timing, pausing, voice quality, stress, and phonetic realization of lexical tone.[16] That suprasegmentals can appear to be coming from the speech of another in cases of quotation has frequently been remarked upon and, more recently, studied in depth (e.g., Tannen 1986; Clark and Gerrig 1990, 776; Mitchell-Kernan and Cohen 2017, 390; Günthner 1997; Couper-Kuhlen 1999; Günthner 1999; Klewitz and Couper-Kuhlen 1999). Sapir (1949, 193), for instance, recounted that, “The Nootka Indians of one tribe frequently imitate the real or supposed speech peculiarities of those belonging to other Nootka tribes, the stress being primarily laid not so much on peculiarities of vocabulary and grammatical form as on general traits of intonation or sound articulation (cf. our New England ‘nasal twang’ and Southern ‘drawl’).” The role of such suprasegmentals is perhaps most patent in long narratives and performances, such as puppet theatre, where a single voice actor speaks as if he were a host of different characters with distinct pitch ranges, speech rates, and intonation contours (Gross 1983, 300). This is also evident in Don Gabriel’s heteroglossic story of his son’s death, famously captured by Jane Hill (1995), in which Don Gabriel uses prosody, among other elements, to juxtapose different moral and biographical figures (Keane 2011).[17] In both cases, these different voices flesh out the figures they depict as of certain kinds, as they also increase transparency as to who is speaking.[18]

As these studies have found, suprasegmental marking is, in contradiction to what is often presumed, “rather widespread” on apparent ‘indirect reports’ (Klewitz and Couper-Kuhlen 1999, 470–471; pace Jansen, Gregory, and Brenier 2001). For instance, Klewitz and Couper-Kuhlen (1999, 478) describe how a woman named Alina animates an “older guy” and a “young chick,” even as she is describing the “young chick” within otherwise non-transposed discourse:

The man’s voice … is accompanied by prosodic shifts to forte and allegro. The young girl’s speech (f1) in lines 16–17 coincides with a marked prosodic shift to high register, accompanied by a paralinguistic shift to nasal, breathy voice.

When we begin to inspect examples such as these, we take the distinction between so-called “direct” and “indirect” discourse to its limits. It leaves us with only tautologies and contradictions. If we try to maintain it, we must treat suprasegmentals as both constitutive features of the two kinds of report and variables that can occur within either kind.

Segmentals

As speakers can modify their suprasegmentals to achieve a transposed indexical ground, so too can they alter their segmentals. People do this frequently when they represent figures who purportedly speak a different dialect or language. Take, for example, how one teenager portrayed the speech of a white youth appropriating Black speech (Bucholtz 2011, 258; Table 2):

Table 2. Example from Bucholtz 2011

Over the course of the portrayal, this young man altered his phonological palette away from his normal speech (along with the use of lexical, suprasegmental, and grammatical alterations). He, for instance, vocalized his postvocalic /r/ in “your ass” and realized the diphthong /aj/ in “my” as a monophthong [mɑ:].

Morpho-Syntactic

Morpho-syntax can also be transposed into figure compositions. Take gender indexicals in Koasati. Haas (1944, 145) described how when Koasati men quoted Koasati women they adopted women’s forms and vice-versa (see Agha 2005, 57–58). Years later, Kimball (1987) found that Koasati speakers were now using those same ‘male’ forms, which they had otherwise set aside, only to report the speech of respected people from past generations, many of whom were deceased. Meek (2006, 100) likewise documents transposed morpho-syntax in her descriptions of representations of American Indian speech in American films (what Meek calls ‘Hollywood Injun English’). These depictions tend to exhibit unmarked tense and other non-standard English morpho-syntax; for example, a character in Disney’s Peter Pan says, “Squaw no dance, squaw get-um firewood.”

Lexical

As Meek’s examples further demonstrate, speakers also use transposed lexemes in represented discourse (see Voloshinov 1973, 137; Hanks 1993, 136). When James Joyce, for instance, writes ‘moocow’ on the first page of A Portrait of the Artist as a Young Man, the form is patently transposed into the mouth of the young narrator (see Banfield 1973, 32). Such transposed lexemes are often associated with the stances of particular individuals, kinds of persons, or linguistic registers (see Agha 2005). Like the other elements of figure compositions, they can give figures flesh.

That lexemes can be transposed into figure compositions also underlies a classic philosophical distinction between de dicto and de re utterances (Partee 1973). Banfield (1973, 5) explains the distinction with the sentence, ‘Oedipus said his mother was beautiful.’ This sentence has at least two readings: “(a) that Oedipus said that some one person who the speaker reporting his speech identifies as Oedipus’ mother was beautiful, or (b) that Oedipus said something like, ‘My mother is beautiful.’” (a) is its de re reading. Following that reading, the person representing the speech might be characterizing a host of utterances: for example, ‘Jocasta is beautiful,’ ‘My wife is beautiful,’ ‘The mother of my children is beautiful’ (these examples are from Coulmas 1986, 4). (b), in contrast, is the sentence’s de dicto reading, which implies that Oedipus had called Jocasta his “mother” in the original utterance. What is in question in deciding whether the meaning of the utterance is de re or de dicto is whether the lexeme mother is to be treated as originating from the figure of Oedipus himself. (Of course, the facts of the matter of what Oedipus actually said, presumably in Greek, are irrelevant here, as they are irrelevant in many discussions of represented discourse (Coulmas 1986, 6; see Tannen, e.g., 2007, 17 on “constructed dialogue”).)

The possibility of de dicto indirect speech, as in “Oedipus said his mother was beautiful,” underlines another respect in which the category indirect speech, as traditionally understood, can contain forms indexically tied to the origo of the quoted figure.[19] And de dicto readings are not exceptional in discourse, but pervasive (see Coulmas 1986). In fact, to repurpose Partee’s (1973, 415) contention that “the quoted sentence always has a de dicto interpretation,” we might say that the elements of a figure composition always have a de dicto interpretation. This de dicto interpretation is their essence as quoted elements, isomorphic with the fact that they are anchored in a given narrated event that is in some way distinct from the speech event.

The Body

That the non-sonically resonating parts of the body can also take part in figure compositions is well documented. Just how much the body can add to a representation of speech is apparent, for example, in the comedian Sarah Cooper’s impersonations of Donald Trump (Figure 2). Cooper lip-syncs (but really, more accurately, eye-, eyebrow-, face-, hand-, and shoulder-syncs) Trump, and sometimes his interlocutors, as original audio from his speeches and interviews plays. The result is a vivid underlining of the absurd bits of Trump’s language, re-embedded in a new body.

Figure 2. Sarah Cooper impersonates Trump using audio of him speaking (Source: “How to person woman man camera tv”; https://www.youtube.com/watch?v=j8oaaP68i4s; accessed September 24, 2020)

Some studies still discuss represented discourse as if it were only either sonic or written, and many linguistic anthropologists continue to exclusively use audio recordings, rather than video recordings, of interactions, even when those interactions occur in environments where participants also have visual access to one another. But this is changing. In the last decade, research on re-enactments, bodily-quoting, constructed action, and the multi-modality of reported speech has blossomed (e.g., Clark and Gerrig 1990; Haviland 1993; Streeck 2002; Sidnell 2006; Goodwin 2007; Keevallik 2010; Keevallik 2013; Sandlund 2014; Cormier, Smith, and Sevcikova-Sehyr 2015; Stec, Huiskes, and Redeker 2015; Stec, Huiskes, and Redeker 2016; Hodge and Cormier 2019). While some of this research draws a distinction between representations of bodily movements and spoken discourse, the concept of figure composition is in line with a growing consensus that “verbal and bodily quoting are essentially the same kinds of activities” (Keevallik 2010, 402).

Here is a simple example that shows speech and the non-sonic body working together: a speaker composes a figure (she says, “No!”) and, as she does so, she raises her hand up, palm facing outward, in a “please stop”-like gesture (Streeck 2002, 193), quoting the hand movement alongside the speech (e.g., Haviland 1993, 28–29). This gesture takes a “character viewpoint” (McNeill 1992, 190), transposing the indexical ground of the body such that it is treated as if it were emerging from the enacted figure. Sometimes such character viewpoint compositions include much more than the hands. In Keevallik’s (2010) examples of bodily quoting, for instance, a dance instructor corrects a student by demonstrating “the wrong way of leading the sugar push.” Evident in these examples is the sense that a person’s body is often the best stand-in for a figure’s body (Sweetser 2012, 13; cited in Stec, Huiskes, and Redeker 2016, 3).

But interactants also have the capability of producing “observer viewpoint” gestures, in which parts of the body (especially the hands) compose distinct parts of figures (e.g., not just hands): such gestures “take place at arm’s length from the observer, as if the hands were detached from the body, self-sufficient organs of representation” (Streeck 2009, 207; citing Sauer 1999, 221). For example, as a speaker describes a blob rising up a drainpipe, he moves his hands up, iconically presenting the blob and its trajectory (McNeill 1992, 191). While the relation between observer and character viewpoints has often been described as analogous to the distinction between direct and indirect report (e.g., Parrill 2012, 104), note that what distinguishes the two viewpoints here is not whether the origo of action has been transposed—it has, in both cases—but whether the gesturer’s body is portraying the figure’s body. These different “viewpoints” are thus not fundamentally distinct, but rather they use distinct principles of composition that align with the narrating and narrated environments in different ways (see Russell 2012; cf. Haviland 1993).

During interaction, speakers often alternate between these two perspectives, as they also compose dual viewpoint gestures or “chimeras” (McNeill 1992, 124; Parrill 2009). Some of these involve both observer and character viewpoint, while others involve compositions of multiple figures occurring simultaneously. Take the following example (McNeill 1992, 124; originally from McClave 1991), in which two character viewpoints are represented simultaneously. Here the speaker points to his own body as he reports, “[you] had your doctor go over to check out that person’s claim.” In doing so, his pointing hand stands in for the hand of the figure doing the pointing, as his body stands in as the figure being pointed to.Footnote 20

What is remarkable about the research into the bodily dimensions of represented discourse—that is, the issue which examples such as the above make so astonishingly clear—is that not only can the body play an integral part in figure compositions, but the body itself is divisible into different elements. That is, some parts of the body may play a role in a figure’s composition while other parts of the body are playing no such role; or, two body parts may play different roles. To capture this, many who work on bodily communication have analytically divided the body up into different articulators, for example, the head, face, eyes, arms, and torso (Cormier, Smith, and Sevcikova-Sehyr 2015, 1). These physically defined articulators, in turn, have been shown to have unique affordances. The eyes, for instance, are the only human organ capable of, and construable as, both giving and receiving visual information.

These semiotic affordances of different parts of the body are especially clear in studies of sign language (see Stec, Huiskes, and Redeker Reference Stec2016, 1), where the most careful work on multi-modal represented discourse (under the umbrella of ‘role shift’ or ‘constructed action’) is done, and where the languages being studied have the most developed semiotic resources for using the non-sonic parts of the body as elements in figure compositions (see Cormier, Smith, and Sevcikova-Sehyr Reference Cormier2015; Stec, Huiskes, and Redeker Reference Stec2016; Hodge and Cormier Reference Hodge2019 for discussion of this literature and its relation to work on represented discourse in spoken languages). I might have thus been justified in separating this section with sub-sections on the eyes, the hands, the body, and so forth—as I did above regarding deictics, suprasegmentals, segmentals, et cetera.Footnote 21 If I were to have done so, the distinctions I drew among elements would have been as relatively arbitrary as those distinctions I drew above. In practice, whether it is worth distinguishing one element of a composition against others in relation to any stretch of empirical material always depends on the empirical facts, on whether these bodily components are being used in meaningful and relevant ways.

Recent work on the semiotic dimensions of the body shows the body’s import in many of the communicative environments linguistic anthropologists have studied. So much so that we might wonder what we have missed in classic studies of represented discourse. What, for instance, might Don Gabriel have done with his eyes, hands, and mouth as he gave voice to the many figures in his story of the death of his son? What might the two boys playing ping-pong that Hoyle (1993) describes and Agha (2005, 50) further analyzes have been doing with their bodies as they narrated their game as if they were sportscasters?

Orthography and Computer Media

Reflecting on the body’s role in figure compositions shows the flimsiness of the boundary between speech and other semiotic activities. As such, it opens our way toward thinking of how other modalities might afford represented discourse. The orthographies in which language is written provide an obvious example, as changes in medium, font, formatting, layout, and spelling, among other formal features, can all become elements of a figure’s composition in written or multi-modal discourse (Clark and Gerrig 1990, 786; see Jones and Schieffelin 2009; Hoffmann-Dilloway 2011). Emojis offer an obvious site of interest in this regard (Danesi 2016). We might also think of the represented discourse millions of people create in their own video productions that they later post to websites such as YouTube. From ticky-tacky effects filters that, say, make one’s face look like a raccoon or adorable bear to deep fakes that appear to capture the whole essence of a person, computer-mediated platforms offer a range of new possibilities for figure compositions with which people are currently experimenting.

Costume, Props, and Other Non-Corporeal Semiotics

Discussion of these less traditionally recognized elements of represented discourse also draws attention to a host of additional elements that can be used in compositions to make figures palpable. The use of physical props, makeup, and costumes is most obvious and well documented in theatre or film, but small shows with costumes and special effects, so to speak, occur in ordinary interaction as well, as people adjust their glasses or use props like napkins or pieces of paper to compose figures (Hall, Goldstein, and Ingram 2016, 85; Goffman 1974). There are also extreme cases where what is an element in a figure composition and what is an element of a person are physically indistinguishable. When Christian Bale prepared to portray Dicky Eklund in the movie The Fighter, he shadowed the real-life Eklund to adopt his “distinctive mannerisms and speech patterns,” what some who knew Eklund called “Dickynese” (Lim 2010). But he also lost a third of his weight to portray Eklund’s gaunt body, hollowed out from drug addiction. Is this latter weight loss an element of Bale’s composition of Eklund? One could argue for or against the idea, but the question brings to mind a host of other questions regarding not how we define the analytic of composition per se, but regarding how compositions integrate with social life. Bale’s case invites us to think of many of the things that people do which can blur the line between altering oneself and portraying another.

One Variety of Composition Effect: The Gradient Resolution of a Figure

The notion of figure composition is useful not just because it allows us a better language for describing the form of represented discourse, but because it offers a vantage from which we can inspect the relation between that form and social and semiotic action. To capture the various interactional entailments a figure’s composition can have I use the term composition effect. In this section, I return to Noy’s story to describe one variety of such an effect, in which a series of compositions gives the sense, intertextually derived, of the gradient resolution of a figure. Here the cross-modal architecture of figure compositions across events of represented discourse makes some figures appear lower and others higher resolution. Such gradient resolution of figures can serve as a diagrammatic icon of something else—the arc of a story, for instance. In fact, as many have found, narrators often incorporate more robust compositions at the end of a narrative, increasing vividness (Mathis and Yule Reference Mathis1994, 67). In Don Gabriel’s narration of his son’s death, for instance, he moves from less to more reported speech, and begins, as Hill (Reference Hill1995, 115) describes, to incorporate more and more “direct” reports (cf. Hymes Reference Hymes1981, 321 on “vocal realization”). A gradient figure might also be used as a diagrammatic icon of the competency or lack of competency at some skill: Keevallik (Reference Keevallik2010, 420), for instance, describes how dance instructors portray the incorrect stiff dancing of students and correct dancing with very different compositions: they move less stiffly, more fluidly in the correct demonstrations, accompanying their moves with on-time snapping and singing with a breathier and more passionate voice (see also Weeks Reference Weeks1996, 274). Or the effect might be used to contrast different figures in a story, whose compositions model dimensions of their characteristics. 
In her description of the story that a young medical resident told about his day working in the emergency room, for example, Tannen (2007, 123–124) writes that:

The paralinguistically exaggerated role-play of Billy’s voice, and the slightly less marked animation of his friends’ voice, both emotion-filled, contrast sharply with the relatively ordinary quality of the voice in which the speaker/hospital staff dialogue is represented. These contrasting voices create the dramatic tension between the unreasonable behavior of ‘these three drunk guys’ and the reasonable behavior of the speaker/staff. This contrast highlights as well the central tension in the story: that the visual display of blood and the extremity of the boys’ emotional display were out of proportion to the severity of the wound.

The notion of gradient resolution is thus an umbrella term for a broad range of composition effects. As such, it reminds us that figures of personhood are not always treated interactionally as monolithic types (Agha 2005), but gradiently evocable, and that this gradience can itself be a tool for effective semiotic action.

This is exactly what happens in Noy’s story of the Hmong vendor: the gradient resolution of his figure composition comes to underline the structure of his irrationality.

Noy’s Story

Noy, along with most Lao now living in America, was a refugee in the early 1980s. Fleeing from Laos, her young children in tow, she moved into a series of camps before finally settling in the United States. At the time of the interview, she, along with a few other members in the Lao-American community, was teaching me to speak Lao, which gave the interview, conducted mostly in Lao, a tacit pedagogical frame.

Her story of the Hmong vendor in the market was situated against a background of inter-ethnic tension in Laos, both at the time she lived there and at the time of the interview (see Baird Reference Baird2010). Noy was sensitive to this tension, and voiced genuine sympathy for Hmong people, even as her story further circulated discourses of Hmong as irrational, uneducated, and linguistically incompetent. For example, she told me that when she was a child in Vientiane, Laos’s capital, “If a Lao and a Hmong person fought, it was always the Hmong person who was blamed.” She also preferred to use the term Hmong, rather than the offensive ethnonym “Meo,” which was common when she was a child. When she used the latter to represent her speech in the past, she corrected herself on a few occasions by repeating the ethnonym Hmong afterwards.

Noy vividly remembered Hmong marching into markets in Vientiane when she was a child, in single-file lines with baskets tied to their backs filled with brooms for sale. When young Lao children heard that the Hmong were coming, they would get excited, and Noy demonstrated this by inhabiting the figure of an ebullient child: she smiled, shook her arms to mimic running, and called out, “Come see the Meo! Come see the Meo!” She also remembered Lao children playing with the sound of “Meo,” which is similar to both the Lao word for cat and the onomatopoeia for a cat’s vocalization. The children would meeow meeow like cats at the Hmong broom sellers marching into the market. Most of the Hmong broom-sellers would walk by solemnly, but Noy remembered one young Hmong man who took a broom from his basket and hit one of the children.

It is such scenes of inter-ethnic exchange and tension that form the background of her story’s punchline: “When you go buy brooms from Hmong broom sellers, be careful!” The story purports that Hmong broom-sellers cannot understand a kind of bargaining we might call “generalized negotiation.” Generalized negotiation is the repetition and multiplication of a bargain already made. For example, imagine that a merchant agrees to sell two brooms for four dollars. If generalized negotiation is holding, then one could also buy four brooms for eight dollars or 400 brooms for 800 dollars: the deal scales. Hmong people, according to the story, neither allow nor understand this scaling. Two brooms for four dollars means only two brooms for four dollars. Any other brooms you might want to purchase would require further haggling. The heart of the bigoted story, as Noy elaborated, is that Hmong people were too literal-minded, undereducated, and hard to understand when they spoke Lao to practice generalized negotiation.

The Emerging Figure of a Hmong Man

Noy’s narrative begins in earnest with a representation of dialog between a figure of herself and the—as of then—minimally described generic Hmong person. She says, “Let’s say you ask Hmong people the price of a single broom, and they (khacaw4)Footnote 22 say that, for example, one broom is three kip [Lao currency].”

Transcript 1Footnote 23

For Transcript [1] there are two signs of transparency, signs that Noy is speaking from the figure of a Hmong broom-seller in a market: a verbum dicendi that just precedes line [1a], and the contextual fit of the referential content—i.e., the broom-seller is expected to be the one providing the price of brooms. There are neither transposed deictics nor non-transposed deictics. Rather, the figure’s composition is comprised of the lexical choice of “3 kip.” Notably, the choice of price itself is anchored to the narrated scene’s origo, as it is historically a more likely price for brooms from the time when Noy was a child in Laos. At the time of the interview, one would not be able to find a broom for sale for 3,000 kip, let alone 3.

This line, “3 kip,” is, in comparison to the compositions to come later in the story, low-resolution. Notice, for instance, that Noy’s co-speech gesture represents the narrator’s action instead of the Hmong figure (Goffman 1979, 151). As she utters the word kiip5—as part of the phrase meaning ‘3 kip’—in [1a], she lifts both her hands up, palms facing upwards, with her arms slightly outstretched from the sides of her body. This gesture, a shrug, represents the arbitrariness of the number that Noy, in the role of narrator, has chosen. She follows it by saying that 3 kip is just “an example,” and then repeats “3 kip” and shrugs again.

In the utterance in line [5a] of Transcript 2, Noy introduces the Hmong figure’s interlocutor, some apparition of herself, the potential buyer of brooms in the story, with a frustrated response cry: “Qooj!” (Goffman 1978). The cry clearly has a transposed origo, emanating from the emotional state of the figure of Noy in the market, not Noy the narrator sitting across from me during the interview. It brings this new figure, the Hmong man’s interlocutor, into focus.

Transcript 2

Immediately after this, in line [5b], the figure then addresses the Hmong man, using the term phòò1 siaw1, which creatively indexes him as a male and further ethnicized figure: as Noy explains to me, the term phòò1 siaw1—meaning ‘close friends’ father’—is how Hmong men prefer to be addressed.Footnote 24 As Noy says it in [5b], she also represents the emotional state of the figure of herself with a gesture: her right hand, with her index finger extended, moves from the top right of her gesture space to the bottom left, like a whip, bringing to life the stance of a frustrated negotiator, who, as it were, “crosses out” the Hmong man’s suggested price.

In comparison to Transcript [1] above, the figure of the Hmong man in Transcript [2] is more elaborately composed. In Transcript [3], this gradual elaboration continues.

Transcript 3

Transcript [3] begins immediately after the figure of Noy asks the Hmong man if he would make her a deal and lower the price of the brooms from ten kip to five (confusingly, Noy apparently forgot the price of brooms in her story and changed them from three to ten kip each). In [17], the figure of the Hmong man responds to the figure of Noy’s suggestion of price with a definitive no. The Hmong man’s figure emerges with a robust array of semiotic resources: first, Noy represents his drawn-out, hesitancy-indexing response cry “oh!” and then she represents his speech, saying “[I] can’t [sell at that price]” [17]. As she says this, she shakes her head [17], iconically paralleling the negation in the Hmong figure’s words and producing an image of his head, moving back and forth in the hypothetical market.

After line [17], the narrator’s voice enters again for a moment and Noy clarifies that this is the Hmong man that she is quoting through two semi-redundant verba dicendi [18a-18b].Footnote 25 These verba dicendi reintroduce the figure of the Hmong man [18-18a], who repeats that “no, [he] will not sell his brooms for five kip.” Noy voices him twice in succession. As she does so, she uses first-person pronouns [19a-19b: haw2], and character viewpoint co-speech gestures—i.e., as she says one broom, she raises her left hand with one index finger extended, paralleling the number one in the figure of the Hmong man’s words [19b-19c].

In comparison to Transcript [1] above, Noy’s composition of the figure of the Hmong man is at something of a high resolution here: the response cry [17], the first-person pronouns [19a-19b], the head-shaking [17], and the raised finger [19b-19c] cross-modally voice the figure of the Hmong man and together comprise an elaborate semiotic architecture. In Transcript [4], she adds yet another layer to this figure composition by enacting and then inhabiting a “Hmong accent,” comprised of altered segmental and suprasegmental forms.

Transcript 4

The crucial moment here is the segmental and suprasegmental disjuncture between lines [27] and [29]. In line [27], Noy voices the figure of the Hmong man much as she had done before,Footnote 26 with her head movement again paralleling the negation in the figure’s speech. In contrast, in [29], Noy makes segmental alterations, and her stress and pitch shift and her vocal cords creak [Figure 3]. As Noy shifts her voice, she moves her head and her hands to emphasize these shifts, bobbing them up and down, paralleling the oscillations in the sound from her mouth [Figures [4] and [5]]. After this display, Noy explains what she had just done with her demonstration: “They [i.e., Hmong people] would speak Lao incorrectly” [30]. As she says this, she moves her hand to her mouth and out again, creating a physical image of the sounds of Hmong speech.

Figure 3. The Pitch Contours of Lines [27] and [29]

By line [29], Noy puts the form of her language on display, highlighting the texture of the imagined Hmong man’s voice and emphasizing the palpability of his language. She does this through juxtaposing line [27] with line [29]—which have very similar lexical content, but quite different figure compositions. Following Roman Jakobson’s classic discussion of the poetic function, this is a moment where the hierarchy of linguistic functions has been reordered. Whereas the referential function is most dominant in [27], in [29] the poetic function—characterized by an orientation towards the form of language (or, in Jakobson’s vocabulary, the message)—is thrust into focus (Jakobson 1956; Jakobson 1960). To play on the title of Dell Hymes’s (1981) classic paper, line [29] is a “breakthrough into poetics,” a moment where the contrastive individuation of the figure’s composition (Agha 2005, 54) is made a thematic focus and brought to attention in and of itself.

This poetic breakthrough happens through both aural and visual modalities, which work in concert to stress the sound-shape of Noy’s language. Noy’s hand and her head move alongside her words, bouncing and emphasizing the rhythm and texture of her speech. In contrast to the beginning of [27], where Noy’s head movements emphasize the referential content of the Hmong figure’s negation, and thus form a part of the composition of that figure, by the end of line [27], Noy has already changed the primary function of her body’s movements. They are now mediators of attention (Streeck 2009): her head traces the final tonal contour of the last word of the line, thùn2. Noy’s corporeal poetics become still more exaggerated in line [29].

In Figures [4] and [5], I have mapped out the relationship between Noy’s speech and her bodily movements in line [29]. Figure [4], which was roughly traced from a still of the video and modified slightly to protect Noy’s identity, shows the axes on which Noy’s hands and head move. The small hands in Figure [5] likewise represent her alternately raised and lowered hand, with her pinky outstretched. The face in Figure [5] represents Noy’s up-and-down head movements. Notice that her head and her hand moved more or less in opposite directions, and that the correlation between the movements and the sound is inexact, both in the Figure and in the video. The movements happen in proportions of either 1:1 or 2:1 to the syllables and are relatively regular until the two final syllables khaat5 thùn2.

Figure 4. The axes along which Noy moves her body during line [29]:Footnote 27

Figure 5. Noy’s Pitch Contours and Gestures

After line [29], when Noy characterizes the segmental and suprasegmental elements she has just performed—“They would speak Lao incorrectly,” line [30]—she thereby typifies her breakthrough into poetics as a contrastive enactment, in which putatively “correct” and “incorrect” pronunciation are juxtaposed cross-modally, and where the latter forms are explicitly tied, for me, the novice speaker, to both this specific represented Hmong figure in the story and Hmong people, generically. In this small moment, we see how the poetic underpinnings of figure compositions can be formed and circulate from person to person—here from Noy to me, a then novice Lao language learner. Such acts lay the groundwork for these formal features to later be presupposed in less obvious ways (Agha 2005, 55) and used in figure compositions.

A High-Resolution Figure and a Second Story

In fact, throughout the rest of her story, Noy continues to fade in and out of the phonetic alteration demonstrated in her “breakthrough into poetics” as she voices the figure of the Hmong man. The “accent” is characterized by a slower rhythm, irregular creak, and lengthened and stressed syllables at the end of intonation units, as was the case in line [29].Footnote 28

With the addition of the phonetic element to the figure of the Hmong man’s voice, the composition of the figure is now in a relatively sustained high resolution: with a dense clustering of transposed deictics, character perspective gestures, and a differentiated phonetic form. As the story progresses, the two characters finally reach an agreement on the price of brooms: two brooms for eight kip. It is then that the figure of Noy tries to buy six brooms for 24 kip, following a kind of “generalized negotiation,” and the Hmong man emphatically rejects her offer. This is the denouement of the story, the point where the joke emerges: “Hmong people don’t get it.”

Transcript 5

Notice the richness with which Noy composes the Hmong figure in lines [98-100b]. Every intonation unit is patterned with a co-speech gesture that represents the bodily movements and the content of the figure of the Hmong man. In addition, the segmentals and suprasegmentals of her speech are altered, contrasting starkly with her voice qua narrator.

After she tells the story, Noy explains to me that, until recently, she never believed its premise. Instead, she always thought that Lao people told it because they were racist and because they hated Hmong people. But when she bargained with a Hmong woman in an American market, to her surprise, she witnessed the stereotype come true.

As above, Noy tells this second story through represented discourse. The composition of the Hmong woman in this story, however, has lost the phonetic weight that the figure of the Hmong man had at the other story’s conclusion. In addition, Noy uses some co-speech gesture, but it is done with less vigor. The figure of the Hmong woman is less elaborate, less ornamented, in a lower resolution.

In this story, Noy bargains with the woman for bundles of lemongrass instead of brooms. The figure of the Hmong woman says that she will sell three bundles for two dollars. Noy agrees and takes nine bunches of lemongrass, planning to buy them for six dollars (following the logic of generalized negotiation). As she does this, her husband warns her that Hmong people do not bargain like that, and she tells him to be quiet. “They’re in America already,” she says and begins to hand the Hmong woman six dollars. At this moment, the figure of the Hmong woman seems to break out from Noy’s body: “No, I won’t sell [at that price].” The voice of the Hmong woman is brought back to the resolution it had in the previous story, albeit at a slightly higher pitch to match the figure’s new gender. The final syllables of Noy’s speech are lengthened. There are elaborate co-speech gestures and exaggerated prosodic contours. The figure’s composition is, in comparison to the figure at the beginning of this second story, a collage of heterogeneous elements.

It is no surprise that Noy’s high-resolution voice of the Hmong figures reappears at the same time that the supposedly illogical bargaining does. Her representation of the Hmong figures’ communication in that moment in both stories is hyper-contrastive with the voice of the figure of herself in the market and even more contrastive with her restrained narrative voice. This radical contrast between Noy’s Hmong figures and her other figures models the narrative arc of both of her stories. They are about how Hmong people are different: they talk differently, and they think differently; likewise, the figure of the Hmong man and the Hmong woman become something different, a foil for the “reasonable” Lao person. After the joke, Noy explains: “Hmong people are literal people, if you agree on something, they really stick to it.”Footnote 29

Composition and Transparency Effects

The gradient resolution of a figure is just one cluster of composition effects common enough to label,Footnote 30 but analytics of transparency and composition help us distinguish a host of such effects. For instance, take the composition effect that Voloshinov (1973, 134) called particularized direct discourse, representations in which “the traits the author used to define a character cast heavy shadows on his directly reported speech.” In these quotidian cases, the elements of a figure composition are taken to be notably characteristic of figures (see Goffman 1974, 534–536 on “mockeries and say-fors”). Narrators can use these elements to project those figures “into particular social roles by putting particular emblems into their mouths” (Wortham and Locher 1994, 11; see also Couper-Kuhlen 1999, 15; Couper-Kuhlen 2007, 119). In their analysis of Trump’s mockery of his opponents, Hall, Goldstein, and Ingram (2016, 85) capture good examples of this, where Trump’s compositions are constructed with elements that are metonymic of their targets: Hillary Clinton’s bookishness is alluded to by Trump’s representation of her face buried into a piece of paper, Mitt Romney’s boring seriousness is indexed by a stiff body, and “low energy” Jeb Bush is characterized with his hands folded under his cheek as if he were falling asleep.Footnote 31 These characterizations through composition often work in concert with characterizations made by means other than represented discourse, such as explicit descriptions (e.g., ‘he’s a loser’), but they are formally distinct from these other means, as they also afford different kinds of social and semiotic action. As Clark and Gerrig (1990, 793) put it, “Many things are easier to demonstrate than describe.”

Analytics of composition and transparency also allow us to disentangle composition effects from transparency effects. For instance, they point the way toward distinguishing and thus better understanding two kinds of ‘double voicing’ that have been discussed in the literature. One is compositional, where the composition is so clearly selected—so, for example, patently parodic (Goffman Reference Goffman1974, 537; Voloshinov Reference Voloshinov1973, 136–137) or emotionally motivated—that the embedding indexical ground—namely the indexicality of speakership associated with the animator of the utterance (or, perhaps, some other responsible entity; see Irvine Reference Irvine1993)—comes to the fore. The other is a transparency effect, where who exactly is speaking or acting is less clear, and where the question is often whether represented discourse is happening at all.

Sometimes this latter opacity of origo is by design. That is, at times speakers aim to muddy the waters so as to incorporate a figure’s style into their own language. In her description of novice Nepali Sign Language (NSL) learners studying visual depictions of new signs, Hoffmann-Dilloway (2020, 127) shows that as signers become better at using these signs of NSL, they also seem to shift to “performing them in ways aimed at yielding identification with the portrayed figures of personhood” in the images from which they learned. But there are also cases of transparency-based double-voicing where the line between performance and performer becomes porous in spite of efforts to portray the represented figure as distinct. How far, as Goffman (1974, 539) put it, can something be mimicked “without the mimic becoming suspect?” How much taboo language can someone employ without becoming responsible for that taboo language in the first place? In the 2008 movie Tropic Thunder, in which Ben Stiller and Robert Downey Jr. both play actors playing actors, this issue comes to the foreground. In one scene, Ben Stiller’s character narrates how diving deeply into performing a mentally disabled person affected him—leaking into the way he brushed his teeth and the way he rode the bus, for instance. Robert Downey Jr.’s character responds that the role was a risky career move that may have dimmed Stiller’s character’s chances of winning an Oscar. “Everybody knows you never go full retard,” Downey Jr.’s character says. This scene, in a movie about making a movie, for which Downey Jr. himself was nominated for the Academy Award for Best Supporting Actor, has itself become the subject of controversy on exactly these same lines. As Downey Jr.’s character, a White Australian actor, talks to Stiller’s character, he is in fact wearing blackface and speaking with altered suprasegmentals, syntax, and vocabulary—performing his role as a Black soldier. 
The most prominent comments on the YouTube video at the time of this writing referenced this blackface and reflected on whether it was fundamentally offensive or satirically funny, and how it should, should not, or might have led to Downey Jr. (the real, living actor) being “canceled” (for a related discussion, see Chun 2004).Footnote 32 That this conversation is occurring highlights how some elements of figure compositions can be treated as unperformable for some classes of individuals even when done through represented discourse or alongside other metapragmatic efforts at containment (Irvine 2011). When an actor such as Downey Jr. puts on blackface to satirize a character using blackface, he is liable to be, in effect, construed as merging in responsibility with the repugnance of the character he is representing in the film. In this way, in contemporary American culture, blackface is treated as what Fleming (2011) calls a ‘rigid performative,’ a form that keeps its effects no matter how people attempt to contextualize their uses of it.Footnote 33 In their rigidity, such performatives deny full transparency, at least insofar as responsibility is concerned.

As linguistic anthropologists have shown again and again, even language that is not marked off as any kind of represented discourse often leaves open the question of which figure is speaking. It is this double-voiced dimension of much discourse that, as Bakhtin (1981, 330) put it, “can never be exhausted … never extracted fully from the discourse – not by a rational, logical counting of the individual parts, nor by drawing distinctions between the various parts of a monologic unit of discourse (as happens in rhetoric), nor by a definite cut-off between the verbal exchanges of a finite dialogue, such as occurs in the theater.” It is also this same characteristic which gives Goffman’s (1979) account of footing its expansiveness, as it involves not just cases of clearly demarcated represented discourse, but anything we say that keys different figures, participation frameworks, and production formats in ordinary discourse itself. Broadly, the subject of represented discourse shows us that everything has a bit of this capacity, that ordinary transpositions are an essential part of the fabric of normal discursive construction, whether anyone is quoting anyone else transparently or not.

Composition as Analytic

When one begins to consider figure transparency, it can feel as if the floor comes out from under the notion of represented discourse entirely. Recognizing the pervasive heteroglossic opacity of speech foregrounds a fundamental instability as to who is speaking at any point, not just when someone is quoting another. But this uncertainty is core to semiotic processes generally, and the apparently solid floor of semiosis is always built on sand liable to shift. As many have shown, this instability comprises an especially interesting part of social life.

Figure composition is valuable in part because it offers a vocabulary for clarifying this instability, for specifying how a stretch of discourse might fail to meet a certain threshold of transparency. But the analysis of figure composition takes the interpretive instability of the indexical ground of represented discourse as a starting point, not its focus. Figure composition, as I sketch it here, is a tool for analyzing the form of represented discourse when this instability is less prominent, passing the threshold of transparency in some uncontroversial way. It is worth taking up because it allows us to think through how compositions of represented discourse effectuate social action in a manner that is tidier and more exacting than the dichotomy of ‘direct’ and ‘indirect’ report, because it enables us to attend to the multi-modality of represented discourse as we encounter it, and because it offers a tool for exploring the general finding that the form and semiotic organization of represented discourse, not the mere fact that it has happened, is key for understanding represented discourse’s role in social action.

Footnotes

This paper has long been in the works. It was originally written as a term paper for Judith Irvine’s course at the University of Michigan, then reworked under the guidance of Michael Lempert for the Society of Linguistic Anthropology’s Annual Graduate Student Essay Competition, for which it received an honorable mention. Both Irvine and Lempert offered exceptional guidance in these early days. Constantine Nakassis’s comments as discussant for a version of the paper at the 2012 Michicagoan conference also aided me in boiling down some of the paper’s key ideas. Since then, many have helped me formulate my argument. Thanks especially to Kimberly Ang, Meghanne Barker, Niko Besnier, Jillian Cavanaugh, Nick Enfield, Cynthia Gordon, Webb Keane, Paul Kockelman, Alaina Lemon, Barbra Meek, Susan Philips, Kamala Russell, Joshua Shapero, and Deborah Tannen. Most recently, the paper has benefitted from the advice of two anonymous reviewers and the journal’s editor, Asif Agha. Jack Sidnell pointed me toward several of the examples I use, and Jamie Roux created the image of Noy in Figure 4. Finally, I must thank Noy, and apologize that out of all the wonderful things she has said to me, I chose to analyze this singularly ugly story. All mistakes are my own.

2. For examples of contemporary work exploring figures of personhood, see Barker (2019) and Prentice (2020).

4. I use origo here as shorthand. As I do so, I am mindful of Hanks’s (1990, 42) criticism of origo as an overly egocentric notion: indexical grounds are best understood as relational wholes, which can involve complex embeddings obscured by normal ways of talking about the here-and-now. As Sweetser (2012, 11) writes, “our everyday construal of personal viewpoint is a blend. It is a blend that is so common that it is hard to notice it” (see also Hanks 1992).

5. After Agha (2005, 39), I use “figure,” rather than voice, dialogue, or quotation, to emphasize that these elements are multi-modal, that is, that they can be segmental, suprasegmental, gestural, or even sartorial. This paper thus follows the lead of those who have suggested a more comprehensive, multi-modal understanding of represented discourse (e.g., Clark and Gerrig 1990, 772; Goodwin 2007; Stec, Huiskes, and Redeker 2015; and the many citations below). I use the somewhat less satisfactory term “represented discourse” because it is difficult to find a suitable substitution. Many have pointed out that the difference between enactment, reported speech, etc. is merely one of modality, and have struggled to find a subsuming term. For my use of represented discourse, I stipulate that “discourse” includes all semiosis.

6. Hanks (1990, 38) helpfully captures how deictics work with the following formulation: “‘the x in relation R to y’… where x is the object referred to, and y is the indexical ground.” Different deictics encode different Rs. In cases of deictics used in transparent represented discourse, the indexical ground (y) is transposed (or decentered, in Hanks’s vocabulary) and the referent is determined in relation to that new transposed ground.

7. Agha (2005, 43) offers four features that can make a voice more transparent: (a) metrically contrastive text segments; (b) segments that are linked / demarcated by a clause boundary; (c) segments that differ in deictic / indexical origos; and (d) segments with matrix clause NPs that denote voiced participants.

8. Hanks (1990, 217) represents this distinction in the diagrams in his book by using a dotted arced line for the latter ‘unstated transpositions’ and a solid arced line for the former ‘referential projections.’

9. On this distinction, Jespersen (2007, 290) wrote that when one wishes to quote another, she has two major options: (1) to give or purport to give “the exact words of the speaker (or writer): direct speech (oratio recta)”; or (2) to “adapt[…] the words according to the circumstances in which they are now quoted: indirect speech (oratio obliqua).”

10. For example, Rumsey (2020, 2) writes that, in direct representations, “the indexical elements that are normally used to ground an utterance in the here and now are alternatively used to ground it in some other real or imagined speech situation that is being represented, or ‘reported’ in the present one.”

11. My approach differs in two obvious ways from parallel approaches. First, the elements I offer are defined in relation to their form, not in relation to how they are perceived; and second, I distinguish composition from transparency and dispense with the distinction between direct and indirect report, as other approaches have not done.

12. So must we take this into account as we look at the sorts of compositional effects I discuss below: as we determine what is rote or unnoticed and what is motivated by immediate interactional needs (see Coulmas 1986, 14).

13. Hanks (1990, 66) writes, “There is a strong parallel between deictic reference, on the one hand, and the asymmetric duality between Figure and Ground, on the other. The discreteness, individuation, definiteness, and singularity that are the hallmarks of deictic reference are all typical Figure characteristics. The diffuseness, variability, and backgrounded character of the indexical ‘zero point’ is due to its being, in fact, the Ground upon which the referential Figure is defined.”

14. Excluding fortuitously coherent examples, of course.

15. See also Evans’s (2012) discussion of biperspectival forms.

16. For certain analyses it might make sense to make finer distinctions and consider some of these as different elements; in other cases, these elements work together. In one study, for instance, Klewitz and Couper-Kuhlen (1999, 468) found that suprasegmental shifts tended to happen in clusters: shifts in pitch, for instance, rarely occurred in isolation. When such shifts did occur, furthermore, they tended to occur over intonation phrases or longer stretches rather than words (1999, 462).

17. Hill (1995, 126) writes that “in all Mexicano narratives I have examined, the ‘faithful’ reproduction of intonational contours is a focus of the representation of reported speech. Mexicano storytellers attend to intonation as much as English speakers attend to pitch …”

18. On the latter point, many have noted that represented discourse without verba dicendi tends to have figures with more suprasegmentally elaborated compositions (Tannen 1986; Hanks 1993, 139; Mathis and Yule 1994). But others have found this does not account for their own data (Klewitz and Couper-Kuhlen 1999, 469; see also Kvavik 1986, 356; Günthner 1999, 691).

19. However, if the possessive pronominal is transposed and becomes ‘my mother,’ that seems to force a de dicto reading on the entire noun phrase.

20. For an example of how viewpoint can produce composition effects, see Sauer (1999) and Streeck (2009, 207).

21. That the eyes are frequently an element of figure compositions has been carefully documented. Scholars have found that people “use gaze to portray the gaze of the participants in the original event,” and “to visually designate their recipients to stand in for characters in the original event” (Thompson and Suzuki 2014, 26). The eyes have also been shown to be a mechanism for non-composition-based transparency, for “parsing the larger telling into interactionally relevant units” (Sidnell 2006, 394).

22. The pronoun Noy uses to refer to “Hmong people,” [khacaw4], is normally glossed as a third person plural pronoun, but Noy uses it to refer to the singular Hmong man (Enfield 2007, 77). As Enfield writes, “The [Lao] system of definite person reference is not a closed one. It is highly flexible, and permeable” (Enfield 2007, 78). This permeability in person reference aids Noy in characterizing both the singular figure of the Hmong man and Hmong people generically (on generics, see Zuckerman 2020; 2021a; 2021b).

23. To represent Lao, I am using Enfield’s (2007) system, in which numbers identify lexical tones. In my transcriptions, the first column represents the name of the animator of the utterance, here either Noy or me (Zuckerman). The second column is the line number (and rows with the same line number represent intonational unit breaks, roughly demarcated [Chafe 1987]). The third column is my transcription of the original speech, the fourth column is my free translation of this speech, and the fifth column is a description of any co-speech corporeal movement. The movement occurs during underlined segments of the original speech (Column 2).

24. Noy continues to use this term throughout the imagined interaction and uses mèè1 siaw1 (“friend’s mother”) when addressing the figure of herself from the perspective of the Hmong man. After the interview, I saw Noy using these terms with Hmong people in a non-imagined market context. For more on the notion of siaw1, see Zuckerman (forthcoming).

25. Notice, however, Noy’s alternation from phòò1 siaw1 to Meo.

26. It is clear that the pronoun, khacaw4, is introducing and classifying the figure of the Hmong man, the author and principal of the reported speech, but it is ambiguous as to whether it is the subject of the following clause (thereby meaning, “They can’t sell at that price, [they] won’t make a profit”) or if it is a nominal in the utterance’s “extraclausal Left Position” (Enfield 2007, 161). If it is an extraclausal nominal, the utterance would be something more like classic direct discourse: “Them: [I] can’t sell at that price, [I] won’t make a profit.” This ambiguity is possible because in Lao the verb is not marked for the subject and “zero-forms” are acceptable in subject position.

27. Jamie Roux prepared this image, traced from a video still.

28. In line [29] the difference is insignificant. The line’s final syllable was 0.42 seconds long, whereas the final syllable in line [27] was 0.38 seconds long. The difference, however, becomes more stark later in the interaction. For example, kiip5 is used by both Noy’s figure of herself and the figure of the Hmong man in final position. The figure of Noy’s token is 0.14 seconds long; the Hmong figure’s token is more than twice the length at 0.29 seconds.

29. Noy says that Hmong people are “straight” (sùù1) people.

30. One could distinguish, for instance, intra-compositional effects, in which the effect is a product of the elements in a given composition and their relation to one another, from inter-compositional effects, such as those involving gradient figures, in which the effect is the product of the relation among compositions of the same or different figures across time.

31. These representations are furthermore done in a constrained gesture space that contrasts with Trump’s use of “gestural excess to convey the impression he is a new kind of politician, unconstrained by petty rules and competent at accomplishing daunting tasks” (Hall, Goldstein, and Ingram 2016, 85).

33. As the YouTube comments show, of course, there is popular debate about the extent to which this is the case. The use of blackface has achieved particular notoriety and salience as a racist element of figure compositions; compare, for instance, its reception with the milder reception that non-Black speakers’ use of stereotypically Black speech has often received (see Bucholtz and Lopez 2011 on ‘linguistic minstrelsy’).

References

Agha, Asif. 2005. “Voice, Footing, Enregisterment.” Journal of Linguistic Anthropology 15 (1): 38–59. https://doi.org/10.1525/jlin.2005.15.1.38
Agha, Asif. 2007. Language and Social Relations. Cambridge: Cambridge University Press.
Baird, Ian G. 2010. “The Hmong Come to Southern Laos: Local Responses and the Creation of Racialized Boundaries.” Hmong Studies Journal 11: 1–38.
Bakhtin, Mikhail M. 1981. The Dialogic Imagination: Four Essays, translated by M. Holquist. Austin, Texas: University of Texas Press.
Banfield, Ann. 1973. “Narrative Style and the Grammar of Direct and Indirect Speech.” Foundations of Language 10 (1): 1–39.
Barker, Meghanne. 2019. “Dancing Dolls: Animating Childhood in a Contemporary Kazakhstani Institution.” Anthropological Quarterly 92 (2): 311–343. https://doi.org/10.1353/anq.2019.0017
Bolden, Galina. 2004. “The Quote and beyond: Defining Boundaries of Reported Speech in Conversational Russian.” Journal of Pragmatics 36 (6): 1071–1118. https://doi.org/10.1016/j.pragma.2003.10.015
Bucholtz, Mary. 2011. “Race and the Re-Embodied Voice in Hollywood Film.” Language & Communication 31 (3): 255–265. https://doi.org/10.1016/j.langcom.2011.02.004
Bucholtz, Mary, and Qiuana Lopez. 2011. “Performing Blackness, Forming Whiteness: Linguistic Minstrelsy in Hollywood Film.” Journal of Sociolinguistics 15 (5): 680–706. https://doi.org/10.1111/j.1467-9841.2011.00513.x
Chafe, Wallace. 1987. “Cognitive Constraints on Information Flow.” In Coherence and Grounding in Discourse: Outcome of a Symposium, Eugene, Oregon, June 1984, edited by R. S. Tomlin, 21–57. Philadelphia: J. Benjamins Pub. Co. https://doi.org/10.1075/tsl.11.03cha
Chun, Elaine W. 2004. “Ideologies of Legitimate Mockery: Margaret Cho’s Revoicings of Mock Asian.” Pragmatics 14 (2–3): 263–289.
Clark, Herbert H., and Richard J. Gerrig. 1990. “Quotations as Demonstrations.” Language 66 (4): 764–805. https://doi.org/10.2307/414729
Cormier, Kearsy, Sandra Smith, and Zed Sevcikova-Sehyr. 2015. “Rethinking Constructed Action.” Sign Language & Linguistics 18 (2): 167–204. https://doi.org/10.1075/sll.18.2.01cor
Coulmas, Florian. 1986. “Reported Speech: Some General Issues.” In Direct and Indirect Speech, edited by Florian Coulmas, 1–28. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110871968
Couper-Kuhlen, Elizabeth. 1999. “Coherent Voicing: On Prosody in Conversational Reported Speech.” In Coherence in Spoken and Written Discourse: How to Create It and How to Describe It, edited by Wolfram Bublitz and Uda Lenk, 11–32. Amsterdam: Benjamins. https://doi.org/10.1075/pbns.63.05cou
Couper-Kuhlen, Elizabeth. 2007. “Assessing and Accounting.” In Reporting Talk: Reported Speech in Interaction, edited by Elizabeth Holt and Rebecca Clift, 81–119. Cambridge, UK: Cambridge University Press.
Danesi, Marcel. 2016. The Semiotics of Emoji: The Rise of Visual Language in the Age of the Internet. London: Bloomsbury Publishing.
Enfield, Nick J. 2007. A Grammar of Lao. Berlin: Mouton. https://doi.org/10.1515/9783110207538
Evans, Nicholas. 2012. “Some Problems in the Typology of Quotation: A Canonical Approach.” In Canonical Morphology and Syntax, edited by Dunstan Brown, Marina Chumakina, and Greville G. Corbett, 66–98. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199604326.003.0004
Fleming, Luke. 2011. “Name Taboos and Rigid Performativity.” Anthropological Quarterly 84 (1): 141–164. https://doi.org/10.1353/anq.2011.0010
Goffman, Erving. 1974. Frame Analysis: An Essay on the Organization of Experience. Cambridge: Harvard University Press.
Goffman, Erving. 1978. “Response Cries.” Language 54 (4): 787–815. https://doi.org/10.2307/413235
Goffman, Erving. 1979. “Footing.” In Forms of Talk, 124–159. Philadelphia: University of Pennsylvania Press.
Good, Jeffrey S. 2015. “Reported and Enacted Actions: Moving beyond Reported Speech and Related Concepts.” Discourse Studies 17 (6): 663–681. https://doi.org/10.1177/1461445615602349
Goodwin, Charles. 2007. “Interactive Footing.” In Reporting Talk: Reported Speech in Interaction, edited by Elizabeth Holt and Rebecca Clift, 16–46. Cambridge: Cambridge University Press.
Gross, Joan. 1983. “Creative Use of Language in a Liège Puppet Theater in Puppets, Masks, and Performing Objects from Semiotic Perspectives.” Semiotica La Haye 43 (1–4): 281–315.
Günthner, Susanne. 1997. “The Contextualization of Affect in Reported Dialogues.” In The Language of Emotions: Conceptualization, Expression, and Theoretical Foundation, edited by Susanne Niemeier and René Dirven, 247–276. Amsterdam: John Benjamins. https://doi.org/10.1075/z.85.19gun
Günthner, Susanne. 1999. “Polyphony and the ‘Layering of Voices’ in Reported Dialogues: An Analysis of the Use of Prosodic Devices in Everyday Reported Speech.” Journal of Pragmatics 31 (5): 685–708. https://doi.org/10.1016/S0378-2166(98)00093-9
Haas, Mary R. 1944. “Men’s and Women’s Speech in Koasati.” Language 20 (3): 142. https://doi.org/10.2307/410153
Hall, Kira, Donna M. Goldstein, and Matthew Bruce Ingram. 2016. “The Hands of Donald Trump: Entertainment, Gesture, Spectacle.” HAU: Journal of Ethnographic Theory 6 (2): 71–100. https://doi.org/10.14318/hau6.2.009
Hanks, William F. 1990. Referential Practice: Language and Lived Space among the Maya. Chicago: University of Chicago Press.
Hanks, William F. 1992. “The Indexical Ground of Deictic Reference.” In Rethinking Context: Language as an Interactive Phenomenon, edited by Alessandro Duranti and Charles Goodwin, 43–76. Cambridge: Cambridge University Press.
Hanks, William F. 1993. “Metalanguage and Pragmatics of Deixis.” In Reflexive Language: Reported Speech and Metapragmatics, edited by John A. Lucy, 127–57. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511621031.008
Hastings, Adi, and Paul Manning. 2004. “Introduction: Acts of Alterity.” Language & Communication 24 (4): 291–311. https://doi.org/10.1016/j.langcom.2004.07.001
Haviland, John B. 1993. “Anchoring, Iconicity, and Orientation in Guugu Yimithirr Pointing Gestures.” Journal of Linguistic Anthropology 3 (1): 3–45. https://doi.org/10.1525/jlin.1993.3.1.3
Hickmann, Maya. 1993. “The Boundaries of Reported Speech in Narrative Discourse: Some Developmental Aspects.” In Reflexive Language: Reported Speech and Metapragmatics, edited by John Lucy, 63–90. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511621031.006
Hill, Jane H. 1995. “The Voices of Don Gabriel.” In The Dialogic Emergence of Culture, edited by Dennis Tedlock and Bruce Mannheim, 97–147. Urbana: University of Illinois Press.
Hodge, Gabrielle, and Kearsy Cormier. 2019. “Reported Speech as Enactment.” Linguistic Typology 23 (1): 185–196. https://doi.org/10.1515/lingty-2019-0008
Hoffmann-Dilloway, Erika. 2011. “Writing the Smile: Language Ideologies in, and through, Sign Language Scripts.” Language & Communication 31: 345–355. https://doi.org/10.1016/j.langcom.2011.05.008
Hoffmann-Dilloway, Erika. 2020. “Figure (of Personhood) Drawing: Scaffolding Signing and Signers in Nepal.” Signs and Society 8 (1): 35–61. https://doi.org/10.1086/706770
Holt, Elizabeth. 2007. “‘I’m Eyeing Your Chop up Mind’: Reporting and Enacting.” In Reporting Talk: Reported Speech in Interaction, edited by Elizabeth Holt and Rebecca Clift, 47–80. Cambridge: Cambridge University Press.
Holt, Elizabeth, and Rebecca Clift. 2006. “Introduction.” In Reporting Talk: Reported Speech in Interaction, edited by Elizabeth Holt and Rebecca Clift, 1–15. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486654
Hoyle, Susan M. 1993. “Participation Frameworks in Sportscasting Play: Imaginary and Literal Footings.” In Framing in Discourse, edited by Deborah Tannen, 114–145. Oxford: Oxford University Press.
Hymes, Dell H. 1981. ‘In Vain I Tried to Tell You’: Essays in Native American Ethnopoetics. Philadelphia: University of Pennsylvania Press. https://doi.org/10.9783/9781512802917
Irvine, Judith T. 1993. “Insult and Responsibility: Verbal Abuse in a Wolof Village.” In Responsibility and Evidence in Oral Discourse, edited by Jane H. Hill and Judith T. Irvine, 105–134. Cambridge: Cambridge University Press.
Irvine, Judith T. 2011. “Leaky Registers and Eight-Hundred-Pound Gorillas.” Anthropological Quarterly 84 (1): 15–39. https://doi.org/10.1353/anq.2011.0011
Jakobson, Roman. 1956. “Metalanguage as a Linguistic Problem.” In The Framework of Language, edited by Irwin R. Titunik, 81–92. Ann Arbor: Michigan Studies in the Humanities.
Jakobson, Roman. 1960. “Closing Statement: Linguistics and Poetics.” In Style in Language, edited by T. Sebeok, 350–77. Cambridge: MIT Press.
Jansen, Wouter, Michelle L. Gregory, and Jason M. Brenier. 2001. “Prosodic Correlates of Directly Reported Speech: Evidence from Conversational Speech.” In ISCA Tutorial and Research Workshop (ITRW) on Prosody in Speech Recognition and Understanding.
Jespersen, Otto. 2007 [1924]. The Philosophy of Grammar. London: Routledge.
Jones, Graham M., and Bambi B. Schieffelin. 2009. “Enquoting Voices, Accomplishing Talk: Uses of Be+ like in Instant Messaging.” Language & Communication 29 (1): 77–113. https://doi.org/10.1016/j.langcom.2007.09.003
Keane, Webb. 2011. “Indexing Voice: A Morality Tale.” Journal of Linguistic Anthropology 21 (2): 166–178. https://doi.org/10.1111/j.1548-1395.2011.01104.x
Keevallik, Leelo. 2010. “Bodily Quoting in Dance Correction.” Research on Language and Social Interaction 43 (4): 401–426. https://doi.org/10.1080/08351813.2010.518065
Keevallik, Leelo. 2013. “The Interdependence of Bodily Demonstrations and Clausal Syntax.” Research on Language & Social Interaction 46 (1): 1–21. https://doi.org/10.1080/08351813.2013.753710
Kimball, Geoffrey. 1987. “Men’s and Women’s Speech in Koasati: A Reappraisal.” International Journal of American Linguistics 53 (1): 30–38. https://doi.org/10.1086/466041
Klewitz, Gabriele, and Elizabeth Couper-Kuhlen. 1999. “Quote–Unquote? The Role of Prosody in the Contextualization of Reported Speech Sequences.” Pragmatics 9 (4): 459–485.
Kvavik, Karen H. 1986. “Characteristics of Direct and Reported Speech Prosody: Evidence from Spanish.” In Direct and Indirect Speech, edited by Florian Coulmas, 333–360. Berlin: Mouton de Gruyter.
Latour, Bruno. 1993. We Have Never Been Modern. Cambridge: Harvard University Press.
Lempert, Michael. 2014. “Imitation.” Annual Review of Anthropology 43: 37–51. https://doi.org/10.1146/annurev-anthro-102313-030008
Lim, Dennis. 2010. “Letting His Role Do the Talking.” The New York Times, December 3. https://www.nytimes.com/2010/12/05/movies/05bale.html, accessed September 25, 2020.
Lucy, John A. 1993. “Metapragmatic Presentationals: Reporting Speech with Quotatives in Yucatec Maya.” In Reflexive Language: Reported Speech and Metapragmatics, edited by John A. Lucy, 91–125. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511621031.007
Mathis, Terrie, and George Yule. 1994. “Zero Quotatives.” Discourse Processes 18 (1): 63–76. https://doi.org/10.1080/01638539409544884
McClave, E. 1991. Intonation and Gesture. Unpublished doctoral dissertation, Georgetown University.
McNeill, David. 1992. Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press.
Meek, Barbra A. 2006. “And the Injun Goes ‘How!’: Representations of American Indian English in White Public Space.” Language in Society 35 (1): 93–128. https://doi.org/10.1017/S0047404506060040
Mitchell-Kernan, Claudia, and Yehudi A. Cohen. 2017. “Signifying and Marking: Two Afro-American Speech Acts.” In Human Adaptation, edited by Yehudi A. Cohen, 395–406. London: Routledge.
Nakassis, Constantine V. 2020. “Deixis and the Linguistic Anthropology of Cinema.” Semiotic Review 9 (S.I.). https://semioticreview.com/ojs/index.php/sr/article/view/65, accessed April 20, 2021.
Parrill, Fey. 2009. “Dual Viewpoint Gestures.” Gesture 9 (3): 271–289. https://doi.org/10.1075/gest.9.3.01par
Parrill, Fey. 2012. “Interactions between Discourse Status and Viewpoint in Co-Speech Gesture.” In Viewpoint in Language: A Multimodal Perspective, edited by Barbara Dancygier and Eve Sweetser, 97–112. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139084727.008
Partee, Barbara Hall. 1973. “The Syntax and Semantics of Quotation.” In A Festschrift for Morris Halle, edited by S. Anderson and P. Kiparsky, 410–418. New York: Holt.
Prentice, Michael M. 2020. “Old Spirits of Capitalism: Managers and Masculine Alterity in/as the Korean Office.” Anthropological Quarterly 93 (2): 89–118. https://doi.org/10.1353/anq.2020.0027
Rumsey, Alan. 1990. “Wording, Meaning, and Linguistic Ideology.American Anthropologist 92 (2): 346–361.10.1525/aa.1990.92.2.02a00060CrossRefGoogle Scholar
Rumsey, Alan. 2020. “Reported Speec.” In The International Encyclopedia of Linguistic Anthropology, edited by James M. Stanlaw. Hoboken, New Jersey: Wiley-Blackwell.Google Scholar
Russell, Kamala. 2012. Form and Function: Character Viewpoint Gestures in Dialogic Narrative. Unpublished BA Thesis, University of Chicago.Google Scholar
Sandlund, E. 2014. “Prescribing Conduct: Enactments of Talk or Thought in Advice-Giving Sequences.Discourse Studies 16 (5): 645–666.10.1177/1461445614539065CrossRefGoogle Scholar
Sapir, Edward. 1949. “Abnormal Speech Types in Nootka.” In Selected Writings of Edward Sapir in Language, Culture and Personality, 179–96. Berkeley: University of California Press.
Sauer, Beverly. 1999. “Embodied Experience: Representing Risk in Speech and Gesture.” Discourse Studies 1 (3): 321–354. doi:10.1177/1461445699001003003
Sidnell, Jack. 2006. “Coordinating Gesture, Talk, and Gaze in Reenactments.” Research on Language and Social Interaction 39 (4): 377–409. doi:10.1207/s15327973rlsi3904_2
Silverstein, Michael. 1981. The Limits of Awareness. Sociolinguistic Working Paper 84.
Silverstein, Michael. 2014. “Denotation and the Pragmatics of Language.” In The Cambridge Handbook of Linguistic Anthropology, edited by N. J. Enfield, Paul Kockelman, and Jack Sidnell, 128–57. Cambridge: Cambridge University Press. doi:10.1017/CBO9781139342872.007
Stec, Kashmiri, Mike Huiskes, and Gisela Redeker. 2015. “Multimodal Analysis of Quotation in Oral Narratives.” Open Linguistics 1: 531–554. doi:10.1515/opli-2015-0018
Stec, Kashmiri. 2016. “Multimodal Quotation: Role Shift Practices in Spoken Narratives.” Journal of Pragmatics 104: 1–17. doi:10.1016/j.pragma.2016.07.008
Streeck, Jürgen. 2002. “Grammars, Words, and Embodied Meanings: On the Uses and Evolution of So and Like.” Journal of Communication 52 (3): 581–596. doi:10.1111/j.1460-2466.2002.tb02563.x
Streeck, Jürgen. 2009. Gesturecraft: The Manu-Facture of Meaning. Amsterdam: John Benjamins. doi:10.1075/gs.2
Sweetser, Eve. 2012. “Introduction: Viewpoint and Perspective in Language and Gesture, from the Ground Down.” In Viewpoint in Language: A Multimodal Perspective, edited by Barbara Dancygier and Eve Sweetser, 1–22. Cambridge: Cambridge University Press.
Tannen, Deborah. 1986. “Introducing Constructed Dialogue in Greek and American Conversational and Literary Narrative.” In Direct and Indirect Speech, edited by Florian Coulmas, 311–332. Berlin: Mouton de Gruyter.
Tannen, Deborah. 2007. Talking Voices: Repetition, Dialogue, and Imagery in Conversational Discourse. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511618987
Tannen, Deborah. 2009. “Abduction, Dialogicality and Prior Text: The Taking on of Voices in Conversational Discourse.” Plenary Address Presented at the 84th Annual Meeting of the Linguistic Society of America. Baltimore, MD.
Thompson, Sandra A., and Ryoko Suzuki. 2014. “Reenactments in Conversation: Gaze and Recipiency.” Discourse Studies 16 (6): 816–46. doi:10.1177/1461445614546259
Urban, Greg. 1989. “The ‘I’ of Discourse.” In Semiotics, Self, and Society, edited by Benjamin Lee and Greg Urban, 27–51. Berlin: De Gruyter Mouton.
Voloshinov, V. N. 1973. Marxism and the Philosophy of Language, translated by Ladislav Matejka. New York: Seminar Press.
Weeks, Peter. 1996. “A Rehearsal of a Beethoven Passage: An Analysis of Correction Talk.” Research on Language and Social Interaction 29 (3): 247–290. doi:10.1207/s15327973rlsi2903_3
Woolard, Kathryn A. 1998. “Simultaneity and Bivalency as Strategies in Bilingualism.” Journal of Linguistic Anthropology 8 (1): 3–29. doi:10.1525/jlin.1998.8.1.3
Wortham, Stanton, and Michael Locher. 1994. “Implicit Moral Messages in the Newsroom and the Classroom: A Systematic Technique for Analyzing ‘Voicing.’” In Educational Resources Information Center, 1–25.
Zuckerman, Charles H. P. 2020. “‘Don’t Gamble for Money with Friends’: Moral-Economic Types and Their Uses.” American Ethnologist 47 (4): 432–446. doi:10.1111/amet.12981
Zuckerman, Charles H. P. 2021a. “Introduction to The Generic Special Issue.” Language in Society 50 (4). doi:10.1017/S0047404521000300
Zuckerman, Charles H. P. 2021b. “On the Unity of Types: Lao Gambling, Ethno-Metapragmatics, and Generic and Specific Modes of Typification.” Language in Society 50 (4). doi:10.1017/S0047404521000385
Zuckerman, Charles H. P. Forthcoming. “‘Friends Who Don’t Throw Each Other Away’: Friendship, Pronouns, and Relations on the Edge in Luang Prabang, Laos.” In Signs of Deference, Signs of Demeanour: Interlocutor Reference and Self-Other Relations across Southeast Asian Speech Communities (Preliminary Title), edited by Dwi Noverini Djenar and Jack Sidnell. NUS Press.
Figure 1. Donald J. Trump mocks Serge F. Kovaleski

Table 1. Figure Transparency and Figure Composition

Table 2. Example from Bucholtz 2011

Figure 2. Sarah Cooper impersonates Trump using audio of him speaking (Source: “How to person woman man camera tv”; https://www.youtube.com/watch?v=j8oaaP68i4s; accessed September 24, 2020)

Figure 3. The Pitch Contours of Lines [27] and [29]

Figure 4. The axes along which Noy moves her body during line [29]:27

Figure 5. Noy’s Pitch Contours and Gestures