1 Introduction
This chapter offers a toolbox of Methods for Gesture Analysis (MGA). It addresses gesture analysis from the point of view of hand gestures and starts from an analysis of gestures as temporal forms. MGA tackles the multimodality of language use as a dynamic process that happens along different timescales. The methods were originally developed in the context of research on emerging protolinguistic structures in co-speech gestures (Bressem, Ladewig, & Müller, 2013; Müller, Bressem, & Ladewig, 2013; Müller, Ladewig, & Bressem, 2013). The present version of MGA differs from earlier publications (Bressem, 2013a, 2021) in offering sets of tools for gesture analysis that adapt flexibly to different research questions, that can be extended by future researchers, and that work with various analytical frameworks. In its present form, MGA encourages the researcher to look at gestures from various angles, against the backdrop of diverse frameworks, and to select the tools and theoretical approach that best fit the researcher’s specific interest.
Sections 1.1–1.2 offer an overview of MGA and sketch basic assumptions. Section 2 introduces an example analysis that serves as a point of reference throughout the chapter. In Section 3, variable tools for gesture form analysis are described, and Section 4 outlines different approaches to gesture context analysis.
1.1 Methods for Gesture Analysis: Overview
MGA distinguishes microlevel and macrolevel analysis. The baseline for MGA is a microanalysis that entails some account of the gesture as temporal form, and some analysis of how a gesture, a series of gestures, or a multimodal sequence is placed in an unfolding context-of-use (Figure 8.1).

Figure 8.1 Baseline of MGA: Microlevel analysis
MGA’s toolbox offers a flexible set of tools for descriptive analyses of hand gestures as temporal forms. Temporality is relevant to analysis on multiple scales: on the microlevel of a single gesture or gesture sequence and on the macrolevel of the unfolding of gesture(s) along the temporal dynamics of a discourse or conversational interaction (Figure 8.2).

Figure 8.2 Different kinds of macrolevel analyses
Which collection of tools is selected depends upon the researcher’s interest and the theoretical framework adopted. The focus of this chapter lies at the microanalytic level. Microanalytic tools constitute the baseline for macrolevel analysis of gesture dynamics. While an introduction of macroanalytic procedures would extend the scope of this chapter beyond its limit, references below point to examples of macroanalytic studies.
Macrolevel analysis may address temporal dynamics of gesture use as it unfolds across a discourse event or conversational interaction or across historical time spans. Such studies may, for example, concern returns and extensions of gestures along an interactive setting, such as a dance class (Müller & Kappelhoff, 2018; Müller & Ladewig, 2013), or the study of historical changes of gestures as processes of stabilizations on different timescales. Examples of macrolevel studies are investigations of recurrent gestures and emblems (Kendon, 2004, Ch. 13; Ladewig, 2014, this volume; Müller, 2014b, 2017a). Macroanalysis of gestures may furthermore contribute to comparative studies of gesture and sign (Kendon, 2004, Ch. 15; 2015; Müller, 2019a; Wilcox, 2009; Wilcox et al., 2000), of gestures across different cultures (Bressem, Stein, & Wegener, 2017; Bressem & Wegener, 2021; Kendon, 1981; Morris, Collett, Marsh, & O’Shaughnessy, 1979; Müller, 2014b) or across species (Müller, 2007).
1.2 Methods for Gesture Analysis: Basic Assumptions
Essential starting points for MGA are an understanding of hand gestures as temporal forms embedded in a dynamically unfolding context, and an understanding of context that itself varies with the adopted framework. Depending on a given research question and its respective theoretical framework, procedures deemed relevant for a given study can be selected from the toolbox. MGA thus offers a flexible set of tools, designed to adjust to variable analytic perspectives and theoretical frameworks. A prerequisite for this flexibility is a set of basic assumptions concerning the nature and character of gestures, outlined in this section.
(1) The term “gesture” refers to movements of the body that people use in conjunction with spoken and signed languages to talk about and act within their life-worlds. Gestures are used to articulate thoughts as much as feelings and they are vital in coordinating communicative actions in social interaction and in the flow of discourse.
(2) Articulatory gestures. MGA focuses on gestures as hand movements, but can serve as a starting point for the inclusion of other body parts. The focus on hand movements responds to the articulatory, enactive, and mimetic complexity of the hands, which are humans’ foremost tool to act upon the world (Streeck, 2009, Ch. 3). It is not by accident that movements of the hands play a central role in signed languages. The articulatory freedom of the hand rests upon a physiological flexibility of the human hand that is an evolutionary achievement of highest importance to the development of human culture including language – in whatever modality expressed (Corballis, 2013; Leroi-Gourhan, 1964/1993). The hands display a unique articulatory freedom and richness comparable only to the mouth as a tool and locus of fine articulatory movement, which is why some scholars speak of spoken language as articulatory gestures (Armstrong, Stokoe, & Wilcox, 1995).
(3) Gestures are temporal forms. As Kendon points out: “When a person speaks there is always some movement in the body besides the movements of the jaws and lips that are directly involved in speech production” (Kendon, 1980, p. 207). He put forward a description of gestures as movements that unfold in time and whose movement phases typically go from preparation – to stroke – to retraction (Kendon, 1980; 2004, Ch. 7). Conceiving of gestures as a temporal form has since become “common sense” in the field of gesture studies; advancements of the systematics include Bressem and Ladewig (2011) and Kita, van Gijn, and van der Hulst (1998). Starting the analysis of gestures from their temporal nature makes a lot of sense because the recognition of gesture boundaries (i.e. specifying where a gesture begins and where it ends) determines units of analysis in the first place: In quantitative studies, it enables reliable counting, and, in qualitative studies, it creates the object of analysis that then becomes subject to further descriptive (i.e. interactional, conceptual, semiotic, semantic, pragmatic, syntactic, etc.) analysis.
(4) Gestures are embedded in dynamic contexts-of-use. As temporal forms, hand gestures are integrated in the contexts in which they are used. No matter how the notion of context is conceived (cf. Section 4), contexts are temporally unfolding phenomena; in short, they are dynamic. MGA takes account of these temporal qualities. Note, however, that the separation of gestures as temporal forms from the temporal nature of contexts-of-use is purely an analytic one (Figure 8.1). Gesture, speech, and whole-body movements emerge as one multidimensional gestalt when a person is talking and engaging in an interaction. Gesture analysis decomposes an experiential unity. Coparticipants in a conversation perceive and experience gestures and speech as dynamic and multidimensional gestalts, just as film viewers see the acting of an actor, a landscape, or a race evolving in time as an orchestrated movement image (Müller, 2019b; Müller & Kappelhoff, 2018). The type of decomposition of this gestalt is a consequence of the analytical focus and theoretical approach adopted. It is important to note this because any analysis will at some point have to reflect on how the decomposed aspects of gesture forms and contexts relate to the multidimensional unity of experience that characterizes speaking, gesturing, and understanding.
(5) Why microanalysis of the form of gestures is necessary. Why not, if gestures are part and parcel of holistic gestalts, simply skip the laborious analyses of gestures as temporal forms? The answer is that not only does a close form analysis provide a solid ground for seeing gestures as immersed in multidimensional gestalts, but it also prevents researchers from reading meanings ‘into’ the gestures with no substantiation apart from intuition. For example, some analyses of multimodal constructions start from a linguistic form and look for co-occurring gestures without carrying out systematic analyses of the gestural form (Schoonjans, 2018). Often, a gesture form analysis is deemed not necessary because the meaning of the gesture is treated as obvious. To highlight the potential relevance of a close gesture analysis, Bressem and Müller (2017) have suggested a gesture-first approach to the analysis of multimodal constructions, that is, starting from recurring gesture forms to identify potential multimodal constructions (see also Mittelberg, 2007). However, no matter whether language-first or gesture-first, a close analysis of gesture forms is an essential baseline and provides firm grounds for the analysis.
(6) What is considered “context” for gesture analysis. In gesture studies, “context” or “context-of-use” is mostly used in a rather nonspecific way (exceptions are Kendon, 1992; Streeck, 2009). Moreover, often quite different understandings of context-of-use are implied when using terms such as “multimodality,” “multimodal communication,” “multimodal interaction,” “multimodal utterance,” and so on. The term “multimodality” covers a wide range of different theoretical frameworks with sometimes mutually exclusive concepts of gestures and their contexts. Against this backdrop, specifying the particular theoretical framework and its associated concept of “context” appears as a useful analytic step that directs the researcher’s attention to a critical reflection upon their research focus and the approach they have adopted.
In a nutshell. MGA as a toolkit enables the researcher to work out which kind of form analysis is appropriate and which understanding of context is relevant. It also encourages the researcher to step back and consider the theoretical approach adopted as one of several possible perspectives on gesture analysis.
2 Point of Reference: An Example
The toolbox for gesture analysis is illustrated throughout the rest of this chapter mainly with reference to the example of a story told as part of a conversation between two friends. Returning to this example over and over reveals how changing the analytic perspective and the according analytic tools uncovers different dimensions of gestures and the multimodal utterances they contribute to. It illustrates how there is no such thing as the one and only method of gesture analysis, and how the choice of tool depends upon the analytic perspective and theoretical framework adopted for gesture analysis.
The example comes from a conversation between a German speaker (Paul) and a Spanish speaker (Luis) (the names have been changed). Recorded in Berlin in the early 1990s, the conversation took place at a coffee table in a private apartment and was carried out in Spanish. Having lived for a long period of time in Spain, the German speaker was fluent in Spanish. The two young men knew each other quite well. No specifications concerning the topic of their conversation were given in advance. They were asked to simply chat about whatever they liked. Once the camera was started, they were left alone for 30 minutes.
Luis’ contribution to the discussion of Spanish politics is a family story that took place in 1980/81. A wood-carved portrait of the Spanish King Juan-Carlos hung on the wall in the apartment of Luis’ parents when his family lived in Venezuelan exile. The plot of the story is that this wooden portrait of the king fell off the wall, all by itself, a few days before an attempted coup d’état in Spain, as if foretelling the troubling event.
This first description of the gestural forms that Luis uses is not an interpretation of the gestural meaning in the context of the story. Rather, it offers a simple description of the movements, shapes, actions, and locations of his gestures. Gestures are numbered in Figure 8.3 in their order of use. With his first gesture G1, Luis loosely forms a roundish shape with a molding movement of his two hands. The round shape is vertically oriented and located in the center of the speaker’s gesture space. After a while, the speaker performs G2, positioned in the same location of the gesture space as G1 and also oriented vertically. It is thus spatially connected with G1. It differs from G1 in that it sketches an oval shape with a repeated outlining movement of the index fingers. A one-handed gesture G3 immediately follows G2, staying within the same place in the center of the gesture space and acting as if holding and placing a small object on top of the just outlined round object. G2 and G3 are thus connected spatially and temporally. A moment later in the story-telling, G4 follows. It differs from the first three gestures in that it is located at the left-hand side of the gesture space and, instead of molding or outlining a shape, something is represented by a flat left hand, held high up. Yet G4 still connects with the molding and the outlining gestures (G1 and G2) through the vertical orientation of the shape. G5 is then a quick downward movement of the loosely extended right hand, once again located in the center of the gesture space.

Figure 8.3 Gestures performed alongside the story-telling
A crucial analytical decision to be made is how to transcribe the spoken part of the gesture–speech ensemble. Why is this crucial? Because every notation is already an analysis. It highlights some aspects of the spoken utterances and backgrounds others (see Bressem, 2013b, this volume). The following extract (Figure 8.4: extract [1]) shows the story-telling between Luis (L) and Paul (P) transcribed and translated into English.

Figure 8.4 Extract 1: Transcription and translation
We now know that Luis’ gestures are part of a story about a family memory in which a portrait of the King plays a major role. Let us take a look at how the gestures deployed are integrated into the story and how, through their precise placement in relation to the spoken utterance, meaning can emerge from their contextual embedding (Figure 8.5: extract [2]). The extract uses a simplified McNeill (1992) style of gesture annotation, which brackets the gesture phrase from preparation (P) to rest position (rp) in the speech line, with bold type indicating the stroke (S) and (R) indicating the start of the recovery or retraction. Instead of short gesture form descriptions, drawings of the gesture strokes are placed above the bold type annotation.

Figure 8.5 Extract (2): Gestures with speech as multimodal temporal form
The round shape gesture (G1) occurs with a search for an appropriate term for the rather unusual portrait of the King (Figure 8.3, Figure 8.5: extract [2]) and accompanies the attribute of an unspoken noun (“a small ___”) and continues through a speech pause. With this placement, the gesture is likely to depict the topic of the story, some kind of round object. What then follows is a verbal description of the object in question: a picture of the King. The next gestures, G2 and G3, follow only in line 7, when a detailed description of the topic of the story that lacks an appropriate Spanish term is given: “with a round frame [G2’s outlining of an oval shape] and a small crown on top [G3’s holding and placing gesture].” Because of their temporal synchronization with this verbal description, the oval shape G2 comes to depict the picture frame, and G3 is seen as placing an imagined crown on top of the ephemeral frame. In line 9, finally, the story-telling moves on toward the plot: here the speaker enacts the spatial setting of the story: While the family was in the dining room, the wooden portrait was hanging in the living room (“she had it there in the other room”). G4 is temporally coordinated with the deictic “there” and only because of this temporal synchronization can the flat hand, held up at the upper edge of the gesture space, be seen as a representation of the wooden portrait of the King hanging in another room. G5 – a downward movement – is coordinated with “fell down” and thus depicts how the portrait fell down from the wall.
Our first encounter with this example shows how, out of the six gestures performed in close succession and integrated with the narrative, five relate to the most important object in the story: the carved portrait of the King. Their formal difference and their subtle integration with speech indicate how meaning emerges from the temporal coordination of gestures with speech: The story-telling is multimodal and temporal. As we proceed, the example will be used to illustrate how different foci of form analysis and different theoretical frameworks may uncover further aspects of the multimodal story-telling.
3 Gesture Form Analysis
In this section, microanalytic tools for gesture form analysis are introduced. The tools address the complexity of gesture forms and assume that even minor changes in gestural forms might be significant in terms of what is being depicted or expressed. When somebody is gesturally depicting how to open a window, it is crucial to recognize which hand shape is used and which direction the hand is turned; when somebody molds or outlines a shape of a picture frame, the ephemeral forms depicted vary in accuracy of shape depiction; when somebody sketches a shape with a delicate or a harsh movement quality, the expressive quality of the gestural movement varies; and when gestures become conventionalized, holistic gestalts may decompose and hybrids of stabilized and idiosyncratic gestural forms emerge (Müller, 2017a).
Four tools are presented (Figure 8.6), of which the first is indispensable and the others optional: the first one identifies the unit of analysis as a temporal form. The choice of the other three types of tools depends upon one’s research interest and the theoretical framework adopted. Tool (1) determines gesture boundaries and their phase structure, that is, where a gesture begins, where it ends, and whether it is a single gesture or a sequence of gestures. Tool (2) offers a form-based two-way distinction between types of gestures. Tool (3) analyzes depictive and pragmatic gestures in terms of as-if actions. Tool (4) includes three subtools to analyze hands as movement: aspects of kinesic form, gestures as motion events, and gestures as expressive movement.

Figure 8.6 Set of tools for gesture form analysis
Combining the microanalytic tools for gesture form analysis allows variable degrees of detail. Which ones to choose and which degree of fine-grained analysis is appropriate depends upon one’s framework of research, an essential element of the research design. Let us return to Luis’ story to illustrate how these different tools may be applied.
3.1 Establishing the Temporal Unit of Analysis
The first tool concerns gesture boundaries, that is, it addresses the establishment of units of analysis. Deciding where a gesture begins and ends may, at first sight, seem obvious. Consider, for example, the “round shape gesture” (G1) from Luis’ story. Here the boundaries of the gesture are clear-cut: the gestural movement unfolds from a rest position, where both hands are resting on the speaker’s lap, moves upward in a preparation phase, performs a stroke (i.e. the molding round shape movement) and then, with a phase of retraction, returns to the rest position on his lap (Figure 8.7).

Figure 8.7 G1 as temporal unit
However, even with this apparently simple temporal gestural movement, questions arise as to what is included and excluded. In Kendon’s influential systematics of “gesture units, gesture phrases and the phases of gestural action” (Kendon, 2004, p. 111), the temporal unit that is considered as the gesture would only comprise part of the movement described above (G1). In McNeill’s notation system, applied in extract (2) (Figure 8.5), the temporal unit would include the return to rest position. Kendon makes a distinction between gesture phase (preparation, stroke, recovery), gesture phrase, and gesture unit. A Kendonian gesture phrase is defined as preparation plus stroke (including poststroke holds) and does not include the phase of recovery and return to rest position (Kendon, 2004, p. 112). A gesture unit, by contrast, spans the entire excursion of the hands from rest position back to rest position. Gesture units may thus contain sequences of gestures (i.e. gesture phrases). Figure 8.8 shows a gesture unit as a sequence of two gesture phrases: a succession of outlining (G2) and placing (G3) movements from the story-telling.

Figure 8.8 A Kendonian gesture unit with two gesture phrases
Between outlining a round frame (G2) and placing a crown on top (G3), one hand returns to rest position. The right hand stays up and moves directly to the position in the gesture space where the stroke (i.e. the placing movement) is made. After that, the right hand returns to rest position too. In Kendon’s (2004, p. 112) terms, the gesture unit thus entails two gesture phrases, (G2) and (G3), each characterized by a succession of preparation (hand moves into the gesture space) and stroke phase (the phase where the movement reaches its apex and is most clearly articulated, in terms of effort and shape). The concepts of effort and shape are taken from Laban Movement Analysis (Bartenieff & Lewis, 1980; Kennedy, 2013). Applying the tool “establishing the temporal unit of analysis” reveals the internal linear complexity of gestures as temporal forms, which is central to the coordination of gesture and speech.
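The relation between phases, gesture phrases, and gesture units can be made concrete with a small annotation sketch in Python. It is only an illustration under stated assumptions: the names `Phase` and `gesture_phrases` are hypothetical, the phase labels are simplified to a single annotation track (so the intermediate return of one hand between G2 and G3 is not modeled), and the timestamps are invented placeholders rather than measurements from the recording.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Phase:
    kind: str     # "preparation", "stroke", "hold", or "recovery"
    start: float  # seconds; all timestamps below are invented placeholders
    end: float


def gesture_phrases(unit: List[Phase]) -> List[List[Phase]]:
    """Split one gesture unit (rest to rest) into gesture phrases.

    A phrase is taken here to be preparation plus stroke plus any
    poststroke hold; recovery belongs to the unit but to no phrase.
    """
    phrases: List[List[Phase]] = []
    current: List[Phase] = []
    for phase in unit:
        if phase.kind == "recovery":
            continue
        has_stroke = any(p.kind == "stroke" for p in current)
        # A new preparation or stroke after a completed stroke opens a new phrase.
        if has_stroke and phase.kind in ("preparation", "stroke"):
            phrases.append(current)
            current = []
        current.append(phase)
    if current:
        phrases.append(current)
    return phrases


# Hypothetical coding of the unit that contains G2 (outlining) and G3 (placing).
unit = [
    Phase("preparation", 0.0, 0.4),
    Phase("stroke", 0.4, 1.3),       # G2: outlining the frame
    Phase("preparation", 1.3, 1.6),
    Phase("stroke", 1.6, 2.0),       # G3: placing the crown
    Phase("recovery", 2.0, 2.5),
]

phrases = gesture_phrases(unit)
print(len(phrases))  # prints 2: one gesture unit, two gesture phrases
```

Under a McNeill-style notation the recovery would instead be kept inside the annotated span, which in this sketch would amount to not skipping the "recovery" phase.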
3.2 Distinguishing Pointing from Depictive and Pragmatic Gestures
For an analysis of different kinds of gestures, different analytic tools might become relevant. Therefore, a basic two-way distinction is made between pointing versus depictive and pragmatic gestures. This distinction refers to level 2 in Figure 8.6.
What motivates the distinction between pointing and other gestures in the first place? Without going into the details of gesture classification systems, it can be seen that a distinction between pointing (or deictics as per Efron, 1941/1972; Fricke, 2014, this volume; McNeill, 1992, p. 18) and other kinds of gestures appears to be widely accepted in gesture studies. Overviews and discussions of gesture classifications have been offered by several researchers (Andrén, 2010, pp. 96–105; Fricke, 2007, pp. 156–181; Gullberg, 1998, pp. 47–51; Kendon, 2004, Ch. 2; McNeill, 1992, Ch. 3; Müller, 1998b, pp. 91–113). The distinction between depictive and pragmatic gestures is informed by Kendon (2004), although “depictive gestures” is used in MGA instead of his “representational gestures” (Kendon, 2004, p. 160; Streeck, 2009, Ch. 6). The term “pragmatic gestures” is used in Kendon’s (2004, pp. 158–159) sense. The distinction between depictive and pragmatic gestures is basic because it reflects two essentials of language: reference to the world talked about and communicative action. In using one or the other type of gesture, interactive attention is drawn either to communicative action or to depiction (this holds, notwithstanding that every depiction implies acting communicatively) (Müller, 2015).
Note that in the McNeillian tradition, depictive gestures are called “iconics.” “Metaphorics” in the McNeillian tradition are depictive gestures used metaphorically as well as pragmatic gestures. Other classifications, however, suggest a further distinction between gestures used to depict some abstract concept and gestures used with a pragmatic function. With respect to depictive gestures used metaphorically, Efron (1941/1972) speaks of ideographic, Müller (1998b) of abstract referential, Cienki and Müller (2008a, 2008b) of metaphoric, and Streeck (2009, Ch. 7) of ceiving gestures. When describing pragmatic gestures in Kendon’s sense, by contrast, Streeck (2009, Ch. 8) speaks of speech-handling, and Müller (1998b) of performative and discursive gestures.
Distinguishing pointing from other kinds of gestures reflects a fundamental semiotic distinction. While pointing gestures are primarily based on indexicality, for other gestures iconicity plays an important role, although they typically also involve indexical elements (see Fricke, this volume). Mittelberg’s work on the entanglement of metonymic (as indexical) and metaphoric (as iconic) elements of gesture forms disentangles this complexity both theoretically and as a way to approach gesture form analysis (Mittelberg, 2019b). In semiotic theory, indexicality and iconicity are complex and widely debated issues and are highly relevant to gesture analysis (cf. Fricke, 2007; Mittelberg, 2014).
Distinguishing pointing gestures from depictive and pragmatic ones responds to differences in kinesic form and in embodied motivation. Only depictive and pragmatic gestures can be accounted for as as-if actions, that is, in terms of their embodied motivation.
The form-based distinction of pointing as opposed to depictive and pragmatic gestures rests upon characteristics of hand shape and movement. Typical pointing hand shapes are the extended index finger, the palm lateral hand, pointing with the little finger, or pointing with the lips (Fricke, 2007; Kendon, 2004, pp. 199–200; Kita, 2003; Sherzer, 1973). Kendon (2004, p. 200) describes how pointing gestures tend to show a characteristic movement pattern “in which the body part carrying out the pointing is moved in a well-defined path, and the dynamics of the movement are such that at least the final path of the movement is linear. Commonly, but not always, once the body part doing the pointing reaches its furthest extent, it is then held in position briefly.” Sometimes, however, pointing may only involve extending the index finger fully. In short, distinguishing pointing from depictive and pragmatic gestures guides further analytic procedures.
While all gestures can be analyzed with regard to aspects of their kinesic form (cf. Section 3.4.1), depictive as well as pragmatic gestures are special in that they are enactments of different kinds of as-if actions (cf. Section 3.3). Attempting to reconstruct which as-if action is performed with a given gesture offers a path not only to the semiotic grounding of the gesture but to an experiential base of the intersubjectivity of embodied understanding (Müller, 2016, 2019b).
Considering the example above, Luis uses no pointing gestures in his story-telling. In the following, the six gestures he uses illustrate how the different tools for gesture analysis reveal diverse aspects of gestural movements for depictive gestures.
3.3 Analyzing Depictive and Pragmatic Gestures as As-If Actions
This set of tools for gesture form analysis addresses level 3 in Figure 8.6. It concerns the gesturing hands as as-if actions. Experience has shown that the set of tools figuring under the rubric of “Gestures as As-If Actions” offers an excellent starting point for analyzing depictive as well as pragmatic gestures. Often this works best when turning off the sound and trying to answer the question: What kind of as-if action is the gesture carrying out?
The MGA toolbox distinguishes four basic kinds of as-if actions that can be used to reconstruct, heuristically, the experiential base of depictive and pragmatic gestures: the hands act as if performing a practical action with or without an imagined object; the hands act as if molding an ephemeral object; the hands act as if drawing the shape of an object or the line of a path; and the hands act as if they were an object (Figure 8.9).
In Luis’ narration all four are applied: In G1 the speaker acts as if molding an ephemeral round object, in G2 he acts as if drawing a round shape, in G3 he acts as if placing a small object on top of the ephemeral round shape just outlined, in G4 he acts as if his flat hand were some flat object located vertically at the edge of the gesture space, and in G5 he acts as if the hand were some object falling down quickly.
These four different as-if actions involve different kinds of bodily experiences: acting, molding, and drawing are based on haptic manual experiences, and often involve imagined objects; in contrast, when the hands are used as if they were an object, this transformation more likely involves visual perception. We can only mention in passing that these different embodied motivations involve different conceptual viewpoints (Dancygier & Sweetser, 2012; McNeill, 1992; Parrill, Lavanty, Bennett, Klco, & Demir-Lira, 2018; Stec & Sweetser, 2016). Acting as-if involves a character viewpoint, whereas molding and drawing may imply a character or an observer viewpoint, depending on the focus of the action or the depicted object. Hands representing an object typically involve the perspective of an observer (Bressem, Ladewig, & Müller, 2018). Note that, in most cases, the characterization of an experiential base will be a heuristic assumption, but, if longer segments of talk are considered, the analyst may be able to observe how gestures emerge from practical actions through abstraction and schematization over the course of communicative interactions.
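For researchers who keep their coding in machine-readable form, the four as-if actions and the viewpoints associated with them could be recorded along the following lines. This is a minimal Python sketch, not part of MGA itself: the names `AsIfAction`, `LUIS_GESTURES`, `VIEWPOINTS`, and `possible_viewpoints` are hypothetical, and the viewpoint table compresses tendencies described above ("typically", "may imply") into fixed sets, which is a simplifying assumption.

```python
from enum import Enum


class AsIfAction(Enum):
    ACTING = "performing a practical action, with or without an imagined object"
    MOLDING = "molding an ephemeral object"
    DRAWING = "drawing the shape of an object or the line of a path"
    REPRESENTING = "the hand itself stands for an object"


# Coding of Luis' gestures as described in the text above.
LUIS_GESTURES = {
    "G1": AsIfAction.MOLDING,       # molding a round object
    "G2": AsIfAction.DRAWING,       # outlining a round shape
    "G3": AsIfAction.ACTING,        # placing a small object on top
    "G4": AsIfAction.REPRESENTING,  # flat hand as a flat object
    "G5": AsIfAction.REPRESENTING,  # hand as an object falling down
}

# Viewpoints associated with each as-if action (simplified to fixed sets;
# the text describes these as tendencies, not rules).
VIEWPOINTS = {
    AsIfAction.ACTING: {"character"},
    AsIfAction.MOLDING: {"character", "observer"},
    AsIfAction.DRAWING: {"character", "observer"},
    AsIfAction.REPRESENTING: {"observer"},
}


def possible_viewpoints(gesture_id: str) -> set:
    """Return the viewpoints compatible with the as-if action of a gesture."""
    return VIEWPOINTS[LUIS_GESTURES[gesture_id]]


# Check that all four as-if actions occur in the narration.
modes_used = set(LUIS_GESTURES.values())
print(modes_used == set(AsIfAction))  # prints True
```

Keeping the action coding separate from the viewpoint table makes it easy to revise the latter when a given framework assigns viewpoints differently.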
To analyze gestures in terms of as-if actions addresses the iconicity of gestural form in depictive and pragmatic gestures in terms of their embodied motivation. This perspective on gesture form analysis connects with a theoretical perspective that grounds intersubjectivity of understanding (Cuffari, this volume) in the embodied perception of the moving body (Reference MüllerMüller, 2019b). I have discussed this aspect of gesture form analysis in publications over the past two decades under different labels: iconicity (Reference Müller and SantiMüller, 1998a), modes of representation (Reference MüllerMüller, 1998b, Reference Müller, Müller, Cienki, Fricke, Ladewig, McNeill and Bressem2014a, Reference Müller2017a), and modes of mimesis (Reference Müller, Koch, Voss and VöhlerMüller, 2010, Reference Müller, Zlatev, Sonesson and Konderak2016). The different terms reflect different perspectives on the nature of as-if actions and their grounding of gestural meaning in different kinds of common manual actions.
Approaching the iconicity of gestures resonates with reflections on iconicity of signs in Sign Language Linguistics (Reference Mandel and FriedmanMandel, 1977; Reference TaubTaub, 2001) and with proposals of signed language classifiers (Reference Müller and MalmkjaerMüller, 2009). In gesture studies it connects with Kendon’s discussion of gesture and sign (Reference KendonKendon, 2004, Ch. 15), with Mittelberg’s cognitive-semiotic approach to gesture form (Reference MittelbergMittelberg, 2019a, Reference Mittelberg2019b) and with Reference ZlatevZlatev’s (2014a, Reference Zlatev2014b) work on mimesis, among other things.
Conceiving of gesture forms as as-if actions draws analytic attention to the fact that both depictive and pragmatic gestures operate upon the mode of “as-if.” What is so special about the mode of “as-if?” Instead of performing the action of opening a window, speakers act as if their hands opened a window. The window handle is an imagined handle; the hand shape and movement mime the actual action. Instead of performing the action of physically showing some object on the open hand, speakers act as if some argument were displayed on their open hand. This transformation from object manipulation to acting as if and upon virtual objects characterizes depictive as well as pragmatic gestures. It makes them suitable forms to communicate in the absence of things, actions, and events being talked about.
Some scholars describe pragmatic gestures as “metaphoric” (McNeill, 1992) or “speech-handling” (Streeck, 2009). This terminology risks masking the fact that both depictive and pragmatic gestures are as-if actions (Cienki & Müller, 2014). Both act upon virtual objects and both are recognizable as as-if actions through their abstracted and abbreviated performance of a manual action. Moreover, as we know from metaphor research, gestures are frequently used to depict abstract concepts. Accordingly, the term “metaphoric gesture” is reserved in MGA for depictive gestures used to express the bodily base of abstract concepts (Cienki & Müller, 2008a; Müller, 2008). Consider the following example: In an interview, former US president Obama characterizes the story of America as a kind of “battle” between different ideas of democracy and while saying this he moves his hands and arms as if boxing. His gesture thus depicts the abstract concept of “battle” in terms of a boxing match. Such subjective bodily imaginations are considered here as a bodily base of an abstract concept: Here the gesture metaphorically depicts a political battle as boxing.
When considering pragmatic gestures more closely, we see that it makes sense to conceive of them as as-if actions. Often, experiential roots in practical actions of the hands are quite obvious. Streeck, for example, describes speech-handling gestures as as-if actions (Streeck, 2009, Ch. 8), where the hands act as if manipulating some imagined object and function as handing over the speech to the next speaker. Pragmatic gestures thus present communicative actions as if they were actions operating upon imagined objects. Examples are the palm-up-open hand (e.g. presenting something as obvious, by acting as if some abstract object sat on the open palm, visible to everybody; cf. Müller, 2004) and the family of “throwing away” gestures (e.g. a dismissive movement of the hand, acting as if throwing away some middle-sized object; cf. Bressem & Müller, 2014a, 2014b; Müller, 2017a; Teßendorf, 2013).
To recap: regarding gestures as as-if actions applies to depictive and pragmatic gestures. While they share the mode of as-if, they differ regarding their communicative function: Depictive gestures embody what is talked about, pragmatic gestures incorporate and perform communicative action.
3.4 Analyzing Hand Gestures as Movement
Analyzing gesture forms as movement concerns depictive, pragmatic, and pointing gestures alike. It addresses the fact that, no matter what, hand gestures are movements of the body. To analyze hand gestures as movement presupposes, however, a specific understanding of body movement. Therefore, the toolbox of MGA offers three different ways of accounting for hand gestures as body movements: kinesic form, motion events, and expressive movement. The three perspectives capture different facets of the gestural movement and are not mutually exclusive in the analytic process. Importantly, the selection is not deemed exhaustive. It complements approaches such as Boutet’s kinesiology of gesture and sign (Boutet, this volume; Boutet, Morgenstern, & Cienki, 2016, 2018) and may be supplemented further. This set of tools for gesture form analysis targets level 4 in Figure 8.6.
3.4.1 Kinesic Aspects of Gesture Form
We distinguish four aspects of kinesic form that are simultaneously articulated: hand shape, orientation, movement, and location (level 4 in Figure 8.6, Figure 8.10). Accounting for these aspects of kinesic form targets a level of gesture analysis that is comparable to proposals for the description of sign language phonology. It is inspired by observations concerning sublexical structures of signs (Stokoe, 1960; Wilcox & Occhino, 2016, p. 4). We know from signed languages how important even minor changes in one of these articulatory characteristics may be for the meaning and function of a sign. But how does this apply to gesture analysis?

Figure 8.10 Aspects of kinesic form that are simultaneously articulated
Let us reconsider the gestures used in Luis’ story-telling. Quite clearly, the speaker does not draw upon a lexicon of gesture “lexemes.” For example, the five gestures show an interesting difference with regard to location, that is, where in the gesture space (Fricke, 2007; McNeill, 1992) they are performed. While most gestures (G1, G2, G3, G5) are located in front of the speaker’s body, G4 is performed at the far upper left periphery of the gesture space. This is interesting and may trigger further research questions concerning how the body space is used in gesturing. In this example, it is clear that there is a difference between a semantically relevant location of gestures and a neutral gesture space. While the periphery of the gesture space evokes the actual location of the picture in another room and is thus part of a narrative space, the placement of the other four gestures is not part of the narrative space: For example, the picture in the story was not located in front of the speaker’s chest nor did it fall down in front of his body. Instead, the space is used as a kind of neutral space to depict the shape and falling-down of the picture.
The set of tools marked as “Aspects of kinesic form” may thus become relevant to the analysis of gestures created on the spot (singular gestures) but it applies also to more or less stabilized gestural forms (recurrent and emblematic gestures; cf. Bressem, 2013a; Ladewig, this volume; Müller, 2014b, 2017a). Applying these tools to gesture analysis opens up pathways for a comparative analysis of co-speech gestures, cosign gestures, and historical changes in conventionalized gestures and signs (Müller, 2019a).
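The four simultaneously articulated aspects of kinesic form lend themselves to a simple record per gesture. The sketch below is an assumption about how such a record might look, not a format prescribed by MGA: `KinesicForm`, `LOCATIONS`, `FORMS`, and `differing_in_location` are hypothetical names, and only the location values are taken from the description above (the other three aspects are left unspecified because the text does not report them for this example).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KinesicForm:
    """One record per gesture; None marks an aspect that has not been coded."""
    hand_shape: Optional[str] = None
    orientation: Optional[str] = None
    movement: Optional[str] = None
    location: Optional[str] = None


# Locations as described for Luis' gestures (wording abbreviated).
LOCATIONS = {
    "G1": "in front of the body",
    "G2": "in front of the body",
    "G3": "in front of the body",
    "G4": "far upper left periphery",
    "G5": "in front of the body",
}

FORMS = {label: KinesicForm(location=loc) for label, loc in LOCATIONS.items()}


def differing_in_location(forms, reference):
    """List the gestures whose location differs from that of a reference gesture."""
    ref_location = forms[reference].location
    return [label for label, form in forms.items()
            if form.location != ref_location]
```

Comparing all gestures against G1 singles out G4, the one gesture whose location belongs to the narrative space rather than to the neutral space in front of the body.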
3.4.2 Gesture Form in Terms of Motion Events
This perspective addresses gesture form as a bodily conceptualization of motion in a cognitive linguistic sense (see 4.1 below). There is quite a body of research that shows how gestures contribute to multimodal expression of lexicalized motion events (Talmy, 1985, 1991). Talmy’s typological analysis of lexicalization patterns of motion verbs as conceptual structures, and the motion event as a conceptual structure, inspired a large body of cross-linguistic investigations, including investigations of gestural expressions of motion events (Duncan, 2002, 2006; Kita, 1997; Müller, 1998b, 2015; Özyürek & Kita, 1999; Özyürek et al., 2008). The basic idea is that languages differ typologically in how they lexicalize motion-event structure. For example, German and Spanish differ in where they lexicalize the path of a motion event: in a satellite of the verb (such as a prepositional phrase) or the verb itself, respectively. Kita’s work on Japanese and English use of gestures in motion-event description revealed that gesture usage may reflect those kinds of typological differences. This observation inspired a large body of comparative motion-event based research on gesture and speech, contributing to research on linguistic relativity, that is, the question of linguistic worldviews and their impact on thinking while speaking (Slobin, 1996).
Conceiving of gestures as expressing aspects of a motion event as a conceptual structure – for example, as bodily performance of lexicalized motion events – may be systematized as follows (Müller, 1998b). Hand gestures may enact motion only, for example, when somebody talks about “going to New York” and performs a lax movement with a loose hand. Or they may express motion and path of motion, as when a hand moves down and up to describe the motion and path of somebody going down into a subway station and back up onto the street. Gestures may also express motion and manner of motion, as when somebody moves rotating hands forward to depict how a ball rolls down a street. In Luis’ story, G5 – the gesture that is used to depict the falling-down of the picture – is a gesture that contributes to such a multimodal expression of a motion event: The stroke coincides with the motion verb se cayó (“it fell”), depicting motion and a downward path of the lexicalized motion event.
A further, quite basic, dimension of gestural expressions of motion events is that every gestural movement is a motion event in itself. As such, every gestural performance is either a bounded or unbounded movement. In the example above, a clear case of a gestural expression of a bounded motion event is G5. The movement accelerates and has an accentuated endpoint. (For an elaborate kinesiological framework for gestures as movements of the body, see Boutet, 2001, 2010; Boutet & Cienki, this volume; Cienki & Iriskhanova, 2018.) This characteristic of gestures as body movements may play out in the embodied conceptualization of (lexical and grammatical) aspects as bounded or unbounded event structures (Müller, 1998b, pp. 158–167). Further views on this matter are offered by Duncan (2002) and Parrill, Bergen, and Lichtenstein (2013). In a comparative study of French, German, and Russian, this was applied to the multimodal expressions of aspect (Cienki & Iriskhanova, 2018). In this case, a particularly clear correlation between the grammatical distinction of perfective and imperfective events and the bounded and unbounded performance of coarticulated gestures was found in French speakers.
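The three patterns just described (motion only, motion plus path, motion plus manner) can be represented as combinations of components. The following is a minimal sketch under that assumption; `MotionComponent`, `EXAMPLES`, and `expresses` are hypothetical names introduced for illustration, and the example keys merely paraphrase the cases mentioned in the text.

```python
from enum import Flag, auto


class MotionComponent(Flag):
    """Components of a motion event that a gesture may express."""
    MOTION = auto()
    PATH = auto()
    MANNER = auto()


# Codings of the examples from the text; combinations are formed with "|".
EXAMPLES = {
    "going to New York": MotionComponent.MOTION,
    "down into the subway and back up":
        MotionComponent.MOTION | MotionComponent.PATH,
    "ball rolling down a street":
        MotionComponent.MOTION | MotionComponent.MANNER,
    "G5 (se cayó)": MotionComponent.MOTION | MotionComponent.PATH,
}


def expresses(coding, component):
    """Check whether a coded gesture expresses a given component."""
    return component in coding
```

With this coding, G5 expresses motion and path but not manner, which matches the analysis of the falling-down gesture given above.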
Summing up, the tools for motion event analysis described above rest upon a cognitive-semantic understanding of language use and offer the possibility for semantic cross-linguistic studies of multimodal language use.
3.4.3 Gesture Form in Terms of Expressive Movement
Analyzing gestures as expressive movements takes account of the affective quality of body movements. As Bühler pointed out in his theory of expression almost a century ago, whenever someone makes a gesture, they perform it with a certain quality of movement: A speaker may outline a picture frame with a sloppy, tender, cautious, energetic, harsh, or explosive quality of movement (Bühler, 1933; Müller, 1998b, 2013), as we see with Luis’ sequence of gestures: The molding round shape gesture G1 is made with a sloppy and lax movement quality, while the outlining round shape gesture G2 is carried out with great care and precision. It is possible then to describe the affective unfolding of multimodal discourse events as an intercorporeal, interaffective process of felt understanding between coparticipants in an interaction (Horst et al., 2014).
This set of tools rests upon an experiential understanding of gesture usage and conceives of multimodal utterances as multidimensional experiential gestalts whose meanings emerge in a process of embodied perception (Müller & Kappelhoff, 2018, Ch. 2). Conceiving of gestures as expressive movements implies felt understanding and addresses a quality that gesture and multimodal interaction share with film images, one in which gestures and film images are not seen as sequences of static images but as movement images or as movement gestalts (Müller, 2019b; Müller & Kappelhoff, 2018, Ch. 9). These movement images emerge in the perception of the viewers as an embodied experience of time, or as time-images (Deleuze, 2008a, 2008b; Eisenstein, 1924/1998; Plessner, 1925/1982). As such, they constitute felt intersubjectivity (Horst, Boll, Schmitt, & Müller, 2014; Kappelhoff, 2013, 2014; Kappelhoff & Müller, 2011; Müller, 2019b; Müller & Kappelhoff, 2018, Ch. 9). Analyzing gesture forms as expressive movement is not restricted to one single gesture; rather, it may entail gesture units of variable complexity (Müller & Kappelhoff, 2018, Ch. 2).
Tools for describing gestures as expressive movements thus open up a perspective on gestures as movement images and on multimodal utterances as multidimensional experiential gestalts. They allow for studying the affective qualities of gestures and for reconstructing processes of meaning-making as dynamic intercorporeal processes of feeling and perceiving (Müller, 2019b).
3.5 Summary
The set of tools offered as methods for gesture form analysis addresses four basic aspects of gestural forms (Figure 8.6): (1) the boundaries of gestures as temporal forms, (2) a distinction of pointing from depictive and pragmatic gestures based on hand shape and movement, (3) the embodied motivation of depictive and pragmatic gestures as “as-if” actions, and (4) three possibilities to approach hands as body movement, that is, aspects of kinesic form, hand gestures as motion events, and hand gestures as expressive movements.
Note that the decision as to which aspects of gestural forms are deemed relevant for a given analysis of gestural forms depends upon the research question and the theoretical framework against which it has been formulated. Depending upon the research question, the analytic process might start from a gesture form analysis without sound (Bressem’s 2013a and 2021 work on repetition are examples of this procedure) and only include speech analysis in a second analytic step. Another option is to start with gesture analysis and then move back and forth between a close analysis of the gesture forms and their embedding in contexts-of-use: Examples of this type of procedure are what Bressem and Müller (2017) termed “gesture-first analysis of multimodal constructions,” or the analysis of multimodal metaphors (their dynamics and foregrounding, cf. Müller, 2008; Müller & Tag, 2010). Section 4 illustrates how the theoretical frameworks chosen influence the interpretation of gestural forms as contributions to multimodal utterances.
4 Context Analysis: Gestures in Multimodal Utterances
In this section, the second basic set of tools of MGA is introduced. Although in gesture studies the concept of “context” and the formula “context-of-use” are often used in a rather nonspecific way, it is nevertheless crucial in the analytic process to be explicit about one’s particular understanding of context and to reflect how this relates to the research focus one adopts and its theoretical framework.
Context analysis as an analytic step addresses the linkage of gesture forms with spoken utterances. The specific temporal relation of a gesture form with its context is vital to address “how gestures mean.” It is the synchronization of gestural forms with the flow of speech that is essential to disambiguate and specify the local meaning of gestures. Molding a round shape can be used to express all kinds of round objects, concepts, or even actions, and can take over different communicative functions depending on how the gesture is coordinated with the verbal part of a multimodal utterance. It is essential to the analytic process of MGA to be explicit about the notion of context applied and the theoretical framework adopted. To illustrate this essential relation between context-of-use and the meaning and functions of gestures, six possible ways of approaching context analysis are outlined (Figure 8.11). Box 7 indicates that further theoretical approaches to context analysis are possible.

Figure 8.11 Possible frameworks for context-analysis
The selection presented in this section is not exhaustive but describes the more frequent approaches to context in the current field of gesture studies. Each of these types of context analysis is described briefly, with reference whenever possible to Luis’ story (for context analysis as specific methodology, see Streeck, 2009, Ch. 2).
4.1 Cognitive Linguistics: Context as Usage-Based Grammar and Semantics
A cognitive-linguistic analysis leads to context being considered from the point of usage-based grammar and cognitive semantics (Bybee, 2010; Cienki, 2013, 2017; Croft & Cruse, 2004; Fillmore, 1985; Langacker, 2008). Here, the notion of context may refer to specific semantic contexts such as lexicalization patterns of motion events (Talmy, 1985, 1991) but may also address questions of viewpoint (Dancygier & Sweetser, 2012), conceptual blending (Parrill & Sweetser, 2004), or metaphor and thought (Cienki & Müller, 2008b). Furthermore, protoforms of grammaticalization in gesture have been subject to a cognitive linguistic interpretation of context-of-use (Bressem, 2021; Ladewig, 2014; Müller, 2004, 2017a; Müller & Ladewig, 2013; Müller et al., 2013), as has the gestural expression of aspectuality in terms of event structure (Cienki & Iriskhanova, 2018). Context-of-use in cognitive linguistic terms could also include the grammatical integration of gestures in multimodal utterances (Ladewig, 2020). For a cognitive-semiotic framework for gesture analysis within a larger cognitive-linguistic usage-based approach to context, see Mittelberg’s work (Mittelberg, 2017, 2019a, 2019b), or, taking a slightly different angle, Andrén’s (2010) developmental study.
In the analysis of “hand gestures as movements” (Section 3.4.2), the motion-event perspective is formulated within a cognitive semantic framework (Talmy, 1985). The relevant “context” for gesture analysis is here a typological analysis of lexicalization patterns apparent in the motion verbs that are coarticulated with gestural expressions of motion events. This would lead to seeing Luis’ falling-down gesture G5, which is temporally synchronized with the motion verb se cayó, as expressing a path (downward). Analyzed with a cognitive semantic understanding of “context,” the gesture would be analyzed as following the specific lexicalization pattern of Spanish motion verbs which merge motion and path.
Another potentially relevant context within a cognitive-linguistic framework would consider the grammatical notion of “aspect” as a conceptualization of an event-structure that distinguishes bounded from unbounded, or perfective from imperfective, events. Luis’ gesture G5 would then be considered as a case of a bounded gesture. The analytic focus for the gestural movement would be its marked endpoint. When “aspectuality” is taken as the relevant usage-context for gesture analysis, a given gestural movement can have various functions at the same time. It can express a downward path and at the same time be performed in a bounded or unbounded manner.
These examples illustrate that form and context-analysis merge and are informed by the specific theoretical frameworks chosen; here, Talmy’s cognitive-semantic typology (Talmy, 1985) and a cognitive-linguistic understanding of aspectuality (Cienki & Iriskhanova, 2018).
4.2 Conversation Analysis: Context as Social Action
Considering gesture form analysis from the point of view of conversation analysis implies an understanding of context as social (inter)action. It highlights the temporal character of gestures as part and parcel of the sequentially structured interactive processes of conversations’ social organization (Mondada, 2013a, 2013b). It reveals how gestures participate in the taking of turns (Bohle, 2007; Schmitt, 2005) and in cooperative actions (Goodwin, 2018), and shows how they contribute to a wide range of embodied communicative activities (Streeck, 2009, 2017). Connecting gesture form analysis with conversation analytic perspectives may help to answer questions such as: Why does the speaker use the gesture “tracing a round frame” to depict this object at that very moment in the conversation? Why doesn’t he simply present the object on his open hand or not use a gesture at all, given that he provides similar “information” verbally: “a round frame”?
Considering the structure of turn-taking as a social activity, we see that both gestures (G2, G3) are coordinated with the core piece of the conversational turn: the turn-constructional component (Sacks, Schegloff, & Jefferson, 1974). Positioned at another site in the conversational sequence, they could function as turn-entry or turn-exit devices. At the beginning of the speaker’s turn, they would indicate the wish to become the next speaker; at the end of the turn, they would complete the turn, for instance, by filling up a speech-pause. If they were placed in the transition space between two conversational turns, they could indicate the wish to maintain the right for the succeeding turn. Furthermore, the fact that the two gestures are placed in synchrony with the verbal turn-constructional component indicates that they belong to the most relevant part of his turn. By doing this, the speaker draws interactive attention to particular semantic aspects: the roundness of the frame and the fact that it carries a small crown, marking this information as particularly relevant for his attending coparticipant (Müller & Tag, 2010).
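The relation between sequential position and potential function described in this paragraph can be summarized as a simple lookup. The sketch below is purely illustrative: `TurnPosition`, `POTENTIAL_FUNCTION`, `OBSERVED_POSITION`, and `potential_function` are hypothetical names, and the function descriptions condense the wording of the paragraph rather than adding any claims of their own.

```python
from enum import Enum


class TurnPosition(Enum):
    """Sites in the conversational sequence at which a gesture may be placed."""
    TURN_BEGINNING = "beginning of the speaker's turn"
    TURN_CONSTRUCTIONAL_COMPONENT = "core piece of the turn"
    TURN_END = "end of the turn"
    TRANSITION_SPACE = "between two conversational turns"


# Potential functions by position, condensed from the paragraph above.
POTENTIAL_FUNCTION = {
    TurnPosition.TURN_BEGINNING:
        "turn-entry: indicates the wish to become the next speaker",
    TurnPosition.TURN_CONSTRUCTIONAL_COMPONENT:
        "marks information as part of the most relevant part of the turn",
    TurnPosition.TURN_END:
        "turn-exit: completes the turn, e.g. by filling up a speech-pause",
    TurnPosition.TRANSITION_SPACE:
        "indicates the wish to maintain the right for the succeeding turn",
}

# Observed placement of the two gestures discussed here.
OBSERVED_POSITION = {
    "G2": TurnPosition.TURN_CONSTRUCTIONAL_COMPONENT,
    "G3": TurnPosition.TURN_CONSTRUCTIONAL_COMPONENT,
}


def potential_function(gesture):
    """Look up the potential function of a gesture from its observed position."""
    return POTENTIAL_FUNCTION[OBSERVED_POSITION[gesture]]
```

The lookup only restates the heuristic; in an actual analysis, the function of a gesture has to be established from the sequential details of the interaction.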
Extending the scope of analysis and taking into consideration the larger conversational unit in which this little sequence of gestures is embedded, it becomes clear that the two gestures are part of a conversational activity of explaining something. Notably, this explanatory sequence is a consequence of a longer process of word searches (on the side of the speaker) and requests for clarification on the side of his interlocutor. In this sense, the extremely precise verbal and gestural description of the kind of picture frame, that figures as “main topic” in a narrative of the speaker, is a consequence of the conversational activities preceding them. The foregrounding activity therefore appears as an interactional consequence.
A conversation analysis of gesture forms reveals why the speaker places his gestures at this moment and place in his utterance. It focuses on gestures as communicative social activities and uncovers the concatenation of cooperative activities of coparticipants. Context for gesture form analysis is here the social organization of language use.
4.3 Discourse Dynamics: Context as Dialogic Process of Creating Mutual Understanding
Gestures often take part in dynamic processes of meaning-making. This concerns very small sequences of gestures, such as the succession of G2 and G3 (round-frame drawing followed by crown-placing), but it may also concern the use of gestures over larger time spans, as our example of Luis’ story shows. Used in a loose succession, each of the five gestures displays a different perspective on the same object and participates in different ways in creating the storyline: G1, G2, and G3 introduce the key object in the story (the royal portrait) by forming its shape and character as a royal portrait; G4 prepares the plot, that is, it locates the portrait in another room; and G5 is part of the plot, that is, it depicts how the portrait fell down all by itself. With these gestures, a salience structure on the level of the narrative is established, highlighting new rather than given information; information is foregrounded that is relevant and central to ensuring mutual understanding of the story told in a conversation.
Conducting gesture analysis from a discourse dynamics point of view starts from an understanding of context as a dialogic process of attempted mutual understanding. Cameron’s metaphor-led discourse analysis (Cameron et al., 2009) offers a theoretical and methodological background for a form-based gesture analysis (Kappelhoff & Müller, 2011; Müller & Kappelhoff, 2018). Another example of a discourse-based concept of context in gesture studies is McNeill’s adaptation of “communicative dynamism,” which addresses narrative structures as dynamically unfolding within discourse (McNeill, 2005, p. 55).
In short, analyzing gesture forms from the perspective of discourse dynamics addresses the contribution of gestures to various dialogic processes of building mutual understanding.
4.4 Expressive Movement: Context as Multidimensional Experiential Gestalt
An analysis of gestures as expressive movements starts from the assumption that the multimodal orchestration of speech and body movement forms multidimensional experiential gestalts which arise in the process of felt perception of interlocutors (Müller & Kappelhoff, 2018). Contexts-of-use from this perspective are shared realms of experience, or experiential frames, which ground embodied meaning (Müller, 2016, 2019b) and, when they recur within a community, may become the stabilized “meaning” of a gestural movement (Ladewig, this volume; Müller, 2017a).
Analyzing gestures in their contexts-of-use from a perspective of expressive movement assumes an understanding of gestures as temporal forms immersed in dynamically evolving contexts-of-use. As outlined and illustrated above (Section 3.4.3), conceiving of contexts as multidimensional experiential gestalts opens up a way of analyzing gesture forms as facets of embodied interaction where intersubjectivity arises from intercorporeal and interaffective understanding (Müller, 2019b; Müller & Kappelhoff, 2018). Such a perspective is in line with ecological approaches to gesture analysis (Cuffari & Jensen, 2014; Jensen & Greve, 2019) or approaches to intercorporeality (Meyer, Streeck, & Jordan, 2017) and interaffectivity (Fuchs, 2017; Horst et al., 2014; Koch, Fuchs, Summa, & Müller, 2012).
Conceiving of gestures as expressive movements connects with Kendon’s analysis of gestures as movement phases, phrases, and units, that is, as temporal structures (Kendon, 2004, p. 112). To define the “stroke” of the gestural movement, Kendon draws on Laban’s dance theoretical analysis of body movements as expressive movements (cf. Kennedy, 2013). The concept of expressive movement has played important roles in the history of philosophical reflections on gesture: Wundt (1973) developed a monadic, solipsistic understanding of the concept; Bühler (1933) saw expressive movements as communicative actions; and Plessner (1925/1982) and later Merleau-Ponty (1945/2005) underline the intercorporeal and interaffective nature of expressive movements. Granted by bodily perception, understanding is grounded in feeling movement and in moving together. Notably, this notion of expression was central to modern dance as well as to film theory, both facing an inherently temporally structured medium of expression (Kappelhoff & Müller, 2011; Müller, 2019b; Müller & Kappelhoff, 2018).
Such theories offer paths toward reconstruction of the “meaning” of gestures from expressive movements as multidimensional gestalts in dynamic contexts-of-use. It remains fascinating how, through Kendon’s original analysis of manual co-speech gestures, the temporality of gestural form has created a common ground for analysis in the field of gesture studies.
4.5 Metaphor Research: Context as Multimodal Expression of Metaphoricity
Including gestures in metaphor analysis involves a semantic perspective on the coarticulated verbal utterance. In verbo-gestural metaphors, gesture and speech work together in expressing metaphoricity (Cienki & Müller, 2008a, 2008b; Müller, 2008, 2017b). Gestures often enact the source domain of a verbal metaphoric expression. For example, when Barack Obama speaks about those American people who look back and cling to the past, and points backward over his right shoulder, the gesture locates “looking back in time” in the space behind the speaker, that is, “Back in Time” is seen as “Back in Space.” In this context-of-use, the gesture is an expression of time as space, where the past is associated with what is behind a speaker and the future is what lies ahead, as is typical for western European languages. Núñez and Sweetser (2006) have documented how such conceptualizations of time may vary across cultures: For Aymara speakers of the high Andes, they demonstrated that the spatial location of future and past is reversed, and this is reflected in their gesturing.
A careful analysis of gesture forms and their temporal unfolding along an interaction may reveal that sometimes metaphoric verbal expressions evolve from gestural body movements performed in the absence of speech. For example, in a dance training study, the notion of the dancer as standing in the center of a coordinate system emerges as an embodied conceptualization of balance and anchoring (Müller & Kappelhoff, 2018, Ch. 11).
Analyzing gestures in the context of multimodal expressions of metaphoricity uncovers how metaphors are experienced while speaking, whether – at a given point in time in a discourse – they are foregrounded, “alive and awake” for a given speaker, whether they receive interactive attention, and whether they are taken up and unfold within the dynamics of discourse (Cameron, 2008; Cameron et al., 2009; Müller, 2008; Müller & Kappelhoff, 2018; Müller & Ladewig, 2013).
4.6 Pragmatics: Context as Communicative Action
Applying a pragmatic perspective to gesture form analysis assumes an understanding of context as communicative action where language use is conceived of as performance of speech acts (Austin, 1962; Searle, 1969). Although speech-act theory has received a lot of criticism, the fact that we “do things with words,” as Austin put it, has inspired much gesture research. Kendon characterizes gestures as “visible actions” that are used as utterances and addresses the pragmatic dimensions of gestures in manifold ways (Kendon, 1992; Kendon, 2004, Chs 9, 10, 12). For example, his pragmatic analysis of the ring gesture (index and thumb forming a ring shape) focuses on the illocutionary force or more generally on the communicative actions performed with a given gestural movement (Kendon, 2004, Ch. 12).
For the ring gesture, this means that it is only because the gesture is performed in the context of a communicative action that the action of picking up some tiny object can be transformed into a communicative action of making precise arguments (Müller, 2017a). When such a gestural form is repeatedly used within similar contexts-of-use, these repetitions may stabilize and become a conventionalized gestural form, an emblem, as is the case with the ring gesture used to express excellence and perfection in many Western cultures (Müller, 2014b). Another example is the palm-up-open-hand gesture that may be used to depict the handing over of small objects lying on the palm of the hand, or may be used to hand over speaking turns (Streeck & Hartge, 1992), and thus have a pragmatic function (Müller, 2004). This means that the analytic decision about whether such an as-if movement is used referentially or pragmatically depends upon its placement in a given context-of-use.
Connecting gesture form analysis with a pragmatic perspective addresses gestures as communicative action. It opens up a path to reconstructing the embodied history of some of them, a kind of “etymology” of conventionalized gestures as in the ring or the palm-up-open-hand gesture. It may document the spontaneous emergence of gestural communicative actions from manual actions (Streeck, 2009, 2017), reveal processes of decomposition of gestural forms (Müller, 2017a), and show how gestures contribute to multimodal constructions with a pragmatic meaning (Bressem & Müller, 2017; Ladewig, 2020).
4.7 Summary
Section 4 has illustrated how context analysis implies a decision to work within a specific theoretical framework. To illustrate potential (and currently common) understandings of context for gesture analysis, a selection of frameworks has been briefly presented: (1) cognitive linguistics, (2) conversation analysis, (3) discourse dynamics, (4) expressive movement, (5) metaphor research, and (6) pragmatics. Some of the tools for gesture form analysis are intrinsically connected with a specific framework: For example, describing gesture form as a motion event implies a cognitive-linguistic understanding of context, whereas analyzing gesture form as expressive movement implies a notion of context where gestures, other body movements, and speech merge in a multidimensional gestalt. Other “tools” go along with several theoretical frameworks: For example, noting down aspects of kinesic forms can be relevant for all theoretical approaches; together with the description of hands as performing as-if actions, they allow for a form-based heuristics of gestural meaning and a gesture-first approach to the analysis of multimodal utterances. The framework chosen can be an entryway to analyzing different aspects and kinds of gestural meaning. Whether a certain gesture form is to be considered as depicting the concrete or the abstract, or whether it is used pragmatically, depends upon the context-of-use with which it is entangled. Applying different frameworks in form-based analysis thus uncovers different ways in which gestures contribute to multimodal utterances in unfolding interactions.
5 Summary and Conclusion
In this chapter, we have offered a toolbox of Methods for Gesture Analysis, and presented a set of possible takes on the analysis of hand gestures. The baseline of any analysis is an explicit account of the gesture form as a temporal form and its temporal integration in the dynamic unfolding of a discursive event. MGA provides the researcher with sets of tools that can and must be flexibly adapted to a given research interest. It offers researchers tools to creatively design their specific procedures and helps to make explicit the different potential frameworks for gesture analysis. Rather than prescribing a fixed set of procedures, or advocating a single specific theoretical approach to gesture analysis, MGA offers a flexible set of analytic tools that encourages critical reflection upon the insights one can gain from analyzing gestures in multimodal communication and interaction. It opens up a way to think about gesture analysis as creative and theoretically variable.
MGA encourages the researcher to creatively select, combine, and further develop tools such as those presented here. Growing expertise with the tools will probably create a need for more refined analytical tools for gesture form analysis and spark curiosity about the theoretical frameworks and their respective notions of context. The toolbox also allows extending the analytic scope to macrolevel analysis.
Further questions that are essential for conducting empirical research on gestures in multimodal utterances, but which have not been discussed in this chapter, include technical aspects of video technology related to data collection (gesture phase analysis depends upon the resolution of the video image), annotation systems for speech and gesture, software annotation tools, and motion-capture technology (Boutet & Cienki, this volume; Bressem, this volume; Trujillo, this volume).
Applying MGA to Luis’ story has revealed how analyzing the same piece of data from different methodological and theoretical viewpoints highlights different facets of the same little story. It shows that multimodal interaction is a highly complex phenomenon which cannot be accounted for by applying one and only one tool for gesture form analysis and one particular understanding of context. Whatever we select as a descriptive tool will reveal different facets of this fascinating phenomenon. The goal of the toolbox of MGA is to enable researchers to be explicit about their chosen perspective and to underline the relativity of any analytic attempt to understand this complex “thing” that we call gesture and speech, multimodal utterance, multimodal communication, or multimodal interaction.