Lip-Synching as an Account of Oneself: Digital Music Videos and the Voice–Gesture Relationship

DANIEL DE ANDRADE LIMA

doi:10.1017/rma.2024.27

Published online by Cambridge University Press: 20 January 2025

Abstract

Digital-era music videos are a crucial part of singers’ mediatic performances. Lip-synching is often central to such products, supplying situations in which singers can mouth their voices while dislodging themselves from the struggles of singing. Looking into music videos by focusing on their lip-synching practices, this paper aims to understand the part voice takes on in the medium while also investigating how gestural lip-sync performances work as accounts of oneself that produce a musical subject, sometimes updating or overcoming social regulations. In this sense, lip-synching is theorized as a way of framing music videos’ gestural labour.

Type: Article
Information: Journal of the Royal Musical Association, Volume 149, Issue 2, November 2024, pp. 291–314

DOI: https://doi.org/10.1017/rma.2024.27
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of the Royal Musical Association

Part I

Towards the Intimacy of Lip-Synching

In the music video Rager Teenager!,Footnote ¹ Troye Sivan pops up on the screen just in time to casually start lip-synching. ‘Hey, you! Where you been hanging out lately?’, he mouths, lying in a bathtub, his shirt slightly open and his head resting lazily against a wall, while his digitalized voice sings in an almost spoken vocal mode. Throughout the video, we can see Troye playing with what seems to be an engagement ring and oscillating between a seemingly more optimistic mood and an angry one. He sometimes misses some of the vocal lines, reminding us that he is not actually singing but jamming sadly to his own voice. He addresses us through the camera at some points, highlighting parts of the lyrics through his lip-sync performance in an evocative way: ‘why are you acting like a stranger?’. He even appears to scream at one specific moment, through a blurred image, while a digital sound cries loudly behind the song’s main vocals. ‘Would this strange digital sound be heard as a scream had Troye not lip-synched it as one?’, I ask myself.

Less than a month before, Troye Sivan had released his music video for ‘Easy’, in which, once again, he appears mostly alone, lip-synching his voice in an intimate and moody mode.Footnote ² Switching between subtle vocal fries and slightly nasalized melodies, Troye’s soft and low voice highlights the break-up lyrics of his songs. At the same time, his video performances help to reassemble his recorded voice into his body through lonely, sad, and sometimes crying gestures. A brief switch from YouTube (the online platform on which both videos were released) to his profile on Instagram could quickly make fans aware that he had ended his public relationship with his boyfriend and returned to his parents’ home in Australia during the Covid-19 outbreak of 2020. Singing about break-ups through lonely videos directed by himself, Troye seemed to stage a confession of some kind by performing an intimate relationship with his own recorded voice. In this sense, as in many other music video performances, the fact that the singer is performing through lip-synching and not simply singing seems to matter.

What Troye Sivan’s videos help introduce is how lip-synching in music videos has been used as an expressive device that sometimes projects senses of intimacy or takes part in a confessional statement. At first, Sivan’s videos seem to appear as intimate accounts of his own emotions because of the way they are aesthetically made, assembling his lonely image with close-ups and subtle lip-sync acts. Of course, this intimacy also has to do with how digital media have impacted how we experience music: we can watch Troye’s videos on our cellular devices, holding his face close to ours, caressing the screen due to its tactile technology, listening to his voice soar directly into our ears. In a way, music videos are important media products that seem to project the real bodies of the artists into the listener’s own space. As scholar Thiago Soares has claimed, ‘If we take as a central issue the principle that a music video “informs” a mediatic semblance, that is, it generates a body in the media, there is nothing more symptomatic than recognizing that, since the audiovisual itself is a body, it obviously belongs to the artist who stars on the video.’Footnote ³

Given this context, this essay is interested in approaching music videos of the digital era by highlighting how they often project a sense of publicization of the self. To do so, I focus on the effect that lip-synching has on the way an artist’s recorded singing is reinscribed onto their bodies. It is understood, after all, that, through playback technology, singers can often establish gestural relationships with their voices that are not connected with the efforts of singing. Stressing the relations between visible and vocal gestures, I am interested in understanding how lip-synching and its performative engagements work as accounts that deal with artists’ broader narratives. In other words, I will argue that framing lip-synching as multiplying artists’ self-narration can help emphasize music videos’ performative struggles. By following this path, I take into consideration perspectives that investigate how artists’ personae are constituted through an assemblage of different media and texts. To analyse music videos’ gestures, after all, we need to be attentive to how a video’s particular performative efforts can establish connections to wider biographical narratives — as Troye Sivan’s post-breakup videos helped introduce.

Drawing from Judith Butler’s theory on accounts of oneself,Footnote ⁴ Brazilian authors Jeder Janotti and André Alcântara have examined how music videos, especially in the digital era, work as accounts through which musical subjects negotiate music cultures’ ethical norms.Footnote ⁵ Moving their arguments forward, it is my intention to emphasize the audiovisual aspects of such accounts. In this sense, when analysing music videos through a lip-sync lens, I will demonstrate that lip-synching can become an account of oneself on at least two levels. On the first one, all the gestural efforts employed in the lip-sync act can already be understood as an account that projects a persona. After all, the particular ways through which voices and visible gestures are assembled can stand for different things in each specific music video. On a second level, the way music videos are produced in terms of narratives and how they connect with other performances in media also appears as an account in one broader sense. To put it differently: when framing lip-synching as accounts of oneself, we are both analysing a video’s specific gestures and investigating how such gestures are connected to the discourses that sustain artists’ larger mediatic apparitions.

To develop such arguments, this paper is organized into two parts. The first part builds the theoretical path required to investigate lip-synching in music videos, discussing music videos’ post-TV status, theories on musical personae, and the expressive potentials of lip-synching. In the second part, a small collection of music videos will be analysed to demonstrate the importance lip-synching can take on in singers’ self-narrations. In this sense, an analysis of Beyoncé’s Grown Woman (Bonus Video) helps posit music videos’ biographical enterprise while exploring how lip-synching stresses that music videos’ gestures are mediatic ones. Clarice Falcão’s Survivor highlights the connection between lip-synching and vocal expressivity, supplying materials to further develop the relationship between playback technologies and gendered notions of intimacy. Lastly, Janelle Monáe’s Cold War takes us to the moment when lip-synching collapses, exposing lip-synching tropes and their racialized and gendered connotations. In conclusion, I argue for lip-synching as a critical tool to investigate artists’ accounts of themselves due to how such performances update, tackle or reinforce particular aspects of artists’ wider personae.

Post-MTV Music Videos

Most discussions on music videos highlight the importance MTV and other music television networks had on the marketing and aesthetics of such audiovisual products. In his book Music Video After MTV, for example, Matthias Bonde Korsgaard criticizes the often-spread idea that music videos started on MTV by addressing previous cinematic and film experiences. He cites media such as silent films, musical and animation movies, and the films contained in visual jukeboxes of the 1940s.Footnote ⁶ Thiago Soares, who mainly discusses television-era music videos and their resonances in digital works, also investigates practices from the beginning of the twentieth century.Footnote ⁷ These include recorded jazz performances of the 1920s and other promotional videos accompanying musical releases in the following decades. Such studies help us understand that music videos ‘came into existence only gradually and as the offspring of many different precursors’.Footnote ⁸ Both Soares and Korsgaard, however, seem to agree that MTV helped to establish a way of marketing singles through audiovisual media and to cement the kind of importance music videos would come to have in artists’ careers. Of course, taking music videos as a medium also requires framing them in relation to how they remediate other media. In this scheme, music videos are constantly being altered by both new and old media while also altering them interchangeably.

Most of the time, when understanding music videos as a medium, we are considering corporate audiovisual products that have been officially produced as music videos. Another approach would be to frame as music videos those videos to which audiences attribute that function, even though they were not necessarily produced by an official studio or by an artist. Carol Vernallis considers this possibility by addressing vernacular videos, parodies, remixes, film trailers, and various videos that could be taken as music videos.Footnote ⁹ She seems less interested in defining music videos and more in investigating how different clips can project, update, emphasize and overcome sensibilities that seem to have spread with the rise of digital platforms and their specific interfaces. Korsgaard also follows this lead by assembling what he calls the ‘post-music video’, referring to interactive videos, remixes, remakes, and user-generated content, among other possibilities, but mainly analysing videos that are somehow officially connected with bands or artists.Footnote ¹⁰

While Vernallis’s and Korsgaard’s discussions revolve around aesthetic effects evoked by music videos, others have addressed the medium by approaching the digital environment in which they circulate more extensively. In her work on music videos, Simone Pereira de Sá engages with the way videos circulate and create connections through digital platforms instead of analysing the aesthetic audiovisual features of an ensemble of products. Her definition of a post-MTV music video is on point, for it refers to

a heterogeneous set of productions that circulate preferably on the YouTube platform, spreading through other environments; and which covers a set of audio-visual fragments of heterogeneous origins ranging from a video of a concert posted by a fan, going through the infinity of parodies, tributes, and reaching the ‘professional’ videos that promote the new songs of singers with (more or less) established careers.Footnote ¹¹

Framing such ‘post-MTV music videos’, Pereira de Sá often investigates how music videos are approached, discussed, and shared in different digital environments, discussing how such mediations are part of the music experience with new media. In this context, it is important to consider YouTube’s archival properties, which allow the circulation of both old and new audiovisual products, reconfiguring their lifetime. In one sense, the different audiovisual products that can be taken as music videos’ precursors are archived and constantly revived on YouTube in a way that they end up working as digital-era music videos — which reiterates how music videos both remediate and are remediated by other media. In another sense, YouTube also assembles interviews, film scenes, and other audiovisual materials that become intertwined with our experiences of listening to music.

Given this panorama, it is necessary to consider how mediations that reconfigure music videos in the digital era impact the way music videos are produced and watched. It is important to consider such mediations even when discussing ‘official’ or ‘corporate’ music videos. The sense of intimacy we might feel by watching Troye Sivan’s music videos, for example, seems to be highlighted by how these audiovisual products are inscribed in digital platforms. Swapping from platform to platform, users connect quotidian photos and videos on Instagram, interviews and tabloid news (reposted and discussed on different social media), and music videos (fully uploaded on YouTube, but with bits and pieces scattered through other platforms). Official music videos can sometimes be taken as revealing due to how they relate to other mediatic apparitions in interviews and on social media that allow fans to engage with narratives about relationships, coming-of-age struggles, and solitary feelings, to cite some. In this context, it is this assemblage of media materials that connects us with artists’ subjectivities.

Framing Personae

Different authors have discussed the music video’s particular potential to build the subjectivities of an artist. Thiago Soares has, as mentioned before, posited a ‘mediatic semblance’ that is projected from the music video performance, understanding it as a way of building an artist’s body on the surfaces of media; it is in this sense that a music video can, after all, ‘perform’ a song. Other approaches revolve around the notion of ‘persona’; Philip Auslander has discussed the theme in his work on music videos, suggesting a ‘performed role that is somewhere between a person’s simply behaving as themselves and an actor’s presentation of a fictional character’.Footnote ¹² In this sense, artists build personae both within and outside musical performances.

Lip-synching plays an important part in how music videos can present personae. Indeed, Auslander argues precisely that lip-synching acts ‘provide performers with good opportunities to define and extend their personae’.Footnote ¹³ After all, to think about how singers perform their own voice in music videos is to investigate prolific performative moments of how they express themselves vocally and gesturally. This is true when a music artist is trying to conceal their lip-synching and convince us that they are actually singing, but also when they are intentionally dislodging their gestural performances from their previously recorded sounds for some reason.

In any case, we need to acknowledge that all music videos that showcase performances of ‘singing’ are lip-synched, unless they originated from recordings of live performances. In fact, the way videos have dealt with lip-synching has often been playful and relates to many ways of approaching voices in audiovisual media. In some videos, as in Lady Gaga’s 911 or Lizzo’s Truth Hurts, voices travel through different mouths to emphasize specific narrative situations.Footnote ¹⁴ In these cases, lip-synching helps create a dialogue or give life to different characters by going from one person to another. In other videos, as in BTS’s and Halsey’s Boy With Luv or Blackpink’s Kill This Love, each singer lip-syncs their own voice, claiming their vocal authorship and helping the public discern whose voices they are listening to.Footnote ¹⁵ Sometimes, only some particular song parts are lip-synched, highlighting precise words or vocal efforts. This happens in varied genres, from Brazilian indie band Guma’s Jugular to countertenor Jakub Jozef Orlinski’s Vivaldi: Stabat Mater (Official trailer).Footnote ¹⁶ In each of these cases, it is not only the fact that artists are lip-synching that matters but also how voices are claimed through particular and expressive gestural performances that assemble different audiovisual resources.

In this sense, the ways artists lip-sync in music videos can stand for different facets of how they articulate themselves with their personae. When Troye Sivan appears sadly jamming to his own voice in his post-breakup music videos, for example, he detaches his oral gestures from the act of singing in a way that projects loneliness and lack of vitality. This gestural approach differs significantly from other ways of dislodging mouth gestures from vocal sounds. In Beyoncé’s Single Ladies (Put a Ring On It), for example, the singer’s vivacious dancing is matched with a lip-synching that claims her ownership of the main vocals in the song without, however, portraying believable efforts of singing.Footnote ¹⁷ In this sense, neither Beyoncé nor Sivan attempts to convince us of the authenticity of their ‘live’ singing, but each lip-synching projects different expressivities. Sivan’s gloomy mood helps reinforce his sadness, while Beyoncé’s fierce dancing projects her bodily vitality.

That said, when discussing artists’ personae, we cannot lose sight of how such narratives are built from a network that connects a plurality of discourses, especially when investigating personae in digital music videos, which hold connectivity as a principle. We can understand Beyoncé’s dancing lip-synching as ‘fierce’ and Sivan’s bored mouthing as ‘gloomy’, after all, because we have access to how these artists present themselves in other situations and in relation to their different musical styles. In this sense, their gestures become legible through a plethora of culturally understood gestures and signs. Sivan is often connected to indie pop’s introspective traditions, while Beyoncé came from a dancing pop and R&B background that helped build her pop diva persona. This way, our expectations about specific music genres can be contemplated or surpassed by specific music videos’ performances as we are informed by how such videos are articulated with a broader group of texts. Different scholars have approached the issue through varied theoretical enterprises.

William Echard’s approach, for example, invests in Julia Kristeva’s theory of intertextuality to understand how artists’ identities are constituted by the agency of both subjects and texts.Footnote ¹⁸ His perspective is particularly interesting because he posits personae as subjectivities that are unstably forged through the dialogical connections between heterogeneous texts. Similarly, but adding a different theoretical path to Echard’s scheme, Kai Arne Hansen also understands that personae are multiply constructed, arguing that ‘any text can serve to augment a listener’s knowledge of the persona, regardless of the listener’s intention to approach the text for that purpose’.Footnote ¹⁹ His argument involves taking a transmedial approach to the persona theory, given the ways artists’ mediatic apparitions are spread through different platforms and media — as Troye Sivan’s break-up anecdote helped introduce. In this sense, Hansen is deeply tuned to how our experiences with popular music can rely on a digital network of additional texts that inform our listening, which resonates with Pereira de Sá’s take on music videos.

Digital Accounts of Oneself

Music videos play an important part in the networks that build artists’ personae. According to Jeder Janotti Jr and João André Alcântara, they in fact act upon the construction of such networks. Music videos, after all, model ‘connected listening’, ‘a set of heterogeneities that integrate music, audiovisual products, interviews, and participation in films and soap operas, among other materials, assuming plots of intrigue (narrations) that may even have the music video as a basic element, but that is not limited to it’.Footnote ²⁰

In such a context, interacting with digital platforms, screens, cameras, microphones, genre rules, and sonic environments assembles different modes of interaction for users. These interactions are constituted through the engagement of varied bodily postures and acts of narrations due to the ways digital platforms stage gestures, choreographic traditions, and discourses about us and others. All these gestures and narratives do not simply mask the subjects but constitute them through diverse, sometimes incoherent, acts. In a connected listening context, music videos appear as one of the important acts through which artists project not a finished persona but one contingent account of themselves. In music videos, each vocal and visual gesture — and the relation between them — is a term from which a performative narration is built. Since lip-synching is a performative mode often used to connect artists’ visible and vocal gestures, it also appears as an important locus of investigation of the terms artists use to create their mediatic selves. However, this does not mean lip-synching gives access to artists’ intentions nor that their personae are mere voluntary creations. As Hansen and Echard have thoroughly discussed, a persona should not be understood as one schematic construction conducted solely by the artist.

Trying to handle complex narratives built by conjunctions of performative acts while also disengaging from a fiction/reality dichotomy, Janotti and Alcântara use an exciting term to deal with corporate or official music videos. Inspired by Judith Butler’s Giving an Account of Oneself, they posit how such products have been working as ‘accounts of oneself’ that become intertwined with other performative accounts that emerge through different media. In Butler’s discussions, giving an account of oneself is a critical instance of how subjects both produce themselves and forge coherence into their heterogeneous actions, sensibilities, and experiences. The philosopher understands that ethics plays a vital role in this process. Ethics provides, after all, the materials with which we can narrate ourselves, whether from the terms available in our aural repertoires or by supplying the standards we use to assess the behaviours we refer to when talking about ourselves. On the other hand, it is precisely in an attempt at vital assimilation with ethical norms that subjects end up fragilizing themselves when they compose themselves as an ‘I’, given that our experiences are reshaped as we tell them. In other words, the fact that accounts of oneself always fail to fulfil ethical norms can expose the contingency of subjection.

Most normative ideas of subjectivity idealize the subject as one essential, coherent, and authentic being. In this sense, subjects’ contingencies help stress people’s fragile stability. Also, the terms we use in our acts of narration were not founded by us and, therefore, need to be constantly questioned and revised as we use them. Of course, an account like this exists amid what Butler calls ‘scenes of address’, which put people in contact with each other and with societal regulations. In a way, we are repeatedly required to explain who we are, negotiating with the ethical values implied in each scene of address. In Butler’s theory, ‘the “I” has no story of its own that is not also the story of a relation — or set of relations — to a set of norms’.Footnote ²¹ In this sense, an account of oneself always requires some deliberation on the conditions from which it emerges while also including these very conditions in the account. Through this complex scheme, though, to narrate our lives is always a gesture of self-poiesis, in which ‘I am always recuperating, reconstructing, and I am left to fictionalize and fabulate origins I cannot know’.Footnote ²²

To take music videos as accounts of oneself, however, we need to consider that they are constituted mainly through gestural efforts and, therefore, are not necessarily verbal accounts. This is foreshadowed by Butler’s arguments, which concern how the body is formed through gestural regulations that stem from scenes of address. As Alcântara and Janotti posit,

In addition to what we say when we are questioned, silence, facial expressions, positionalities, and bodily presence are also part of our accounts. This is because, in Butler’s view, if we can read such gestures, if we have the ability to understand what a nod of the head or a raising of eyebrows means, it is because these gestures are embedded in the same codes and languages that contextualize them and socially contextualize us.Footnote ²³

Following their arguments, it is important to understand that vocal traditions, citational gestures, familiar camera framings, and refusals to engage with mediatic tropes constitute the ‘terms’ from which an account emerges in music videos. Lip-synching interconnects gestures, vocals, and the employment of audiovisual technologies, highlighting some of popular music’s performative traditions. When lip-synching, after all, artists make embodied choices in specific ways and produce their audiovisual bodies through particular employments of playback technologies. To use the voice in one way (and not in another) or to lip-sync a vocal recording in one way (and not in another) projects the choices artists make when giving a performative, musical account of themselves. In this sense, I argue that music videos’ lip-synching produces an account of oneself through the way such performances help lend bodily density to artists’ personae.

In such a context, music genres provide artists with many of the norms to which they have to answer. The ‘terms’ we expect Troye Sivan to use, for example, are generally different from the ones we would expect from a rapper like Missy Elliott. This touches on the music genres they are articulating themselves with and the racialized and gendered narratives such genres carry. Music genres, after all, ‘have histories and personalities and almost seem to exert a form of agency in the way they constrain and guide social actors’.Footnote ²⁴ Is it not true that the ways Black women appear in hip hop and R&B culture tend to have specificities that differ from Sivan’s white figure in his gay-themed indie pop? That said, as Auslander has argued, ‘this does not mean that any musician’s persona must adhere to these norms, only that they are inevitably a point of reference’.Footnote ²⁵

Rock and pop traditions, for example, hold very different histories of how they deal with lip-synching. Historically, rock music has been frequently associated with the idea of ‘live music’. On the other hand, pop history has notoriously been filled with lip-synching scandals. On the BBC’s Top of the Pops, bands were famously forced to use playback of their own instrumental and, sometimes, vocal tracks. Some rock performances, such as Nirvana’s in 1991 and Oasis’s in 1995, engaged with their own sounds in an exaggerated way that intentionally exposed the playback technology. These performances helped project rock artists’ subjectivities through a grammar of mockery while engaging with a notion of ‘authenticity’ widespread among rock artists and audiences. Curiously enough, both bands seemed to seriously engage with lip-synching in some of their corporate music videos, which reaffirms the crucial place such videos have as ‘serious’, hardcore performances. Moreover, when discussing concealed lip-synching performances in pop music, Merrie Snell poignantly notes that the terms used to criticize lip-synched pop performances often correlate with ‘predominant “rockist” ideologies’.Footnote ²⁶ This leads her to investigate a gendered perspective in which young feminine singers (or artists whose audience is predominantly feminine or gay) tend to be the target of the strongest criticism when lip-synching becomes an issue.

Of course, lip-synching in music videos is generally the norm; most music videos use playback technology and assemble lip-sync performances. Even so, music genres traditionally linked to dance, like pop music, tend to employ lip-synching very differently from genres more strongly embedded in ‘rockist’ notions of authenticity. While evident lip-synching is rather common in female pop music videos, traditional male rock bands and rappers often stage the gestural efforts of live performance (or even use live performance recordings as music videos). Evidently, this is a tendency, not a rule. Either way, to grasp lip-synching’s expressive potential, we must engage more extensively in discussions on the voice–gesture relationship.

The Voice–Gesture Relationship

As discussed, most of the time, lip-synching opens spaces where visible gestures and voices can either come together or drift apart from one another. Freya JarmanFootnote ²⁷ has discussed the issue in a paper on the intersections between camp, queerness, and voice. Jarman addresses diegetic lip-sync scenes in film and television, in which the characters are openly mouthing another singer’s playback. For Jarman, open lip-synching appears as something that negotiates a gap between the body lip-synching and the recorded voice. This gap might appear in many ways. Jarman mentions, for example, a gap between the sex of the person singing and the sex of the actor mouthing the voice (when these characteristics are believed to be evident). Moreover, there is also a gap between the gendered performance of the character we see and the gendered connotation of the music. Such a gap is also thought of in terms of race, weight, perceivable health, and age, among others, since gender is also a ‘function of these attributes just as it is a function of biological sex’.Footnote ²⁸ Nevertheless, the gap might appear in other less obvious ways, such as through the temporal distance between a recorded sound and the lip-sync performance. In an even less evident turn in Jarman’s theory, there might also be a gap between the song’s and the scene’s sonic spaces. Each lip-sync can, in this scheme, project particular and uncanny gaps.

Of course, Jarman’s theory revolves around the open lip-sync performance, which is at first distinct from music videos where a singer is often seen lip-synching their own voice. Even so, in music videos’ lip-syncs, singers are also often ‘openly’ lip-synching. The voice, even if it is their own, has also been recorded at another time and in some other space (generally that of a studio), and then processed digitally, edited, and mixed, so that the lip-synching is always a negotiation with a gap. Lip-synching is, then, a way to deal with a voice that, once their own, has now become alien. When watching music videos, we generally know singers are not singing live, even if sometimes we might believe in the audiovisual fiction for one ephemeral moment. Because of this condition of most audiovisual music, Jarman states that ‘the distinction between “sync” and “nonsync” sounds is something of a fiction, but open lip-sync scenes expose that fiction and play on it’.Footnote ²⁹ I would add that music videos acknowledge that fiction most of the time, by allowing singers to mouth their own voices and dislodge themselves from the struggles of singing, while also trying to create performances that are related to the audibility of their voices.

When referring to lip-sync scenes in cinema, Snell mentions the term ‘absent performance’, first posited by Estella Tincknell, to address the songs and voices assembled in such scenes.Footnote ³⁰ In her scheme, these recorded voices refer to absent performances in which the singer or vocal performer is not present. This absence opens voices to other connotations as they travel into other people’s mouths through new contexts. This way, when voices are re-embodied through lip-synching, the mouthing performance claims the voice and opens it to new meanings by providing visual performances to the voice. When someone is lip-synching their own pre-recorded voice, the artist is present. However, the performance opens a space where the artist can claim the recorded voice as theirs while also attributing to it a new expressive gestural performance. In official or corporate music videos, it is often difficult to attribute a single original performance to the voice we hear, as the recording is usually a composite of different vocal tracks. In these cases, it is possible for the audiovisual product to take the place of an ‘original’ performance, filling the gap of its absence. It is difficult, for example, to listen to Beyoncé’s album version of ‘Single Ladies’ without being quickly reminded of her music video performance, where she danced and lip-synched her song.

Moreover, it is important to emphasize the relation between lip-synching and audibility. It is precisely when this link is properly made that, as Snell posits, lip-synching can act as an expression of oneself. When analysing open lip-sync scenes, Jarman is very attentive to the gestural effort around lip-synching, mentioning, for example, how Will Smith — in one episode of The Fresh Prince of Bel-Air — deals with Jennifer Holliday’s excessive vocals in ‘And I Am Telling You I’m Not Going’. In one of the most dramatic vocal moments in the song, when Holliday’s voice gets more over the top, ‘[Smith’s] face becomes distorted with emotional pain, and agape for the vowel, while his head shakes vigorously to capture the vibrato in the voice’.Footnote ³¹ In this sense, lip-synching also appears as an inventive listening performance that projects one’s gestural engagement with a given performance of singing. This illustrates a strange process where the alien voice seems to be internalized — creating a sense of interiority — while confusing the static notions of interior/exterior; Snell has referred to this as ‘receiver externalization’.Footnote ³²

On this matter, most discussions on voice have considered how listening to a voice is also to be provided with a body, and to imagine the movements and gestures this body has gone through in the process of voicing. Brandon Labelle goes so far as to question whether the voice might ‘be thought of more as a tension — a tensed link, a flexed respiration, and equally, a struggle to constitute the body, rather than a disembodied sound’.Footnote ³³ In this scheme, these struggles of voicing to constitute a body (and a subject) appear through a gestural lexicon — as choreographies of mouthing and breathing, as vibrations that cross the body, as tongues that twist in phonation. These vocal gestures seem to be sensed, recognized, fabled, recreated, or bent during most lip-sync acts.

Overall, both Snell’s and Jarman’s work helps establish the embodied and identification-based process that listening can be. They also posit lip-synching performances as creative practices that express personal ways of gesturally engaging with a voice; it is in this way, after all, that lip-sync performances can stand as personal accounts. Moreover, the visual performance we see informs our listening, and vice versa, by shaping or emphasizing some gestural struggles — hence the moment where Troye Sivan’s distorted image mouths a ‘scream’, shaping a digital sound into an emotional cry. This is important when discussing music videos since we have access to the final assemblage of visual gestures and audible voices, but we are often not able to recollect how these assemblages have been made. Has the artist lip-synched to their own voice when recording the music video? Have they sung along? Has the final musical track been only finished after the video was ready? In any of these cases, the lip-sync happens not only in the playback moment in the video studio but by the way image and vocal tracks are brought together, producing gestural performances through audiovisual technologies.

Since lip-synching is a creative gestural practice, the gestural struggles connected to the lip-sync act in music videos are an important part of how artists’ accounts are sensually, bodily, and technologically made. To frame such accounts by investigating lip-synching, we will need to dive into microanalyses that deal with vocal and visible gestures, acknowledging that describing lip-synching is a way of stressing some gestures and emphasizing their agency. The challenge here is to investigate the intricate and complex bodily choreographies that claim voices (most of the time by bending them) while also addressing the implications of understanding music videos in relation to the transmedia narratives into which they are built. Such framing involves writing texts that dive into particular audiovisual performances while also considering the context of connected listening from which such performances stem.

Part II

‘Grown Woman’ and the Autobiographic Enterprise of Beyoncé

‘Ready? Ok! Spin around, all right, ready? Begin!’, sings a child Beyoncé, accompanied by her long-time friend Kelly Rowland; they sing in girly voices, in what seems to be archival footage. With the clap of their youthful hands, the beat of ‘Grown Woman’ starts to soar. A flickering image, reminiscent of the VHS tape technology, transports us from a photograph of a teenage Beyoncé in a fancy white dress, posing among her awards and prizes, to a shot of a grown-up Beyoncé. With chewing gum visibly moving around her mouth, she softly but confidently starts to lip-sync her own voice, which sings vigorously, stressing each syllable. At one point, we can even hear a scream with a guttural flair, but, on screen, Beyoncé casually opens her mouth in an almost out-of-sync way.

Grown Woman (Bonus Video) goes forward by mixing footage of Beyoncé at different times of her life.Footnote ³⁴ Sometimes, non-archival footage comically recreates the original; at other times, it is technically manipulated to take the place of the original footage. The music video is edited to make Beyoncé almost always mouth her own voice, even in the bits taken from her childhood footage. In a strange way, Beyoncé’s grown-up voice is attributed to herself as a child, a teenager, and so on, by assembling images of her rehearsing with her teen girl groups or amusingly singing in her parents’ living room. The editing is built so that, even though we know it is impossible, Beyoncé as a child, as a teenager, and as a grown-up, seems to be singing/mouthing the same song and voice. Notably, such a way of building coherence contradicts the notion that personae are contingent — or, to put it differently, the music video tries to overcome subjects’ inescapable instabilities while assuring us that Beyoncé has always been somehow the same.

Officially featured as a bonus video in her self-titled album Beyoncé in 2013, Grown Woman (Bonus Video) helps emphasize what the album seemed to be all about. Beyoncé follows the singer’s previous album 4, the first one she recorded after her father, Mathew Knowles, had stopped managing her career. 4 was also the album where Beyoncé mostly projected herself as a producer for the first time.Footnote ³⁵ Beyoncé’s self-titled visual album follows the same path while addressing even more her intimate life.Footnote ³⁶ The album assembles videos that address feminism directly (in Flawless), her sexuality and marriage with Jay-Z (XO, Partition, and Drunk in Love), her motherhood (Blue) and her links with Blackness and Southern heritage (No Angel), settling her tendency to use her music videos as a way of establishing her corporate work as the main source of narratives about her own life — as discussed by Suzana Mateus.Footnote ³⁷

The video for ‘Grown Woman’ seems to perform specifically this sense of autonomy. This is done by assembling images that create some coherence between Beyoncé’s youthful training, her child diva gestures, and her established prosperous place as a diva in the present day. In this process, Beyoncé sometimes appears casually mouthing (as when she is chewing gum and mouthing her voice simultaneously). At other times, as when she appears as a little child, her voice is attributed to the archival footage in moments when her younger self seems immersed in the effort of singing. Curiously enough, the visual gestures mostly seem to match the vocals in her younger mediatic apparitions. In such situations, her younger self is seen performing affirmative diva gestures, with her mouth rhythmically emphasizing the singing. Even though there is a playful and uncanny effect in how an adult’s voice can be attributed to a child’s gestures, it also builds strange synchrony due to how her younger self is immersed in the efforts of voicing.

This kind of editing helps build what Thiago Soares has discussed when referring to image strategies that emphasize certain parts of popular songs, addressing the audience in invitational ways. These ‘hooks’ often have to do with lip-synching performances. Soares mentions, for example, the way the tears roll down Sinéad O’Connor’s face while she mouths her voice in Nothing Compares 2 U as one crucial hook that performs intimacy.Footnote ³⁸ Another example would be the way top models lip-sync George Michael’s vocals in Freedom! ’90, sometimes dramatically performing the song but other times melancholically jamming to it in resemblance to our own listening practices.Footnote ³⁹ In Grown Woman (Bonus Video), each time Beyoncé’s younger self mouths a phrase, her singing seems to be highlighted by the uncanny match. It is not by chance that, the first time we hear Beyoncé’s voice claiming to be a ‘Grown Woman’ that ‘can do whatever I want’, we are actually seeing a mix of her younger self singing through stereotypical diva gestures in addition to the footage of her teenage self consistently rehearsing and snapping her fingers. These R&B gestural performances match Beyoncé’s vocal gestures, which assemble styles associated with feminine Black vocal traditions.

In this sense, the way Beyoncé openly uses playback technology reminds us of the playfulness of her music video. At the same time, lip-synching is employed in a way that makes her archives reinforce her present self. Since she provides fans with personal footage through lip-synching, she can use such footage to highlight the supposed coherence of her diva persona — her gestural attitudes, we are led to believe, have not changed throughout the years. Beyoncé’s Grown Woman, then, seems to tell us a story of being grown and of growing, practicing, and ‘grinding’ from the childhood days, through the effortful teenage years, until the established diva success. It appears as a way of addressing the constant narrative that projects Beyoncé as someone who is not only talented but who also worked hard on her way up to success while still cherishing her values (sustaining, for example, her friendship with Kelly Rowland). In this context, her voice — passing through her different selves and personal archive footage — helps manufacture this notion of coherence that seems vital in her autobiographical take.

As previously discussed, lip-synching in music videos is not created solely through artists’ gestures but also due to the ways audiovisual media can create a body by employing playback technology in specific ways. Since Beyoncé creates her lip-sync through editing, cutting, mixing, and employing digital effects, her video reminds us that her body is mediatic. Of course, the video is part of Beyoncé’s first visual album, a format that helped establish her as an author. As Ciara Barrett has argued, after all, Beyoncé’s and other Black singers’ visual albums, such as FKA Twigs, are ‘self-consciously invoking a filmic mode of representation and spectatorship across formally experimental audio-images to exert fuller control over their audio-self-images’ narrativization and signification’.Footnote ⁴⁰ In this sense, being part of a full visual album, such evident mediatic gestures (that can reframe her own biography) seem to reinforce her authorship over her music video image.

There is, in this sense, always an effort to create coherence that can be perceived in such music videos — and that is emphasized in Beyoncé’s Grown Woman (Bonus Video) by the perceivable gap that reminds us that her account is a creation of present days. In other words, if the lip-synching is an integral part of her account, it is also one of the aspects that remind us of its contingency. Either way, Grown Woman (Bonus Video) helps to emphasize the autobiographical enterprise music videos can be — and, by relying heavily on lip-synching, it is a primarily gestural one. In a way, Beyoncé seems to project how her consistent training in R&B and her citational Black diva gestures are embedded into her vocal and body formation. At the same time, we are constantly reminded that an account is a gesture of the present and that its coherence can only be fully manufactured when we are looking back and productively working on the past.

Clarice Falcão’s ‘Survivor’, Personal Feelings and Political Engagement

Clarice Falcão’s Survivor begins with the singer’s face popping up on the screen as she opens her blue eyes, staring at us directly through the lenses.Footnote ⁴¹ She is framed in a front close-up that cuts from the tip of her chin to the top of her forehead. As her voice starts to sing the first verse of her cover of ‘Survivor’, by Destiny’s Child, she starts to mouth along in perfect sync, moving all her face through micro-movements. Her sweet, almost spoken voice soars as intimately as the framing of her face. We can hear her lips folding and catching her breath between one verse and another — her visible gestures follow each of these oral sounds.

This kind of framing is quite familiar to music videos. Carol Vernallis has addressed how these close-ups have often been used to emphasize some musical hooks and lyrics.Footnote ⁴² Ciara Barrett and Thiago Soares both note that this is a composition characteristically used to portray solo female artists — as in Beyoncé’s Ghost (from Beyoncé), Sinéad O’Connor’s Nothing Compares 2 U, and Miley Cyrus’s Wrecking Ball.Footnote ⁴³ Clarice’s close-up lets us overly engage with her face, watching her as the lips fold and change shapes, the tongue quickly moves, forming phonemes, and the eyebrows wrinkle with the struggles of voicing. Her voice is also very close, reinforcing the sense of intimacy often created in such audiovisual products.

Authors concerned with the recorded voice in films have discussed how the sense of intimacy projected by the employment of close miking produces voices through gendered connotations. Liz Greene acknowledges explicitly that there is a lack of reverberation in how women’s voices are often recorded in the pop music industry allowing ‘listeners to feel unrealistically close to the singer’.Footnote ⁴⁴ In her writing, this leads to a context that creates feminine bodies by employing an idea of sensual availability in sexist ways. Moreover, Jacob Smith stresses that the microphone has enabled vocal styles that cherish timbre variation as a way of performing emotional modulation.Footnote ⁴⁵ Crying, rasping, breathing out loud, and whispering, for example, are vocal modes that are often employed in vocal tradition’s expressive repertoire. Each of these modes acquires gendered and racialized values in different social contexts.

Falcão’s voice follows the gendered intimacy observed by Greene while carrying the range of vocal expressivity posited by Smith. Her voice is breathy as she constantly gasps for air; some notes are raspy and others very clear, projecting a sense of dramatic vulnerability in which her singing sounds emotional. The combination of her intimate framing and her intimate voice at first seems to be in tune with the sexist regime of women’s representation in audiovisual media. On the other hand, her on-point, closely framed lip-synching helps to make her vocal variations noticeable by visually stressing the rich expressiveness of her intimate singing. In this sense, Clarice employs a trope of lip-synching in music videos in which it is common for artists to rely on the expressiveness of live singing to produce dramatic performances. Moreover, Survivor helps us realize how tropes of intimacy (the close-up, the close vocals, the on-point lip-synching) can project Clarice’s vital relationship with her musical persona while also putting herself at the centre of her own stage.

The way Clarice’s lip-synching helps to situate her as a protagonist becomes more poignant when a cut takes us away from her face. In a frame like Clarice’s but significantly wider, a Black woman appears with amusedly reproving looks, not mouthing the words. At the same time, Clarice’s voice keeps singing about how ‘you thought I’d be weak without you, but I’m stronger’, transposing the song from its R&B connotation to a jazzier one. Clarice’s face pops up again in the next cut, making Beyoncé’s lyrics of surviving a bad relationship her own. Other cuts take us to meet various non-famous women in broader close-ups — there is an older, crying, white woman and a middle-aged one, her face contorted with laughing. In total, while we keep watching Clarice mouthing her own voice, more than sixty women of different ethnicities and ages also appear on camera, performing to Clarice’s voice, but rarely singing along. They are portrayed through a wider close-up framing, always alone and directly addressing us through the screen. Eventually, they all interact with a tube of red lipstick in varied and personal ways — painting their faces, writing mottos on their arms, throwing the lipstick away, and even eating it. Clarice’s voice gets a little louder when the song gets more dramatic — with the chorus repeating itself on the mantra ‘I am a survivor’ — stressing the open vowels of her carioca accent,Footnote ⁴⁶ and her face contorts while following her growls and vocal drives. As her voice distorts her once intimate sounds, pursuing more space to reverberate, her lip-synching becomes more desperate, breaking up with the sensual tradition of women’s vocal representation. Clarice mouths widely, staring at us directly while roughly painting her face with red lipstick.

At a primary level, it is not difficult to relate Clarice’s performance of the song to her personal life. Being famous as a comedian and as a singer with an ironic sense of humour, Clarice had broken up with her long-term partner Gregorio Duvivier — who worked with her on several projects — about one year earlier. Their break-up was heavily covered in the Brazilian media, and it was followed by the news that Clarice was leaving Porta dos Fundos, a popular comedic YouTube channel that Gregorio had helped to create. Survivor seemed to address a new moment in Clarice’s career, underlining her work’s newly severe and dramatic tone compared to her previous comedic songs. On a second level, it also seemed to posit how Clarice had engaged with anglophone pop culture through some sort of feminist formation. After all, ‘Survivor’ is a famous pop song initially sung by Destiny’s Child; Clarice emphasized the song’s dramatic potential by making it her own. In an interview with Trip magazine on YouTube, she directly addressed how Destiny’s Child songs were the kind of music to which she grew up listening.Footnote ⁴⁷ She stated that only more recently had she come to understand how feminist they were — Beyoncé had released her self-titled album three years before, defining herself publicly as a feminist through videos like Grown Woman. In the same interview, while discussing feminism, Clarice is also requested to address her relationship with Gregorio Duvivier — in one mediatic movement that commentaries on YouTube widely refer to as sexist.

Survivor seems to emphasize how Clarice’s narrative — of being a woman who, like many others, needed to struggle with gender regulations — was, in fact, a collective one. At the end of the video, a message lets us know that all the profit from selling her version of ‘Survivor’ on iTunes would be donated to the feminist foundation Think Olga. Noticeably, the video uses stereotypical features of how women have been policed (the red lipstick, the intimate framing, and vocals) and distorts them through deviant uses. Scholarly works have also cited the video as a critical marker of Brazil’s digital feminism uprising. Both Josemira Silva Reis’s and Graciela Natansohn’s work and Debora Martini’s monograph mention Clarice Falcão’s Survivor when presenting what they dub a feminist uprising in Brazil in consideration of digital content, commenting on Survivor’s viral impact.Footnote ⁴⁸ In 2015, after all, Brazilian women were increasingly using social network sites to conduct discussions on abortion, domestic violence, and sexual abuse. Clarice herself has mentioned how the idea for her feminist music video came from her contact with the Think Olga Facebook page — which shows she was deliberately taking part in a network of mediatized activism.Footnote ⁴⁹ Both at the time of its release and in the following years, commentaries on YouTube reinforce the video’s feminist potential, with women sharing their personal stories of abusive relationships in an ongoing discussion.

On the other hand, Clarice faced considerable backlash, especially from Black feminist activists who claimed that Clarice’s whiteness was crucial to her feminist turn. Geledés, an influential website dedicated to discussions on Black feminism, for example, reposted a text by activist Gabriela Moura that referred to Survivor as a ‘cute video where a rich girl gains visibility and money over the work of Black women, using the image of her pain to attest she is a feminist’.Footnote ⁵⁰ Clarice herself has addressed the issue on her personal social media pages, reposting texts that criticized her work. She acknowledged that her experience as a white woman might have limited her understanding of feminism while also arguing in favour of the potentials of her music video.

When we watch the video, Clarice’s constant presence, with her pale face mouthing and claiming her singing of Destiny’s Child’s original R&B song, seems to matter in those discussions. After all, Clarice was often criticized for taking the central part of her feminist supplication, despite the music video presenting an assemblage of diverse women. Since her lip-sync performance highlights her ownership over her own recorded voice, emphasizing her embodied struggles against sexism, she seems to be projected as the main character of her video. The singer’s convincing lip-synching dramatically reinforces the intimacy and the gestures of her voice, reassuring us that she is the one singing, gasping for air, almost missing some musical notes — surviving. It is Clarice’s voice and performance we are listening to, and it is her take on Beyoncé’s song — undressing it from its R&B, Black diva vibe — that animates all the other women’s performances. Clarice’s lip-synching, which claims all the small movements of her voice, cannot help reassuring us of her own struggles. Lip-synching, in this case, is not only a way of claiming her singing and emphasizing vocal expressivity but also reinforces her authorial voice.

Janelle Monáe and the Bodily Spoils of a ‘Cold War’

At the beginning of Cold War,Footnote ⁵¹ Janelle Monáe appears slightly out of focus, mispositioned in her close-up shot, talking mutely to someone outside the framing. A card takes over the screen: ‘JANELLE MONÁE – COLD WAR – Take 1’. As soon as the first chords of ‘Cold War’ start playing, Monáe’s face pops up on the screen, now perfectly positioned. The video employs the same close-up shot female singers like Beyoncé and Clarice Falcão have used in their music videos. Monáe, however, immediately squeezes her eyes and turns her face right, unlike most of them. She seems to be looking at us while we look at her, taking control over her intimate framing and ‘letting us know that she has vision too’.Footnote ⁵² While she slowly turns her head back to the frontal framing — obeying the ‘rules’ of the music video frontal close-up — she moisturizes her lips as if preparing to sing. She then mouths vigorously to the sound of her own voice: ‘so you think I’m alone? But being alone is the only way to be’.

Her lip-synching is particularly on point, and, as in Clarice Falcão’s video, the intimate close-up lets us thoroughly analyse it: we can see her mouth moving while her voice quickly sings each syllable, her eyebrows lifting when she stretches into a brief high note midway through a sentence, and her lips trembling when her voice performs fast R&B melisma. In general, she engages with the tradition that pursues lip-synching to portray live singing. However, Monáe shows much more motility than artists typically present in this frontal close-up framing. This way, through a familiar-framing-turned-new, Monáe tells us, sometimes fiercely, other times as a soft supplication, that ‘This is a cold war’, and repeatedly asks, ‘do you know what you’re fighting for?’.

As Shana Redmond has discussed, by choosing the Cold War as an analogy to discuss the surveillance and regulations of contemporary marginalized groups, Monáe seems to address the spoils of imperialism and its everyday impact on marginalized peoples.Footnote ⁵³ In this sense, Monáe has often refused to submit her apparitions to the bodily regulations that have conditioned Black women into a grammar of sensuality while still making her body central to her work. As referred to by Aleksandra Szaniawska, her gestural refusals constantly appear through an exacerbation of her queerness and her enactments of gestures of flight (as in Tightrope and Many Moons).Footnote ⁵⁴ Her public apparitions widely reinforce the acknowledgment of the regulations she must articulate herself with: Monáe has been vocal on her profile on social media platforms and in interviews with big media conglomerates about her adherence to progressive social movements. This is extensively connected to how her albums have created a dystopic world that interconnects all her musical work through the character Cindi Mayweather, an alter-ego she has often embodied.

In Cold War, even though Monáe is undressed from the tuxedo uniform and androgynous characterization that marked her first appearances, her lip-synching still follows the gestural repertoire she has built for herself as Cindi Mayweather. As Szaniawska has written, ‘her characteristic way of widening her eyes and rolling her eyeballs, her use of syncopated body isolations, and angular, robotic-like dance steps toy with the figure’s dual nature, balancing on the verge of organicity and artificiality.’Footnote ⁵⁵

Things start to change mid-way through the music video, however. After softly mouthing the last line of the song’s bridge, with her voice still powerfully soaring, she seems to miss the first line of the next verse. Recomposing herself, Monáe strongly mouths the line ‘I was made to believe there’s something wrong with me’. While her voice keeps singing, she looks at someone outside the frame and tries to force a cut in the take. She tries to look back at the camera, but as soon as she begins lip-synching again, she shakes her head and cries vigorously. She looks down, shakes her body as if trying to stop sobbing, and keeps trying to go back to her confident performance — her voice keeps going strong. The sync fiction becomes suddenly exposed. Unlike Beyoncé’s Grown Woman, in which the editing playfully exposes the playback technology, assuming the fictional aspect of any music video, Monáe exposes the sync fiction by emotionally disrupting her one-take ‘singing’ performance.

The moment when a lip-sync performance collapses is critical to studies that have discussed the relation between gestures and voices in audiovisual media. Authors such as Jarman, Snell, and Greene have all addressed Mulholland Drive as a way of approaching the matter,Footnote ⁵⁶ focusing especially on the scene where singer Rebekah Del Rio convincingly lip-syncs her own voice only to collapse midway through the song.Footnote ⁵⁷ In general, they are all interested in how such a scene — and others that work similarly — violently exposes the fragile coherence of sound and image in video technologies and film. Like Monáe, Del Rio mouths her own voice in a very in-sync way that can lead us to believe she is singing ‘live’. However, a gap is firmly acknowledged when her voice keeps soaring fiercely and the gestures of singing collapse. This leads to different experiences: sometimes, it can remind us of the fiction of synching in audiovisual media; other times, it projects a violent rupture in how we perceive a body — it can even make us temporarily lose our faith in the coherence of audiovisual media.

In Monáe’s Cold War, however, while her collapse exposes the sync fiction of audiovisual media, it also reassures us of her personal relationship with her song and with what seems to be taken as proof of her struggles. A close-framed lip-sync performance in a music video tends to rely on its capacity to make the lip-sync performance engageable and believable, hence Falcão’s accurate lip-sync in Survivor and Sinéad O’Connor’s crying performance in Nothing Compares 2 U. Breaking the familiar lip-synching script by emotionally interrupting the lip-sync act, Monáe stressed her trajectory as a marginalized person by expressing her struggles to claim her voice, going against societal regulations. Such a performative act deeply resonated with some of her queer, non-white fans. Commentaries on YouTube, for example, discuss the reasons for her sobbing, stressing the importance of the lyrics ‘I was made to believe there’s something wrong with me’. This opens spaces where people share their personal stories about queer struggles, race regulations, and women’s rights — ‘only certain demographics understand why this line is so heavy’, user Kano IX comments. In a strange turn, the fragility of her lip-synching forges a pact with her audience in which her relationship with her own voice is emphasized through a performance of vulnerability. To put it differently, the fact that Monáe is lip-synching (and not singing live) lets her engage bodily with her own recorded voice in unpredicted ways.

This is stressed even more by how she returns to lip-synching after her breakdown. With tears running down her face, she resumes mouthing in time to catch up with one moment of vocal excess, in which her voice firmly engages with a portamento that becomes a vocalise. Her head moves while she mouths the voice’s movements, her forehead pulsing with veins and her mouth trembling with the vibrato that fades in by the end of the vocal phrase. We know Monáe is not singing live — this is something we always knew but of which we were just reminded. Even so, this new, more dramatic way of performing seems even more engaging since her crying lip-synching is surprisingly on-point.

In this sense, Monáe’s lip-synching accounts for her personal engagement with her own voice, her lyrics, and her song, showing us that she thoroughly knows every word she has written and every vocal gesture of her singing. Her crying performance, which mixes confidence and vulnerability and stages an emotional outburst, reminds us through the employment of playback technology that lip-synching also has to do with identification, in this case with her own narrative and voice. The collapse of her lip-synching projects the spoils of the traumatic struggles Monáe frequently narrates in her other mediatic performances and, in this sense, reassures us of the embodied lived experience of her more expansive mediatic persona.

Concluding Thoughts

In this essay, I have argued that official music videos often play the part of an account that produces artists’ subjectivities, as discussed by Janotti and Alcântara. Such accounts deal with artists’ wider transmedial personae, encompassing artists’ voluntary presentations of themselves and other discourses surrounding them — as posited by Echard and Hansen in their work. Moreover, digital music videos are part of networks of connected listening, often remediating other media. In this way, digital platforms (and texts and products on them) inform our experiences with such audiovisual products. Lip-synching plays an important part in this panorama since it is one of the crucial devices through which artists’ performances are produced in music videos. As argued by Auslander, by organizing visible and vocal gestures in particular ways through the employment of playback technology, lip-synching supplies artists with ways of expanding their personae. In this context, lip-synching needs to be taken as an expressive performance that articulates listening, gesturing, and visible and vocal gestures, as posited by Snell and Jarman in their discussions. The specific way artists lip-sync (vivaciously, subtly, assuming playback’s vocal gap, dislodging their mouthing from their vocal efforts, or staging live singing, for example) provides some of the terms of which music videos’ accounts are made. In this sense, lip-synching touches on artists’ wider personae, stressing the performative enterprises that negotiate larger aspects of singers’ mediatic lives.

To discuss the gestural efforts of how singers build connections with their own recorded voice, we need to be attentive to their specific gestural enterprises. At the same time, as observed in Beyoncé’s Grown Woman, we must keep in focus that said gestures are constituted through audiovisual operations due to the unavoidable usage of playback technology. To put it simply, lip-synching is not only an action a singer performs on set but refers to how mediatic apparatuses assemble artists’ vocals and gestures. Butler explains how an account of oneself demands some deliberation, being attached to the context of its emergence but also reflecting upon this very context. In this sense, to properly assess music videos through lip-synching, we need to deal with their framing and image traditions, acknowledging how artists’ voices are produced (both regarding vocal techniques and recording technologies). These aspects, after all, are terms that build accounts and relate to specific gender, race, and genre regulations.

I have shown, for example, how both Clarice Falcão and Janelle Monáe employed a framing extensively used to portray female singers, especially in lip-sync performances, while attaching themselves distinctively to such gendered framing. Falcão’s voice was recorded through close miking, and her precise lip-sync performance claims all her vocal gestures. However, the framing starts to project desperation as the singer paints her face red and turns her lip-sync into a dramatic supplication that emphasizes her vocal struggles. Monáe, on the other hand, undermines the intimate framing by fiercely showing us that she has her own vision and agency. By exposing the sync fiction through her crying performance, Monáe also breaks our expectations about on-point lip-synching traditions in music videos and exposes her embodied struggles with hegemonic powers. In sum, Monáe and Falcão engage differently with music video tropes by using similar framing traditions in particular ways. Such framings also come to signify differently due to the distinct racialized connotations each artist’s presence carries.

In addition, each of the videos analysed also shines a light on some traditions of how lip-synching has been employed in music videos. Beyoncé’s Grown Woman, for example, relies on playback technology to playfully expose the sync fiction. In her music video, she can irreverently mouth her own voice while also making her archival footage engage with her present voice. Openly synching her recorded voice into other mouths resonates with the device in which a voice travels through different people, as in George Michael’s Freedom! ’90. Moving forward, Clarice Falcão shines a light on another lip-sync tradition: the one where lip-synching mimics live singing by accurately employing the struggles of voicing. Such precision helps to stress the vocal expressivity of the voice visually and is extensively employed by other artists in their music videos. Janelle Monáe, on the other hand, engages differently with the same tradition Falcão assembles, exposing the playback technology by missing some vocal lines. Not mouthing an entire song is something artists often do, sometimes lip-synching only a few verses, other times mouthing most of them. Overall, each lip-sync performance assembles specific traditions of how singers are portrayed in music videos while also exposing how each of these artists negotiates every tradition in which they are engaged.

When analysing music videos’ lip-sync acts, it is also crucial to be attentive to how gestures appear in a way that organizes choreographies and scripts, composing a narrative thread that often connects music videos with broader mediatic performances. In this sense, analysing music videos by investigating lip-synching helps us understand performances’ dramatic scripts. In Rager Teenager!, Troye Sivan sometimes appears melancholically jamming to his own song, and sometimes he is inquisitively talking with us. His one-take lip-sync performance leads him from sad and lazy gestures to angry and then amusing ones, in a script of turning loneliness into solitude. In Grown Woman, Beyoncé is playful with her voice by first lip-synching it in a laid-back way, then attributing it to her younger selves through their youthful diva expressivity. This helps to produce a narrative in which her successful stardom is cemented through years of effortful grinding. Clarice Falcão’s lip-sync in Survivor first reinforces the gendered intimacy of her voice and then turns it into a desperate supplication, creating a script in which surviving as a woman seems to require more and more strength. Finally, in her Cold War, Monáe collapses into crying midway through her inquisitive performance, going back into it more dramatically than before, and adding more gestural qualities to her performance. Such a breakdown helps give emotional depth to Monáe’s complex, vulnerable but fierce persona that surpasses the music video. To sum up, each music video creates bodily scripts through lip-synching that articulate artists’ broader narratives.

In this sense, my analyses stress the potential of using lip-synching as an entry point to discuss artists’ accounts of themselves in music videos. When studying lip-sync acts, we can dive into particular issues in artists’ personae, not least because lip-synching can be crucial to how we perceive artists’ bodies and biographies. As demonstrated, when lip-synching is framed as multiplying artists’ self-narration, we are constantly invited to move from audio to image through videos and digital networks, building connections between vocal and visible gestures, biographies, and specific performances. Lip-synching is posited, then, as a framing that helps us address the audiovisual and performative struggle of music videos, interconnecting artists’ bodily productions, vocals, and mediatic gestures with their broader, never finished narrative threads.

Footnotes

I would like to deeply thank Amy Skjerseth, whose comments have greatly improved my original manuscript. I also thank Suzana Mateus for the assistance with the discussions on Beyoncé. This research was partly financed by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.

References

¹ Troye Sivan, Rager Teenager!, dir. by Troye Sivan, 5 August 2020 <https://www.youtube.com/watch?v=H1pGPgQPAG8> [accessed 26 June 2024].

² Troye Sivan, Easy, dir. by Troye Sivan, 16 July 2020 <https://www.youtube.com/watch?v=bCHqIhiFmEY> [accessed 26 June 2024].

³ Thiago Soares, A Estética do Videoclipe (Editora da UFPB, 2013), p. 18; my translation.

⁴ While Janotti and Alcântara draw from Butler’s theory in general, they seem specifically inspired by their book on giving an account of oneself: Judith Butler, Giving an Account of Oneself (Fordham University Press, 2005).

⁵ Jeder Janotti Júnior and João André Alcântara, O Videoclipe na era pós-televisiva: questões de gênero e categorias musicais nas obras de Daniel Peixoto e Johnny Hooker (Appris, 2018).

⁶ Mathias Bonde Korsgaard, Music Video After-MTV: Audiovisual Studies, New Media and Popular Music (Routledge, 2017).

⁷ Soares, A Estética do Videoclipe, p. 52.

⁸ Korsgaard, Music Video After-MTV, p. 17.

⁹ See Carol Vernallis, ‘Accelerated Aesthetics: A New Lexicon of Time, Space and Rhythm’, in The Oxford Handbook of Sound and Image in Digital Media, ed. by Carol Vernallis, Amy Herzog, and John Richardson (Oxford University Press, 2013), pp. 707–31, and Carol Vernallis, ‘Music Video and YouTube: New Aesthetics and Generic Transformations’, in Rewind, Play, Fast Forward: The Past, Present and Future of the Music Video, ed. by Henry Keazor and Thorsten Wübbena (Transcript Verlag, 2010), pp. 233–60.

¹⁰ Korsgaard, Music Video After-MTV, p. 173.

¹¹ Simone Pereira de Sá, ‘Somos todos fãs e haters? Cultura pop, afetos e performances de gosto em sites de redes sociais’, Revista Eco-Pós, 19 (2016), pp. 50–67 (p. 61), doi:10.29146/eco-pos.v19i3.5421; my translation.

¹² Philip Auslander, ‘Framing Personae in Music Videos’, in The Bloomsbury Handbook of Popular Music Video Analysis, ed. by Lori A. Burns (Bloomsbury Academic, 2019), pp. 91–111 (p. 91).

¹³ Ibid., p. 100.

¹⁴ Lady Gaga, 911, dir. by Tarsem Singh, 18 September 2020 <https://www.youtube.com/watch?v=58hoktsqk_Q> [accessed 26 June 2024]; Lizzo, Truth Hurts, dir. by Quinn Wilson, 25 September 2017 <https://www.youtube.com/watch?v=P00HMxdsVZI> [accessed 26 June 2024].

¹⁵ BTS, Boy With Luv feat. Halsey, dir. by Yong-seok Choi, 12 April 2019 <https://www.youtube.com/watch?v=XsX3ATc3FbA> [accessed 26 June 2024]; Blackpink, Kill This Love, dir. by Seo Hyun-Seung, 4 April 2019 <https://www.youtube.com/watch?v=2S24-y0Ij3Y> [accessed 26 June 2024].

¹⁶ Guma, Jugular, dir. by Felipe André Silva, 13 May 2021 <https://www.youtube.com/watch?v=OgxNGUQP2pQ> [accessed 26 June 2024]; Stabat Mater Vivaldi, dir. by Sebastian Pańczyk (Poland, 2021).

¹⁷ Beyoncé, Single Ladies (Put a Ring on It), dir. by Jake Nava, 3 October 2009 <https://www.youtube.com/watch?v=4m1EFMoRFvY> [accessed 26 June 2024].

¹⁸ William Echard, ‘Someone and Someone: Dialogic Intertextuality and Neil Young’, in The Pop Palimpsest: Intertextuality in Recorded Popular Music, ed. by Lori A. Burns and Serge Lacasse (University of Michigan Press, 2018), pp. 169–89.

¹⁹ Kai Arne Hansen, ‘(Re)Reading Pop Personae: A Transmedial Approach to Studying the Multiple Construction of Artist Identities’, Twentieth-Century Music, 16.3 (2019), pp. 501–29 (p. 509), doi:10.1017/S1478572219000276.

²⁰ Janotti and Alcântara, O Videoclipe na era pós-televisiva, p. v; my translation.

²¹ Butler, Giving an Account of Oneself, p. 8.

²² Ibid., p. 39.

²³ Janotti and Alcântara, O Videoclipe na era pós-televisiva, p. 30; my translation.

²⁴ Echard, ‘Someone and Someone’, p. 175.

²⁵ Auslander, ‘Framing Personae in Music Videos’, p. 106.

²⁶ Merrie Snell, Lipsynching (Bloomsbury Academic, 2020), p. 50.

²⁷ Freya Jarman, ‘Watch My Lips: The Limits of Camp in Lip-Syncing Scenes’, in Music and Camp, ed. by Christopher Moore and Philip Purvis (Wesleyan University Press, 2018), pp. 95–117.

²⁸ Ibid., p. 102.

²⁹ Jarman, ‘Watch My Lips’, p. 103.

³⁰ Snell, Lipsynching.

³¹ Jarman, ‘Watch My Lips’, p. 105.

³² Snell, Lipsynching, p. 72.

³³ Brandon Labelle, Lexicon of the Mouth: Poetics and Politics of the Voice and the Oral Imaginary (Bloomsbury, 2014), p. 5.

³⁴ Beyoncé, Grown Woman (Bonus Video), dir. by Jake Nava, 24 November 2014 <https://www.youtube.com/watch?v=y3MjxWn5W9M> [accessed 26 June 2024].

³⁵ Beyoncé, 4 (Parkwood Entertainment and Columbia Records, 2011).

³⁶ Beyoncé, Beyoncé (Parkwood Entertainment and Columbia Records, 2013).

³⁷ Suzana de Sousa Mateus, ‘Narrativas do feminino nas performances de Beyoncé’ (unpublished master’s dissertation, Universidade Federal de Pernambuco, 2018).

³⁸ Sinéad O’Connor, Nothing Compares 2 U, dir. by John Maybury, 10 July 1990 <https://www.youtube.com/watch?v=0-EF60neguk> [accessed 26 June 2024].

³⁹ George Michael, Freedom! ’90, dir. by David Fincher, 3 October 1990 <https://www.youtube.com/watch?v=diYAc7gB-0A> [accessed 26 June 2024].

⁴⁰ Ciara Barrett, ‘“Formation” of the Female Author in the Hip Hop Visual Album: Beyoncé and FKA twigs’, The Soundtrack, 9.1–2 (2016), pp. 41–57 (p. 55), doi:10.1386/ts.9.1-2.41_1.

⁴¹ Clarice Falcão, Survivor, dir. by Clarice Falcão and Celio Porto, 13 November 2015 <https://www.youtube.com/watch?v=NlxFf40Lqx4> [accessed 26 June 2024].

⁴² Carol Vernallis, Experiencing Music Video: Aesthetics and Cultural Context (Columbia University Press, 2004).

⁴³ Barrett, ‘“Formation” of the Female Author’; Soares, A Estética do Videoclipe.

⁴⁴ Liz Greene, ‘Speaking, Singing, Screaming: Controlling the Female Voice in American Cinema’, The Soundtrack, 2.1 (2009), pp. 63–76 (p. 64), doi:10.1386/st.2.1.63_1.

⁴⁵ Jacob Smith, Vocal Tracks: Performance and Sound Media (University of California Press, 2008).

⁴⁶ Carioca refers to a citizen of the City of Rio de Janeiro, in Brazil. The carioca accent is famous for its open vowels, with speakers sometimes pronouncing some single vowels as diphthongs.

⁴⁷ Clarice Falcão, ‘Clarice Falcão fala sobre saída do Porta, Gregório e feminismo, Trip TV’, Trip TV, 4 April 2016 <https://www.youtube.com/watch?v=oB-3teo9Efw&t=219s> [accessed 26 June 2024].

⁴⁸ Graciela Natansohn and Josemira Silva Reis, ‘Com quantas hashtags se constrói um movimento? O que nos diz a “Primavera Feminista brasileira”’, Tríade, 5 (2017), pp. 113–30, doi:10.22484/2318-5694.2017v5n10p113-130; Deborah Martini, ‘Brazilian Feminism on the Rise: A Case Study on Brazilian Feminist Cyberactivism’ (unpublished master’s thesis, Linköping University, 2016).

⁴⁹ Falcão, ‘Clarice Falcão fala’.

⁵⁰ Gabriela Moura, ‘Jout Jout, Clarice e o Feminismo Branco’, Geledes (2015) <https://www.geledes.org.br/jout-jout-clarice-e-o-feminismo-branco/> [accessed 12 July 2022]; my translation.

⁵¹ Janelle Monáe, Cold War, dir. by Wendy Morgan, 5 August 2020 <https://www.youtube.com/watch?v=lqmORiHNtN4> [accessed 26 June 2024].

⁵² Shana Redmond, ‘This Safer Space: Janelle Monáe’s “Cold War”’, Journal of Popular Music Studies, 23 (2011), pp. 393–411 (p. 397), doi:10.1111/j.1533-1598.2011.01303.x.

⁵³ Redmond, ‘This Safer Space’.

⁵⁴ Aleksandra Szaniawska, ‘Gestural Refusals, Embodied Flights: Janelle Monáe’s Vision of Black Queer Futurity’, The Black Scholar, 49.4 (2019), pp. 35–50, doi:10.1080/00064246.2019.1655371.

⁵⁵ Ibid., p. 40.

⁵⁶ Greene, ‘Speaking, Singing, Screaming’.

⁵⁷ Mulholland Drive, dir. by David Lynch (USA, 2001).
