Living Language. Laura M. Ahearn
languages
Whistles
Song
Illustrations and images
Writing
Often, particular uses of language will involve more than one of these modes. Instances involving what linguistic anthropologists call multimodal discourse are more common than you might imagine; any time a person’s speech is accompanied by gestures or body movements, for example, the interaction is multimodal. Before focusing on some of these nonverbal modes of meaning-making, however, it will be helpful to present the work of two scholars, Erving Goffman and Mikhail Bakhtin, both of whom worked mainly on verbal interactions but whose theories are still quite applicable to understanding the multimodality inherent in human communication.
Bakhtin’s Double-Voiced Discourse
Mikhail Bakhtin was a Russian literary critic whose wisdom about the “socially charged life of language” formed the epigraph of the first chapter of this book. Bakhtin is well known for his concept of heteroglossia, which refers to the multiplicity of socially tinged ways of speaking in any given society – some of high status, some low. We will return to this concept in a later chapter. For our purposes here, Bakhtin suggested another helpful term: double-voiced discourse. Such discourse involves the embedding of others’ voices into one’s own voice, either through direct or indirect quotation, or more subtly through mimicry or tone. Because he was a literary critic and not a social scientist, Bakhtin analyzed this phenomenon in the context of the novel, but it is easy to see its relevance and utility for everyday linguistic interactions. About double-voiced discourse, Bakhtin wrote,
there are two voices, two meanings and two expressions … Double-voiced discourse is always internally dialogized. Examples of this would be comic, ironic or parodic discourse, the refracting discourse of a narrator, refracting discourse in the language of a character and finally the discourse of a whole incorporated genre – all these discourses are double-voiced and internally dialogized. A potential dialogue is embedded in them, one as yet unfolded, a concentrated dialogue of two voices, two world views, two languages.
(Bakhtin 1981a:324–325)
Returning to the “#Hashtag” skit for a moment, it is clear that the humor in it depends not only on the use (or overuse) of the word and gesture “hashtag” but also on the participants’ allusions to, or direct quoting of, snippets from songs, slogans, and other popular culture sources. These are all examples of Bakhtin’s double-voiced discourse and are extremely common in everyday speech. Whenever this sort of discourse takes place, echoes, associations, and even moral connotations stemming from the source of the quotation are taken up by the speaker and then presented, but usually in a refracted way. The “internally dialogized” aspect that Bakhtin mentions allows the speaker to comment upon the words being borrowed – and yet usually say nothing explicitly. In this way, entire genres can be incorporated or commented upon, much as Jimmy Fallon and Justin Timberlake were presenting in “#Hashtag” a commentary on social media users (or at least on those who overuse hashtags).
Goffman’s Participation Framework and Production Format
Some linguistic anthropologists who analyze conversations draw on the theories of Erving Goffman, a sociologist who rejected many of the most common preconceived notions – or language ideologies – regarding the ways in which conversations allegedly take place between speakers and hearers. Like Dell Hymes before him, Goffman rejected approaches that focused on isolated speakers or even isolated speaker-hearer dyads. Instead, Goffman emphasized the importance of foregrounding participation in general as an analytical concept (Goodwin 2001). Goffman recognized that there are many potential interactional roles people can inhabit, so he suggested applying a sophisticated participation framework and production format even to the simplest of conversations. He argued, for example, that we should distinguish between ratified and unratified hearers. Some hearers are addressees (those to whom the speaker addresses an utterance), but others are bystanders, overhearers, or even eavesdroppers.
Similarly, Goffman realized that the seemingly unified role of speaker in any interaction can also be separated into different roles (Goffman 1981:144):
Animator. The person who serves as the voice box; the person who animates the words being spoken, whether they are the speaker’s own words or not.
Author. The person who composed the words, whether or not this person is the one who voices them.
Principal. The person who stands behind what is said; the person whose opinions are expressed, whether or not this person composed or voiced these opinions.
In an earlier version of this well-known typology, Goffman (1986[1974]) presented a slightly different set of roles that included “emitter,” for the voice box, and “animator” to represent the “expressive actions” accompanying talk, which is interesting to note with reference to our focus on multimodality in this chapter. Other scholars have also suggested multiple alternative roles. The most important insight to glean, however, is Goffman’s initial one that the dominant language ideology concerning conversations – that they involve one unified speaker and one unified hearer – is a model that is, at best, too simple and, at worst, simply incorrect.
Sometimes, all three speaker roles are inhabited by one person, but sometimes they can be distributed across several people or be relatively indeterminate (cf. Irvine 1996). So, to give a hypothetical example, let’s assume that President Obama once delivered a speech that was written by a speech writer who totally disagreed with the President’s policies. Let’s also assume that the speech writer wrote eloquently and convincingly enough to keep his or her job. As President Obama delivered the speech, he would be considered the animator (the voice box) and, presumably, the principal (the person whose opinions are being expressed), but the speech writer would be the author – and not the animator or the principal. Even in ordinary conversations, these roles frequently shift, especially when reported speech is used. Goffman called these instances shifts in footing:
A change in footing implies a change in the alignment we take up to ourselves and the others present as expressed in the way we manage the production or reception of an utterance. A change in our footing is another way of talking about a change in our frame for events. … [P]articipants over the course of their speaking constantly change their footing, these changes being a persistent feature of natural talk.
(1981:128)
Such shifts in footing are important to study closely, as they offer scholars clues about the multifunctionality of even the most mundane of utterances. Changes in footing also often index various social identities, cultural values, attitudes, stances, or relationships. They can be triggered by subtle verbal or nonverbal moves and are tracked by all of us as a normal part of the complex multimodality that constitutes linguistic interactions.
Speech and the Analysis of Conversation
The approach advocated here is an integrative one that considers multiple modes of communication together because that is how we experience meaning-making on an everyday basis. Nevertheless, it is useful to separate out a few key modes for analytical purposes. Speech, as the primary mode of communication in many instances, deserves special attention. Linguistic anthropologists, sociolinguists, and other scholars who study speech often refer to “talk-in-interaction” so as to emphasize the socially situated and jointly constructed nature of the speech they are analyzing.
In the 1960s and 1970s, Emanuel Schegloff, Harvey Sacks, and Gail Jefferson developed the approach known as Conversation Analysis (CA for short), and it has evolved since then into a vibrant field of study. CA practitioners focus on the sequential organization of talk as it unfolds moment by moment. They consider each utterance to be the context for the next utterance, and most CA scholars therefore believe that bringing in data from other methods such as interviews or experiments would