Living Language. Laura M. Ahearn
Because the sequential nature of conversation is a key insight of CA scholars, turn-taking is one of their central concepts. The precise coordination needed for participants in a conversation to switch from speaking to listening and back again is accomplished largely unconsciously. We pick up on subtle pauses, intonation, or other prosodic features in speech such as pitch, volume, or rhythm, and most of the time use that information to switch turns in a conversation right on cue with what came to be called “no gaps, no overlaps” turn-taking (Duranti 1997:245ff; Walker 2013).
In addition to such cues, part of what enables this sort of coordination to occur is a set of conversational components that CA scholars have identified, called adjacency pairs. These are sequences of two utterances spoken by two different speakers. An extremely common adjacency pair in English is, “How are you?” – with the preferred response, “Fine.” A typical greeting exchange in Nepal is a bit different, however. When one meets another person, one frequently asks, “Where are you going?” The response is often a less-than-informative one: “In that direction.” (Both of these adjacency pairs are perfect examples of cases in which Jakobson’s phatic function predominates.)
There are many types of adjacency pairs, and they vary across languages and speech communities. Some examples include all sorts of question/answer exchanges, including offer/acceptance and offer/rejection (“Would you like to come with me?” “Sure!” or, “Do you want me to help you?” “No, thanks.”). Compliment/rejection is a fairly common (and gendered) adjacency pair in this society (“That’s a lovely dress you’re wearing.” “Oh, it’s just a rag I pulled out of the closet.”).
Many other adjacency pairs can be identified, along with their culturally preferred types of responses. Dispreferred responses can be illuminating as well, such as when someone is asked an everyday “How are you?” and answers with a 30-minute litany of complaints. “Conversational trouble,” as well as dysfluencies and repairs, can also be extremely interesting to investigate. By focusing so closely on everyday talk – something that many researchers have either overlooked or looked right through in order to get at the “real” data – CA practitioners have drawn attention to the complex accomplishments involved in even the most mundane conversations.
As valuable as the insights of CA are, however, the assumptions and research questions of most CA practitioners differ from those of most linguistic anthropologists. CA scholars have traditionally insisted upon the autonomy of talk, therefore limiting themselves to analysis of transcripts. Moreover, their research questions have primarily pertained to the organization of talk itself. Linguistic anthropologists, in contrast, have generally utilized CA as one method among many (as described in the next chapter), and their research questions have usually focused on the intersections of linguistic interactions with broader social dynamics and cultural meanings. Alessandro Duranti (1997:266) identifies three main criticisms leveled at researchers who use only CA: (1) they are uninterested in the “larger contexts” of the conversations they analyze, even such basic aspects as the relationship between the people who are talking, or where or when the conversation took place; (2) their transcripts tend to indicate a very narrow view of “speech,” omitting nonverbal interactions, changes in intonation, and other aspects of multimodal discourse to be discussed in this chapter; and (3) they are completely uninterested in what the speakers themselves might say to explain or interpret their own utterances. For these reasons, when linguistic anthropologists use CA (and many do consider it an extremely valuable approach), they combine it with other methods and contextualize the conversations they analyze far more comprehensively than strict CA practitioners do. In fact, conversation analysis, in conjunction with ethnographic methods, can provide valuable insights into many different kinds of linguistic and social practices.
There have actually been some interesting indications of a rapprochement between the two groups of scholars in recent years. Ignasi Clemente (2013) identifies three phases in the relationship between CA and anthropology:
1 a period of sharing during the 1960s and 1970s during which Schegloff, Sacks, and Jefferson and Dell Hymes, John Gumperz, and other anthropologically inclined sociolinguists were all involved in countering the Chomskyan view of linguistic competence that focused solely on syntax;
2 a second period during the 1980s and 1990s when CA came into its own as a discipline and grew apart from linguistic anthropology, which was also developing a separate set of intellectual interests, and this led to pretty fundamental disagreements between the two approaches along the lines of what constituted appropriate context, units of analysis, and the autonomous nature (or lack thereof) of conversation; and
3 a third period beginning in the early 2000s and extending to the present of reinvigorated “interdisciplinary convergence” (Clemente 2013:690) as a result of cross-fertilization and renewed dialogue between scholars in both disciplines.
For scholars who view language as a form of social action, it is easy to see why it makes sense to study actual talk closely.
Gestures and Other Forms of Embodied Communication
But talk rarely comes to us as a disembodied voice, so linguistic anthropologists who study communicative events often analyze gestures, body movements, facial expressions, and interaction with various objects (or “props”) in the material environment alongside speech as an integrated, multimodal event. Meanings cannot begin to be understood – or might even be misunderstood – if such elements are left out of the analysis.
Various typologies have been suggested for the analysis of multimodal discourse. Following Enfield (2005), Stivers and Sidnell (2005) distinguish between “vocal/aural” modalities on the one hand and “visuospatial” modalities on the other, but while some scholars have found these sorts of typologies to be useful, others, such as Haviland (2004) and Streeck et al. (2011:9), have not. These latter researchers prefer instead to focus on the integration and coordination of multiple modalities of communication within the material world rather than separating modalities apart from one another. They argue that the interaction should be understood as a complex, emergent ensemble in which the whole adds up to be more than the sum of the parts. “Insofar as gestural typologies ignore or minimize such semiotic complexity in the different gestural ‘types’ they isolate, the classificatory impulse seems analytically obfuscating rather than helpful” argues Haviland (2004:205).
While it is undoubtedly true that meaning-making involves multiple modalities, as well as the material environment (and, I would add, knowledge of personal histories, cultural norms, social relations, and many other invisible and inaudible aspects of the event at hand), it is still useful to take note of at least one of the gesture categorizations before presenting a few examples of analyses of emergent multimodal discourse.
Perhaps the most common typology of gestures is psychologist David McNeill’s (1992:78–80):
Iconics are gestures that “bear close formal relationship to the semantic content of speech” (1992:78). In other words, they iconically (in a Peircean sense) resemble that which is being described verbally. An example might be if a girl traced in the air the shape of a huge tree that she saw being cut down. A subset of iconics are sometimes called emblems, which are stand-alone gestures that have a conventional meaning within a particular society or speech community. Examples of emblems include the thumbs-up sign, giving someone “the finger,” and placing the cuckold or rabbit-ears sign behind someone else’s head. Emojis can also be considered emblems, given their iconic resemblance to that which they represent, as we will discuss further in Chapter 8. Each of these signs can mean different things in different speech communities around the world.