Second Language Pronunciation. Group of authors
a feature of second language (L2) pronunciation learning. Furthermore, conscious attention and input from a variety of speakers are important to L2 pronunciation development as well.
In this chapter, I describe how typical L1 pronunciation develops, and apply this knowledge to identifying both similarities and fundamental differences in L2 pronunciation learning. I then discuss critical evidence supporting the benefit of perception-oriented training for L2 pronunciation. Finally, I describe practical implications for pronunciation instruction and recommend perceptually focused resources for teaching and research. For the purpose of this chapter, speech perception refers to the cognitive process by which sounds are heard and categorized. This contrasts with speech production, which refers to the output of the cognitive system, mediated by physical control of speech gestures. Pronunciation is an umbrella term capturing both perception and production processes. I primarily focus on the perception of segments, for which there is a far larger body of literature, but some of the same principles may extend to suprasegmentals as well.
L1 pronunciation
The development of L1 speech perception begins in utero, evidence for which is manifested immediately after birth (Zhao & Kuhl, 2018). Researchers have used laboratory techniques to establish that within hours of birth, babies not only show a preference for their mother’s voice (Lee & Kisilevsky, 2014; May et al., 2018) but are also capable of distinguishing between L1 and foreign language sounds (Moon et al., 2013). These abilities are hypothesized to result from fetal experience of the mother’s voice, and the voices of other speakers of the L1, through the abdominal wall.
Despite their proclivity toward speakers heard in utero, newborns maintain the ability to perceive fine-grained phonetic details associated with any speech sounds found in any of the world’s languages for some time after birth. By six months, however, experience with their ambient language (or languages) results in refinement of their perceptual systems, such that they begin paying less attention to any phonetic information not needed to categorize L1 speech sounds (Kuhl, 2009). By the time they are a year old, infants begin losing the ability to discriminate foreign speech sounds, while their ability to recognize ambient language sounds strengthens (Zhao & Kuhl, 2018). This loss of perceptual plasticity corresponds with increasing L1 processing efficiency, which is a necessary precursor to the learning of L1 vocabulary and higher-order language skills (Kuhl, 2009). By the time children reach the age of four, they cannot discriminate sounds in a foreign language any better than an adult can (Werker, 2018).
Extensive research describing L1 speech perception and later language development makes it clear that the foundation of L1 pronunciation is accurate speech perception, which results in automatic categorization of sounds (Werker & Curtin, 2005). Though lagging behind perception, the development of L1 speech production follows the same trajectory. As with perception, infant vocalizations first emerge in language-independent ways. By ten months, babbling begins to reflect the properties of the ambient language (Grenon et al., 2007). The observed asymmetry between L1 perception and production is in part due to physiology, since speaking is a physical activity, while perception is cognitive. Over time, the acoustic properties of L1 speech production begin to closely match those of older speakers in the community (Flege, 2003). Kuhl and Meltzoff (1996) argue that children’s vocal imitation of interlocutors explains how this happens, as was illustrated in the personal anecdote with which I began this chapter.
L2 pronunciation
Proponents of the Critical Period Hypothesis (CPH) argued that L2 speech learning is fundamentally different from L1 speech development. In their view, the mechanisms used in L1 speech learning are no longer available, and instead, L2 learners must rely on general learning mechanisms, which are not optimized for speech (Scovel, 2000). While the CPH was once an appealing explanation for why adult L2 learners rarely achieve nativelike pronunciation, no clear empirical evidence has been found to support it. In contrast, strong evidence now favors a more nuanced understanding of L2 pronunciation development. Flege et al. (1995a, 1995b) investigated the L2 English pronunciation of Italian immigrants to Canada who had arrived at a range of ages from early childhood to adulthood. They found no critical biological period after which the ability to acquire a nativelike accent precipitously declined. Instead, the relationship between age of arrival and strength of foreign accent was found to be linear. Therefore, while these studies provide evidence that it is better to learn L2 pronunciation at a younger age, they appear to falsify the CPH’s claim that the learning mechanisms used in L1 acquisition are lost during brain lateralization. Flege et al. (1997) later established that the strength of individuals’ L2 accent is also strongly correlated with the quantity of their L2 experience and the extent to which L2 learners continue to use their L1s in everyday life. Taken together, this evidence suggests that the perceptual mechanisms utilized in L1 learning remain intact over the lifespan, but that accessing them becomes increasingly difficult. L2 learners are no longer blank slates but come to the task with established L1 categories and typically fewer opportunities to obtain impactful experience with the L2 (Flege, 1995; Flege & Bohn, 2021).
It appears, then, that the automatic nature of L1 speech perception after the age of four is what most conspires against adults easily accessing speech learning mechanisms. Unfortunately, after these perceptual processes have become automatic, it is difficult to notice the phonetic-level information needed to categorize sounds in a new language. Instead, L2 sounds are automatically filtered through L1 perceptual categories. Best and Tyler’s (2007) Perceptual Assimilation Model (PAM) and Flege’s (1995; see also Flege & Bohn, 2021, for an update) Speech Learning Model (SLM) are the most often-cited explanations of L1 influences on L2 learning. Both argue that the relative dissimilarity of L1 and L2 speech sounds predicts how easy it will be for L2 learners to acquire sound categories in a new language. When L1 and L2 speech sounds are identical, or nearly so, a simple substitution will suffice. Nothing needs to change regarding the automatic processing for such sounds. When an L2 sound is unlike any sound category in the learner’s L1, new category development is likely to occur, but it may take some time. The greatest challenge in L2 pronunciation learning arises when one or more dissimilar L2 categories are perceptually assimilated to a single L1 category. For example, a Japanese L2 English learner may perceive both English /l/ and /ɹ/ to be equivalent to a single Japanese apico-alveolar tap /ɾ/. This misperception of two contrasting English categories causes Japanese speakers to substitute their /ɾ/ sound for both English /l/ and /ɹ/. It happens to be the case that despite the Japanese category not being a perfect example of English /l/, English L1 listeners perceive it to be closest to their /l/ category, and categorize it as such. This results in English L1 listeners hearing Japanese speakers’ attempts to pronounce English /l/-/ɹ/ word pairs as homophonous (e.g., their renditions of “right” and “light” are both perceived as foreign-accented versions of “light”).
Similar assimilation patterns occur for learners of other L2s. For example, English L1 learners of a Hindi alveolar and retroflex stop contrast typically assimilate both Hindi categories to English /t/ (Guion & Pederson, 2007).
In the case of the English /l/-/ɹ/ contrast, acoustic information needed to discriminate these sounds is tuned out by Japanese L1 speakers, since it has no importance in their language (Brown, 1998). Similarly, English L1 learners of Mandarin tonal contrasts cannot easily recognize tonal distinctions because pitch cues associated with Mandarin tones are not used in the same way in English (Guion & Pederson, 2007). In sum, the primary source of difficulty in L2 pronunciation development is learners’ inability to reorient attention to phonetic information that they have learned to ignore (Chang, 2018). The processing efficiency that was an advantage to L1 learning has now become an impediment in L2 learning. To successfully learn L2 sounds, learners must re-educate this selective perception (Strange & Shafer, 2008). In the next section we will see that this is indeed possible.
Critical Issues
While L2 pronunciation development follows similar paths in both naturalistic and instructed learning contexts, what learners ultimately achieve varies. In naturalistic environments, perceptual processes remain largely automatic until there is a breakdown in communication, which may alert learners to problems with their perception of a sound. In contrast,