Perceptual adaptation of atypical vowels


When perceiving speech, listeners encounter enormous variability in speech sounds. Despite this variability, they recognize the input and understand speech without much difficulty. For example, native listeners understand foreign-accented speech by rapidly adapting to unfamiliar input that sounds atypical compared to their long-term representations of those sounds (Idemaru & Holt, 2011). Most studies of perceptual learning have focused on the learning of speech sounds based on group observations (Bradlow & Bent, 2008; Reinisch & Holt, 2014). Relatively little attention has been paid to individual cognitive abilities in relation to perceptual learning, although global cognitive abilities may be closely linked to individuals’ perceptual learning outcomes (Bent, Baese-Berk, Borrie, & McKee, 2016). The present study examines whether listeners adapt to atypical speech sounds and how they reorganize acoustic-phonetic structures to do so. It also examines how individual listeners’ adaptive strategies in speech perception relate to their cognitive abilities. Thirty-six monolingual speakers of English completed a perceptual adaptation task and a set of cognitive tasks that served as possible predictors of individual perceptual learning performance. Perceptual adaptation stimuli were created from productions of "head" and "had" recorded by a native speaker of English. Seven steps varying in formant frequency (created with TANDEM-STRAIGHT) were crossed with 2 duration steps (PSOLA in Praat). Baseline stimuli consisted of 7 spectral steps and 2 duration steps, and Exposure stimuli consisted of 6 tokens with ambiguous formant frequencies and 12 ambiguous tokens adjacent to the most ambiguous tokens. Both Baseline and Exposure stimuli included 2 test stimuli (see Figure 1).
Cognitive ability tasks, run in PEBL (Mueller & Piper, 2014), included Stroop (inhibition), Corsi (working memory), Berg’s Card Sorting (cognitive flexibility), and Continuous Performance (sustained attention). The results showed that listeners rapidly adapt to atypical speech by increasing their use of the secondary acoustic dimension (i.e., vowel duration) when the primary dimension (i.e., vowel formants) is not informative (Figure 2). This indicates that listeners rely more on secondary acoustic dimensions to maintain a phonetic contrast when the most reliable acoustic-phonetic dimension is no longer informative for phonetic categorization. Furthermore, listeners’ cognitive abilities (specifically, inhibition) predicted their adaptation to atypical speech sounds: listeners with better inhibitory control showed better perceptual adaptation to unfamiliar speech sounds. This suggests that individual differences in inhibitory control are linked to how listeners inhibit less relevant acoustic-phonetic dimensions and selectively attend to more relevant dimensions during perceptual adaptation to unfamiliar speech. Together, these results suggest that listeners can flexibly change their categorization strategies to adapt to unfamiliar speech input by increasing their use of secondary acoustic-phonetic dimensions, which generally play a minor role in normal listening contexts. Further, this adaptive plasticity in speech perception is in part related to individual listeners’ cognitive abilities. The present findings contribute to our understanding of the adaptive processes that allow speech perception to deal with acoustic-phonetic variability.

University of Seoul, Seoul, South Korea