The link between speech perception and production and the mechanisms of phonetic imitation

Abstract

Previous studies exploring a perception-production link within individuals have yielded mixed results, due in part to the complex nature of the relationship and to the disparate speech perception and production tasks used (Beddor, 2015; Shultz et al., 2012). The present study uses phonetic imitation and manipulated stimuli to probe the link between speech perception and production more directly. Specifically, it examines (1) whether individual listeners’ perceptual cue weights are related to their patterns of phonetic imitation and (2) the underlying mechanisms of phonetic imitation. Twenty-three native speakers of English completed a two-alternative forced-choice identification task, followed by a baseline production task and a forced imitation task. Perception stimuli were created from productions of “head” and “had” recorded by a native speaker of English. Seven steps varying in formant frequency (created with TANDEM-STRAIGHT) were crossed with seven duration steps (created with PSOLA in Praat). Imitation stimuli were a subset of the stimuli from the perception task, plus stimuli with extended and shortened vowel durations. Natural, shortened and extended vowel durations were all imitated well, indicating fine-grained sensitivity to vowel duration in imitation. Natural formant frequencies were imitated well but ambiguous formant frequencies were not, suggesting an effect of phonological categorization. Thus, whether phonetic detail is preserved in imitation may depend on the nature of the target stimuli. The relation between cue weights and degree of imitation suggests that individuals who made greater use of formant frequency in perception (higher cue weights) showed more imitation of vowel duration. This may indicate that better phonetic perception leads to more fine-grained imitation in dimensions that are not constrained by phonological categorization.
Our results suggest that phonetic imitation is mediated in part by a low-level cognitive process involving a direct link between perception and production, as evidenced by the imitation of all vowel durations. However, they also suggest that imitation is mediated by a high-level linguistic component, i.e., phonological contrasts, which operates selectively rather than automatically, as indicated by the imitation of phonologically relevant formant frequencies.
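
The abstract does not say how the perceptual cue weights were computed. One common approach, assumed here purely for illustration, is to fit a logistic regression to each listener's identification responses over the crossed formant-by-duration grid and to treat the normalized coefficients as cue weights. The sketch below uses a simulated listener with hypothetical coefficients (4.0 for formant frequency, 1.0 for duration) rather than any data from the study; all function and variable names are illustrative.

```python
import math

# Hypothetical 7 x 7 design: formant and duration steps coded 1..7,
# then centred and scaled to [-1, 1] so the two coefficients are comparable.
STEPS = range(1, 8)


def scale(step):
    return (step - 4) / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def simulate_listener(w_formant, w_duration):
    """Expected proportion of 'had' responses per stimulus (assumed listener model)."""
    return [
        (scale(f), scale(d), sigmoid(w_formant * scale(f) + w_duration * scale(d)))
        for f in STEPS
        for d in STEPS
    ]


def fit_cue_weights(data, lr=0.5, n_iter=5000):
    """Logistic regression by batch gradient descent.

    Returns (intercept, formant coefficient, duration coefficient).
    """
    b0 = bf = bd = 0.0
    n = len(data)
    for _ in range(n_iter):
        g0 = gf = gd = 0.0
        for f, d, p_had in data:
            # Gradient of the cross-entropy loss for one stimulus.
            err = sigmoid(b0 + bf * f + bd * d) - p_had
            g0 += err
            gf += err * f
            gd += err * d
        b0 -= lr * g0 / n
        bf -= lr * gf / n
        bd -= lr * gd / n
    return b0, bf, bd


def normalized_weights(bf, bd):
    """Scale absolute coefficients so that the two cue weights sum to 1."""
    total = abs(bf) + abs(bd)
    return abs(bf) / total, abs(bd) / total


# Simulated listener who relies mostly on formant frequency (assumed values).
data = simulate_listener(w_formant=4.0, w_duration=1.0)
b0, bf, bd = fit_cue_weights(data)
formant_weight, duration_weight = normalized_weights(bf, bd)
print(round(formant_weight, 2), round(duration_weight, 2))
```

Under this assumption, a listener's formant weight could then be correlated across participants with a measure of how closely their imitated vowel durations track the targets. With real data, the response proportions would be replaced by each listener's binary responses.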

Location
UQAM, Montreal, Canada