Shape matters: Machine classification and listeners’ perceptual discrimination of American English intonational tunes
J. Cole, Jeremy Steffman, Sam Tilsen
Speech Prosody 2022. Published 2022-05-23. DOI: 10.21437/speechprosody.2022-61 (https://doi.org/10.21437/speechprosody.2022-61)
Citations: 3
Abstract
In Autosegmental-Metrical models of intonational phonology, pitch accents, phrase accents and boundary tones may combine freely to create a predicted set of phonologically distinct phrase-final “nuclear” tunes. In this study we ask whether an 8-way distinction in nuclear tune shape in American English, predicted from combinations of 2 (monotonal) pitch accents, 2 phrase accents and 2 boundary tones, is manifest in speech production and in speech perception. We analyzed (i) f0 trajectories from an imitative speech production experiment, using neural net classification, and (ii) human listeners’ perceptual discrimination of the model utterances. Pairwise classification accuracy of the imitative productions is highest for tune pairs that differ in holistic shape (high-rising vs. rise-fall), and poorest for tunes with the same shape that differ in (higher vs. lower) final f0. Perception results show a similar pattern, with poor pairwise discrimination for tunes that differ primarily, but by a small degree, in final f0. Together the results suggest a hierarchy of distinctiveness among nuclear tunes: a robust distinction based on holistic tune shape, which only partly aligns with distinctions in tonal specification, and a weak, poorly differentiated distinction between tunes with the same holistic shape but small differences in final f0.
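The 8-way distinction mentioned in the abstract follows from simple combinatorics: 2 pitch accents × 2 phrase accents × 2 boundary tones. The sketch below enumerates that inventory and the tune pairs available for pairwise comparison. The tone labels (H*, L*, H-, L-, H%, L%) are an assumption based on common ToBI-style notation; the abstract gives only the counts, not the labels, and the function names are hypothetical.

```python
from itertools import combinations, product

# Assumed ToBI-style labels; the abstract specifies only that there are
# 2 monotonal pitch accents, 2 phrase accents and 2 boundary tones.
PITCH_ACCENTS = ["H*", "L*"]
PHRASE_ACCENTS = ["H-", "L-"]
BOUNDARY_TONES = ["H%", "L%"]


def nuclear_tunes():
    """Return every pitch accent + phrase accent + boundary tone combination."""
    return [
        "".join(parts)
        for parts in product(PITCH_ACCENTS, PHRASE_ACCENTS, BOUNDARY_TONES)
    ]


def tune_pairs(tunes):
    """Return all unordered pairs of distinct tunes."""
    return list(combinations(tunes, 2))


if __name__ == "__main__":
    tunes = nuclear_tunes()
    pairs = tune_pairs(tunes)
    print(f"{len(tunes)} tunes, {len(pairs)} unordered pairs")
    for tune in tunes:
        print(tune)
```

Under these assumptions the inventory has 8 tunes and 28 unordered pairs. Whether the study's pairwise classification and discrimination tasks covered all 28 pairs is not stated in the abstract.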