Luis M.T. Jesus, Sara Castilho, Aníbal Ferreira, Maria Conceição Costa
{"title":"Discriminative segmental cues to vowel height and consonantal place and voicing in whispered speech","authors":"Luis M.T. Jesus , Sara Castilho , Aníbal Ferreira , Maria Conceição Costa","doi":"10.1016/j.wocn.2023.101223","DOIUrl":"https://doi.org/10.1016/j.wocn.2023.101223","url":null,"abstract":"<div><h3>Purpose</h3><p>The acoustic signal attributes of whispered speech potentially carry sufficiently distinct information to define vowel spaces and to disambiguate consonant place and voicing, but what these attributes are and the underlying production mechanisms are not fully known. The purpose of this study was to define segmental cues to place and voicing of vowels and sibilant fricatives and to develop an articulatory interpretation of acoustic data.</p></div><div><h3>Method</h3><p>Seventeen speakers produced sustained sibilants and oral vowels, disyllabic words, sentences and read a phonetically balanced text. All the tasks were repeated in voiced and whispered speech, and the sound source and filter analysed using the following parameters: Fundamental frequency, spectral peak frequencies and levels, spectral slopes, sound pressure level and durations. Logistic linear mixed-effects models were developed to understand what acoustic signal attributes carry sufficiently distinct information to disambiguate /i, a/ and /s, ʃ/.</p></div><div><h3>Results</h3><p>Vowels were produced with significantly different spectral slope, sound pressure level, first and second formant frequencies in voiced and whispered speech. The low frequencies spectral slope of voiced sibilants was significantly different between whispered and voiced speech. The odds of choosing /a/ instead of /i/ were estimated to be lower for whispered speech when compared to voiced speech. 
Fricatives’ broad peak frequency was statistically significant when discriminating between /s/ and /ʃ/.</p></div><div><h3>Conclusions</h3><p>First formant frequency and relative duration of vowels are consistently used as height cues, and spectral slope and broad peak frequency are attributes associated with consonantal place of articulation. The relative duration of same-place voiceless fricatives was higher than voiced fricatives both in voiced and whispered speech. The evidence presented in this paper can be used to restore voiced speech signals, and to inform rehabilitation strategies that can safely explore the production mechanisms of whispering.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49816846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Production and perception of prevelar merger: Two-dimensional comparisons using Pillai scores and confusion matrices","authors":"Valerie Freeman","doi":"10.1016/j.wocn.2023.101213","DOIUrl":"https://doi.org/10.1016/j.wocn.2023.101213","url":null,"abstract":"<div><p>Vowel merger production is quantified with gradient acoustic measures, while phonemic perception methods are often coarser, complicating comparisons within mergers in progress. This study implements a perception experiment in two-dimensional formant space (F1 × F2), allowing unified plotting, quantification, and statistics with production data. Production and perception are compared within 20 speakers for a two-part prevelar merger in progress in Pacific Northwest English, where mid-front /ɛ, e/ approximate or merge before voiced velar /ɡ/ (<span>leg–vague</span> merger), and low-front prevelar /æɡ/ raises toward them (<span>bag-</span>raising). Distributions are visualized with kernel density plots and overlap quantified with Pillai scores and confusion matrices from linear discriminant analysis models. Results suggest that <span>leg–vague</span> merger is perceived as more complete than it is produced (in both the sample and community), while <span>bag-</span>raising is highly variable in production but rejected in perception. Relationships between production and perception varied by age, with raising and merger progressing across two generations in production but not perception, followed by younger adults perceiving <span>leg–vague</span> merger but not producing it and varying in (minimal) raising perception while varying in <span>bag</span>-raising in production. 
Thus, prevelar raising/merger may be progressing among some social groups but reversing in others.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879351/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10576296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speakers coarticulate less in response to both real and imagined communicative challenges: An acoustic analysis of the LUCID corpus","authors":"Zhe-chen Guo, Rajka Smiljanic","doi":"10.1016/j.wocn.2022.101210","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101210","url":null,"abstract":"<div><p>Overlap of adjacent articulatory gestures leads to coarticulation. Understanding how hyperarticulated intelligibility-enhancing clear speech modifications affect coarticulation can inform theories of phonetic variation and speech intelligibility. However, prior research yielded mixed findings regarding the relationship between hyperarticulation and coarticulatory patterns. This study extends previous work by analyzing the degree of coarticulation across several different communicative conditions in the LUCID corpus (<span>Baker & Hazan, 2010</span>). Southern British English speakers completed an interactive spot-the-difference task with a partner with and without a communicative barrier (e.g., speech degraded by talker babble). They also read sentences without an interlocutor casually and clearly. Diphones in keywords produced in both tasks were analyzed using two whole-spectrum measures, with greater spectral distance and shorter coarticulatory overlap between the diphones indexing less coarticulation. Results revealed that speakers coarticulated less in response to both real (interactive task) and imaginary (sentence-reading) communicative challenges. Speakers furthermore varied the degree of coarticulatory resistance in different real communicative barriers. Diphones with greater consonant articulatory constraint were less sensitive to differences between the conditions, suggesting a limit to the hyperarticulation-induced phonetic variation. 
The findings agree with the models of targeted speaker adaptations assuming coarticulatory resistance in hyperarticulated clear speech (the H&H theory: <span>Lindblom, 1990</span>).</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49816840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vincent Hughes, Amanda Cardoso, Paul Foulkes, Peter French, Amelia Gully, Philip Harrison
{"title":"Speaker-specificity in speech production: The contribution of source and filter","authors":"Vincent Hughes , Amanda Cardoso , Paul Foulkes , Peter French , Amelia Gully , Philip Harrison","doi":"10.1016/j.wocn.2023.101224","DOIUrl":"https://doi.org/10.1016/j.wocn.2023.101224","url":null,"abstract":"<div><p>This study examines the extent to which speaker-specific information is encoded in different features of vocal output and the relationships between those features. A range of acoustic features, grouped as source (laryngeal voice quality measures and fundamental frequency) and filter features (formants and Mel-frequency cepstral coefficients; MFCCs), were extracted from the vocalic portion of the hesitation marker <em>um</em> for 90 male speakers of Standard Southern British English. Little overall correlation between the sets of features was observed, suggesting no strong interdependence between source and filter in our data. Although filter features were consistently better at discriminating between same- and different-speaker pairs compared with source features, combining source and filter has the potential of producing the lowest error rates and the strongest speaker discrimination scores. Taken together, results show that source and filter provide complementary speaker-specific information. However, the extent of the improvements in speaker discrimination performance when combining source and filter varied across speakers. 
We explore potential explanations for this finding and discuss the implications for source-filter theory, and for applied fields such as speaker recognition and forensic speech science.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49816844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anita Lorenc, Marzena Żygis, Łukasz Mik, Daniel Pape, Márton Sóskuthy
{"title":"Articulatory and acoustic variation in Polish palatalised retroflexes compared with plain ones","authors":"Anita Lorenc , Marzena Żygis , Łukasz Mik , Daniel Pape , Márton Sóskuthy","doi":"10.1016/j.wocn.2022.101181","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101181","url":null,"abstract":"<div><p>The present paper investigates articulatory and acoustic variation in Polish palatalised retroflex sibilants compared with their plain counterparts. It tests the hypothesis advanced by Hamann (2003: 44) that palatalised retroflexes are non-existent and that retroflexes in Polish change to palato-alveolars [ʃ ʒ t͡ʃ d͡ʒ] when being palatalised. Based on articulatory data from 20 speakers we provide evidence that at least part of the data (53.5%) are palatalised retroflexes [ʂʲ ʐʲ ʈ͡ʂʲ ɖ͡ʐʲ]. The plain counterparts are shown to be retroflex, as proposed by Hamann (2003).</p><p>Our averaged results indicate that both palatalised and plain retroflexes show a convex tongue shape. However, individual data reveals a wide range of realisations, from a bunched dorsum to flat and even hollowed tongue shapes. Taking this variability into account, we propose a new tongue shape classification based on Heron’s Formula – i.e. concave, slightly concave, flat, convex and slightly convex. 
The different tongue shapes are also visualised in the form of videos created using GAMMs.</p><p>Regarding acoustic results, our analysis reveals that the strongest correlate of palatalised retroflex sibilants is longer duration of frication in palatalised sibilants followed by higher Centre of Gravity (COG) and m1 spectral slope.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49754776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Schwa’s duration and acoustic position in American English","authors":"Uriel Cohen Priva, Emily Strand","doi":"10.1016/j.wocn.2022.101198","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101198","url":null,"abstract":"<div><p>Is American English schwa’s position determined solely by the context in which it appears? Do vowels neutralize to schwa when their duration is shorter? We address these two inter-related questions using the Buckeye corpus to study vowel behavior across multiple contexts of spontaneous speech. We find that all except tense high vowels shift to lower F1 values when their duration is relatively short, including lax high vowels and lexical schwas, rather than toward a mid-vowel position that schwa occupies when its duration is long. However, we also replicate the finding that schwa is more dependent on both context and duration than other vowels. The results are not consistent with the idea that schwa’s position is determined exclusively by the context in which it appears. However, schwa’s shift to higher F1 values when its duration is longer is not necessarily different from other vowels’ shift to higher F1 values when their duration is longer, making it unnecessary to argue that schwa’s mid-vowel properties are due to having a target in F1 terms.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49760260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Constantijn Kaland, Marc Swerts, Nikolaus P. Himmelmann
{"title":"Red and blue bananas: Time-series f0 analysis of contrastively focused noun phrases in Papuan Malay and Dutch","authors":"Constantijn Kaland , Marc Swerts , Nikolaus P. Himmelmann","doi":"10.1016/j.wocn.2022.101200","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101200","url":null,"abstract":"<div><p>The prosody of Papuan Malay, spoken in the easternmost provinces of Indonesia, is not fully described and understood. The limited work available suggests that phrase prosody in this language is different from other well-studied (West-Germanic) languages. However, not much is known about possible correlates of focus marking, for which prosody is used extensively in languages like Dutch and English. To gain insight into universal and specific usages of prosody, this study reports two identical production experiments and acoustic analyses carried out for Papuan Malay and Dutch, to investigate the prosody of noun phrases in different contrastive focus conditions. Participants in the experiments described pictures with different shapes and colors using specific matrix phrases. The prosody of these descriptions was examined by time-series measures of f0 and statistically analysed using generalised additive mixed models (GAMMs). Results show that speakers of Papuan Malay do not use f0 to mark contrastively focused noun phrases, unlike Dutch speakers. The main function of f0 in Papuan Malay phrases appears to be boundary marking on the final syllable in the phrase, a function also observed in Dutch. In addition, the pre-final syllable in the Papuan Malay phrase was always marked with a rising f0, whereas in Dutch an interaction between the boundary and focus marking was found. 
The results are discussed in a typological perspective and provide new insights into the prosody of Papuan Malay.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49754732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phonological and phonetic contributions to perception of non-native lexical tones by tone language listeners: Effects of memory load and stimulus variability","authors":"Juqiang Chen , Mark Antoniou , Catherine T. Best","doi":"10.1016/j.wocn.2022.101199","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101199","url":null,"abstract":"<div><p>The present study examined native language phonological and phonetic factors in non-native lexical tone perception by tone language listeners, manipulating memory load and stimulus variability to bias listeners towards a more phonological or more phonetic mode of perception. Mandarin and Vietnamese listeners categorised the five Thai lexical tones to their native tones, and discriminated five selected Thai tone contrasts that were predicted by the Perceptual Assimilation Model (PAM, <span>Best, 1995</span>) to be discriminated differently. Categorisation responses showed more phonologically-based patterns under high than low memory load but were unaffected by talker and vowel variability, whereas discrimination accuracy was reduced by talker and vowel variability but not by memory load. Phonological factors indicated by type of categorisation and category overlap generally predicted the discrimination of non-native tone contrasts in line with PAM principles. Phonetic factors reflected in category overlap scores and fit index difference scores predicted variations in discriminating contrasts of the same contrast categorisation type. These findings uphold the extension of PAM principles to non-native tone perception by native listeners of other tone languages. 
Native phonological and phonetic contributions to non-native speech perception differ between categorisation and discrimination tasks, as reflected in differential modulation by memory load and stimulus variability.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49760261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrigendum to “Homophone discrimination based on prior exposure” [J. Phonet. 95 (2022) 101182]","authors":"Chelsea Sanker","doi":"10.1016/j.wocn.2022.101211","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101211","url":null,"abstract":"","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49760262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Native language experience with tones influences both phonetic and lexical processes when acquiring a second tonal language","authors":"Eric Pelzl, Jiang Liu, Chunhong Qi","doi":"10.1016/j.wocn.2022.101197","DOIUrl":"https://doi.org/10.1016/j.wocn.2022.101197","url":null,"abstract":"<div><p>Second language acquisition of lexical tones requires that a learner form appropriate tone categories and bind those categories to lexical representations for fluent word recognition. Research has shown that second language (L2) learners with no previous tone language experience can become highly accurate at identification of tones in isolation, but, even at advanced levels, have difficulty using tones to differentiate real words from nonwords. The present research considers the same skills in L2 learners who <em>do</em> have previous tone experience. Using largely the same tasks and stimuli previously used with English speakers in Pelzl, Lau, Guo, & DeKeyser (2021a) (“PLGD21”), we examined the tone identification and (tone) word recognition abilities of thirty-three Vietnamese speakers who had achieved advanced L2 proficiency in Mandarin. Results indicate that Vietnamese speakers experience different tone identification difficulties than English speakers, presumably due to interference from their native language tone categories. However, unlike English speakers in previous studies, Vietnamese speakers did not display differences in lexical decision accuracy for vowel and tone nonwords. 
These results provide evidence of the complexities of cross-linguistic influence, illustrating that the influence of native language tones can be illuminated by considering perception and acquisition at multiple levels.</p></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126274037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Literature","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}