{"title":"Relative importance of stress correlates in native listeners’ identification of Spanish lexical stress produced by monolingual and bilingual speakers","authors":"Ji Young Kim","doi":"10.1016/j.wocn.2025.101418","DOIUrl":"10.1016/j.wocn.2025.101418","url":null,"abstract":"<div><div>Spanish has many minimal stress pairs, and lexical stress in Spanish is marked primarily via suprasegmental cues. Thus, sensitivity to suprasegmental information is crucial for spoken-word identification in Spanish. Using stimuli produced by speakers of Mexican Spanish with varying language learning experience (i.e., monolingual speakers, heritage speakers, L2 learners), this study examines native listeners’ identification of Spanish lexical stress under enhanced variability in phonetic cues. Our data demonstrate that listeners exploit various stress correlates in the speech signal and assign different weights to them, which is context-specific; when there is a pitch accent, native listeners mainly attend to f0-related cues, whereas in the absence of a pitch accent, intensity plays a stronger role. Our data also show that clustering based on stress correlates is not consistent with the predetermined monolingual-heritage-L2 group division, which indicates that language learning experience alone is not sufficient to explain how Spanish speakers mark stress. 
This study highlights the importance of incorporating variable speech data in speech perception research and adopting a data-driven, individual-centered approach to speaker grouping in cross-sectional studies.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"111 ","pages":"Article 101418"},"PeriodicalIF":1.9,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144115751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The role of tone and phrasing in the occurrence of period doubling and vocal fry in Mandarin","authors":"Yaqian Huang","doi":"10.1016/j.wocn.2025.101416","DOIUrl":"10.1016/j.wocn.2025.101416","url":null,"abstract":"<div><div>Period doubling, an under-studied yet frequently-occurring subtype of creaky voice, has distinct acoustic and phonatory properties compared to vocal fry, the most-studied and known subtype of creaky voice. Little is known about their distributional patterns across tones or utterances, let alone their potentially different functions in informing linguistic meaning and categories. In this paper, I investigate the tonal and phrasal influences on the distribution of these two voicing types as they occur sub-phonemically in Mandarin Chinese. The results show that both creak subtypes occur most frequently in Tones 3 and 2, and period doubling is more widespread across tones than vocal fry. Period doubling occurs most frequently at utterance edges, with its frequency gradually increasing toward the end of utterances, possibly reflecting vocal instability. Vocal fry, in contrast, is concentrated in the post- and pre-focal positions conditioned by the sentence-medial stimuli and in utterance-final positions, suggesting a stronger linguistic role in marking weak prosodic elements. 
This study also discusses implications for speech production and linguistic functions of different kinds of creak.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"111 ","pages":"Article 101416"},"PeriodicalIF":1.9,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The role of prior knowledge in second-language learners’ overnight consolidation of Cantonese tones","authors":"Quentin Zhen Qin , Rui Jin , Ruofan Wu","doi":"10.1016/j.wocn.2025.101417","DOIUrl":"10.1016/j.wocn.2025.101417","url":null,"abstract":"<div><div>This study examines the role of prior (tonal) knowledge in memory consolidation of non-native tones after an overnight sleep. While memory consolidation is beneficial in learning new sounds in a second language, only new linguistic information consistent with the existing knowledge is often prioritized for consolidation. What remains unclear from the research is whether prior tonal knowledge from a native language (i.e., pitch contour signaling the Mandarin contour-tone system) influences an overnight consolidation of tone learning. The study adopts an overnight design, using Cantonese contour and level tones contrasting in pitch contour and height, for two perceptual learning experiments conducted separately on Mandarin and English-speaking novice learners of Cantonese. The first experiment found that Mandarin-speaking participants showed a stronger effect of consolidation in novel words contrasting in contour tones than in level tones, thanks to their prior knowledge of contour tones. The consolidation effect was predicted by rough estimates of deep-sleep length. Without prior knowledge of tones, English-speaking L2 learners in the second experiment showed an (unexpected) offline improvement for both contour and level tones. 
Overall, the findings suggest a preferential effect on overnight consolidation of contour tones when the cues contrasting L2-Cantonese tones are consistent with L1-Mandarin prior knowledge.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"111 ","pages":"Article 101417"},"PeriodicalIF":1.9,"publicationDate":"2025-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143931342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancements of phonetics in the 21st century: Quantitative data analysis","authors":"Morgan Sonderegger , Márton Sóskuthy","doi":"10.1016/j.wocn.2025.101415","DOIUrl":"10.1016/j.wocn.2025.101415","url":null,"abstract":"<div><div>Phonetic research in the 21st century has relied heavily on quantitative analysis. This article reviews the evolution of common practices and the emergence of newer techniques. Using a detailed literature survey, we show that most work follows a mainstream, which has shifted from ANOVAs to mixed-effects regression models over time. Alongside this mainstream, we highlight the increasing use of a diverse methodological toolbox, especially Bayesian methods and dynamic methods, for which we provide comprehensive reviews. Bayesian methods, as well as frequentist methods beyond linear and logistic regression, offer flexibility in model specification, interpretation, and incorporation of prior knowledge. Dynamic methods, such as GAMs and functional data analysis, capture non-linear patterns in acoustic and articulatory data. Machine learning techniques, such as random forests, expand the questions and types of data phoneticians can analyze. We also discuss the growing importance of open science practices promoting replicability and transparency. 
We argue that the future lies in a diverse methodological toolbox, with techniques chosen based on research questions and data structure.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"111 ","pages":"Article 101415"},"PeriodicalIF":1.9,"publicationDate":"2025-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143923730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mapping to perceptual identification in Mandarin learners of English","authors":"Kenneth de Jong , Yu-Jung Lin , Yen-Chen Hao , Hanyong Park","doi":"10.1016/j.wocn.2025.101411","DOIUrl":"10.1016/j.wocn.2025.101411","url":null,"abstract":"<div><div>This paper examines the relationship between cross-language segmental mapping and second language identification accuracy in Taiwan Mandarin speakers learning English, and compares this relationship with that found in previous, parallel research on Korean learners of English. Mapping and identification data were collected for English anterior plosives and non-sibilant fricatives, by means of two parallel identification experiments. Mapping data came from a 17-alternative identification task with <em>Zhuyin Fuhao</em> labels (phonetic script used to annotate Mandarin sounds in Taiwan), and identification data came from a 15-alternative identification task with Roman labels, both applied to the same stimuli. Mapping data were used to generate predictions about the identification performance by estimating what the performance would be, given the use of only the Mandarin categories. Like the previous Korean data, Mandarin speakers exhibited identification rates for plosives that are very close to predicted, indicating that their plosive identification performance was heavily entangled with their Mandarin system, while fricative identification performance was greatly under-predicted by the mapping data. Further analyses of category differentiation measured with <em>d</em>-prime estimates showed that Mandarin speakers’ manner differentiation performance was very well-predicted by the mapping data, while Korean speakers’ laryngeal differentiation was better predicted. Taken together, these results indicate that the second language identification performance and the cross-language mapping into the first language are closely entangled in a single system. 
The additional second language component appears in a pervasive increment in performance in the second language beyond what is predicted from the first language system, and in more unaccounted-for variance in laryngeal identification than in manner identification.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101411"},"PeriodicalIF":1.9,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contribution of F0 and phonation to tone perception in the Zaiwa language","authors":"Yao Lu, Changwei Liang, Jiangping Kong","doi":"10.1016/j.wocn.2025.101413","DOIUrl":"10.1016/j.wocn.2025.101413","url":null,"abstract":"<div><div>Previous research on categorical perception of tone has primarily examined the influence of fundamental frequency (F0), while the role of phonation, though increasingly studied, remains underexplored. This study investigates the role of phonation and how it interacts with F0 cues in tone perception, using the Zaiwa language as a case study. Specifically, we examine the categorical perception of Tone 44 (produced with a pressed voice) and Tone 35 (produced with a modal voice). To achieve this, we first conducted an acoustic analysis of the Zaiwa tone system, which forms the basis for our novel method of speech synthesis. Using this method, we created six tonal continua between Tone 44 and Tone 35 by systematically modifying F0 alone, phonation alone, and both simultaneously. Native Zaiwa speakers then participated in an experiment using the categorical perception paradigm with these synthesized continua. The results indicate that the participants were unable to distinguish the phonemic categories of the two tones when only phonation was modified. While modifying F0 alone allowed for tone distinction, participants’ perception followed a continuous pattern. However, when both F0 and phonation were modified simultaneously, participants accurately identified the phonemic categories of tones and perceived the continuum between the two tones categorically. These findings suggest that both F0 and phonation serve as perceptual cues for distinguishing Tone 44 and Tone 35 in Zaiwa, with F0 as the primary cue and phonation as a secondary cue. 
However, phonation remains crucial, as its absence weakens the categorical perception of these tones.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101413"},"PeriodicalIF":1.9,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143863559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Corrigendum to “Towards a dynamical account of inter-segmental coordination” [J. Phon. 109 (2025) 101392]","authors":"Shihao Du, Stephan R. Kuberski, Adamantios I. Gafos","doi":"10.1016/j.wocn.2025.101414","DOIUrl":"10.1016/j.wocn.2025.101414","url":null,"abstract":"","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101414"},"PeriodicalIF":1.9,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143838437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contextual and paradigmatic effects on suspended contrast across generations: The case of Cantonese pinjam revisited","authors":"Alan C.L. Yu , Vivian Guo Li , Peggy P.K. Mok","doi":"10.1016/j.wocn.2025.101412","DOIUrl":"10.1016/j.wocn.2025.101412","url":null,"abstract":"<div><div>Suspended contrast refers to the phenomenon whereby sound change brings two phonemes into such close approximation that semantic contrast between them is suspended for native speakers of the language, without necessarily leading to complete merger or neutralization. The existence of suspended contrasts not only raises questions about the nature of the phonetics-phonology interface, but also for theories of sound change that assume sound change is biased toward selective maintenance of phonemes that contribute more to distinguishing existing lexical items in usage. Small differences supporting a suspended contrast are expected to disappear quickly given that they do not serve any apparent communicative functions. It remains a question whether a contrast can be suspended for a considerable period of time. This study revisits a case of suspended contrast in Cantonese between the lexical high rising tone and the high rising tone derived through morphological tone change (<em>pinjam</em>). We use an apparent-time approach to investigate the diachronic trajectory of this neutralization by comparing the distribution of this suspended contrast along both F0 and durational dimensions across two generations of Hong Kong Cantonese speakers. While this case of suspended tonal contrast has been in circulation for almost a century, our findings suggest that the distinction might be disappearing among the younger speakers. Only older speakers maintain a distinction between the lexical and derived rising tones, albeit in very restricted tonal contexts. 
The fact that this suspended tonal contrast exhibits great sensitivity to contextual and morphological influences may help explain the progression of this case of merger-in-progress.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101412"},"PeriodicalIF":1.9,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143816847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Normalization, essentialization, and the erasure of social and linguistic variation","authors":"Santiago Barreda","doi":"10.1016/j.wocn.2025.101409","DOIUrl":"10.1016/j.wocn.2025.101409","url":null,"abstract":"<div><div>Linguists investigating the phonetic properties of vowels, e.g. height and frontness, often use normalization algorithms to remove ‘irrelevant’ variation from vowel formant data. The current conception and evaluation of these algorithms focuses on phonemic classification and the removal of ‘anatomical’ variation, an approach which suggests an essentialist perspective on linguistic variation and leads to the erasure and underreporting of linguistic and social information. Instead, it is suggested that for many purposes, researchers need algorithms that correctly represent phonetic information by removing only <em>non-phonetic</em> formant variation. Acoustic variation that does not affect phonetic properties is non-phonetic, making it ‘transparent’ to the linguistic system and incapable of communicating linguistic contrast. Evidence is presented that only the uniform scaling of formant patterns appears to be non-phonetic, indicating that uniform scaling normalization algorithms should be preferred. Finally, given that phonetic properties are products of human psychology that enter into experience only through perception, it is argued that the normalization algorithms used by phoneticians and sociolinguists should be thought of as models of human perception. 
The change to a perceptual and phonetic, rather than anatomical and phonemic, approach to normalization will promote more reliable and theoretically sound research outcomes, and better aligns with linguistic theory.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101409"},"PeriodicalIF":1.9,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143808284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Gesture-Field-Register (GFR) framework for modeling F0 control","authors":"Seung-Eun Kim , Sam Tilsen","doi":"10.1016/j.wocn.2025.101410","DOIUrl":"10.1016/j.wocn.2025.101410","url":null,"abstract":"<div><div>In this study, we introduce an F0 modeling framework – which we refer to as the Gesture-Field-Register (GFR) framework – in which F0 production involves joint control of relatively generic intentions and how those intentions are mapped to physical F0 values. Building on Articulatory Phonology (AP) and Task Dynamics (TD), the GFR framework considers F0 gestures to be the fundamental units of F0 control. It further holds (i) that the dynamic target F0 state of a speaker is determined by the blending of F0 gestural targets in a planning field and (ii) that the gestural targets and dynamic targets are represented in normalized values which are converted to F0 in Hz via dynamic control of F0 register. We show how this framework accounts for a variety of empirical F0 patterns, and we present a case study that uses parameter optimization to analyze empirical F0 contours into a time series of gestural activation and register states. In doing so, we demonstrate that the framework allows for gestural targets to be invariant within an utterance, despite the fact that the surface contours are highly variable. Model code and examples for generating and fitting F0 contours are publicly available in Github and OSF repositories. 
Overall, the GFR framework provides a novel way of conceptualizing and modeling F0 control under AP/TD and further expands the AP/TD by incorporating the mechanisms of a planning field and dynamic register control.</div></div>","PeriodicalId":51397,"journal":{"name":"Journal of Phonetics","volume":"110 ","pages":"Article 101410"},"PeriodicalIF":1.9,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143697848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}