Mark Berardi, Erin Tippit, Yixiang Gao, Guilherme N DeSouza, Maria Dietrich
Automated Analysis of Relative Fundamental Frequency in Continuous Speech: Development and Comparison of Three Processing Pipelines
Journal of Voice (journal article), published 2025-05-09. DOI: 10.1016/j.jvoice.2025.04.006 (https://doi.org/10.1016/j.jvoice.2025.04.006)
Abstract
Objectives: Relative fundamental frequency (RFF) estimates laryngeal tension during speech, providing insights into vocal effort. Current methods to derive RFF from continuous speech require manual processing, hindering large-scale studies with ecologically valid speech productions. This research aimed to develop and evaluate three fully automated pipelines for RFF analysis from continuous speech, addressing this limitation.
Methods: Three pipelines were compared: two modifications of an existing semiautomated approach [automated relative fundamental frequency (aRFF)-AP] and one novel pipeline replicating manual analysis. The pipelines were tested on speech samples containing vowel-consonant-vowel (VCV) utterances from 82 female participants with and without vocal fatigue complaints in the absence of phonotraumatic vocal fold changes. The pipelines automatically segmented VCVs and measured RFF. Manual measurements of a subset provided reliability and validity benchmarks.
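As an illustration of the measurement step, the following is a minimal sketch of how RFF values might be computed once the vocal cycles around a voiceless consonant have been located. The abstract does not describe the computation; the sketch assumes the convention common in the RFF literature, in which each cycle's fundamental frequency is expressed in semitones relative to a steady-state reference cycle (the offset cycle furthest from the consonant, and the onset cycle furthest from the consonant). The function names and the example frequencies are hypothetical and are not taken from the study.

```python
import math


def rff_semitones(cycle_f0_hz, reference_f0_hz):
    """Express per-cycle f0 values (Hz) in semitones relative to a reference f0."""
    if reference_f0_hz <= 0:
        raise ValueError("reference f0 must be positive")
    # 12 * log2(ratio) is the standard Hz-to-semitone conversion.
    return [12.0 * math.log2(f0 / reference_f0_hz) for f0 in cycle_f0_hz]


def offset_onset_rff(offset_f0_hz, onset_f0_hz):
    """Assumed convention, not confirmed by the abstract:
    offset cycles are ordered toward the consonant and referenced to the first one;
    onset cycles are ordered away from the consonant and referenced to the last one."""
    offset = rff_semitones(offset_f0_hz, offset_f0_hz[0])
    onset = rff_semitones(onset_f0_hz, onset_f0_hz[-1])
    return offset, onset


# Hypothetical f0 values (Hz) for a few cycles on either side of a consonant.
offset_rff, onset_rff = offset_onset_rff([200.0, 198.0, 190.0], [210.0, 205.0, 200.0])
```

Under this convention the reference cycle always has an RFF of 0 semitones, and values for cycles nearer the consonant show how far f0 deviates from the steady state.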
Results: All pipelines demonstrated good reliability (r ≥ 0.84) and validity when compared with manual analysis. They required minimal manual correction (<4%) for fricative identification. Notably, the novel aRFF-B pipeline rejected the fewest samples (10%-25%) while maintaining reliability and was able to leverage parallel computing.
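For readers who want to run a comparable agreement check on their own data, the sketch below computes a correlation coefficient between automated and manual RFF values. The abstract reports r but does not name the statistic; Pearson's r is assumed here. The function name and the numbers are illustrative only and are not results from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences (assumed statistic)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Made-up RFF values (semitones) for the same tokens, measured two ways.
manual_rff = [-0.8, -0.3, 0.1, 1.2, 2.0]
automated_rff = [-0.7, -0.4, 0.2, 1.0, 2.1]
r = pearson_r(manual_rff, automated_rff)
```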
Conclusions: Three automated pipelines, especially aRFF-B, enabled time-efficient RFF analysis of large continuous speech data sets without manual intervention. This advancement can facilitate large-scale studies using RFF applied to continuous speech, potentially expanding its application in voice research and clinical practice.
About the journal:
The Journal of Voice is widely regarded as the world's premier journal for voice medicine and research. This peer-reviewed publication is listed in Index Medicus and is indexed by the Institute for Scientific Information. The journal contains articles written by experts throughout the world on all topics in voice sciences, voice medicine and surgery, and speech-language pathologists' management of voice-related problems. The journal includes clinical articles, clinical research, and laboratory research. Members of the Foundation receive the journal as a benefit of membership.