Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning.

IF 1.1 | Zone 4 (Medicine) | JCR Q3 | AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY
Ahmed M Yousef, Adrián Castillo-Allendes, Mark L Berardi, Juliana Codino, Adam D Rubin, Eric J Hunter
DOI: 10.1159/000544852 (https://doi.org/10.1159/000544852)
Journal: Folia Phoniatrica et Logopaedica, pages 1-15
Publication date: 2025-02-21
Publication type: Journal Article
Source platform: Semanticscholar
Citations: 0

Abstract

Screening Voice Disorders: Acoustic Voice Quality Index, Cepstral Peak Prominence, and Machine Learning.

Introduction: The Acoustic Voice Quality Index (AVQI) and Smoothed Cepstral Peak Prominence (CPPs) have been reported to effectively support the assessment of voice quality in persons seeking voice care across many languages. This study aimed to evaluate the diagnostic accuracy of these two measures in detecting voice disorders in American English speakers, comparing their performance to machine learning (ML) models.

Methods: This retrospective study included a cohort of 187 participants: 138 patients with clinically diagnosed voice disorders and 49 vocally healthy individuals. Each participant completed two voicing tasks, sustaining the vowel [a:] and producing a running speech sample, and the two recordings were then concatenated. The concatenated samples were analyzed with VOXplot software to obtain AVQI-3 (version 03.01) and CPPs. Additionally, four ML models (random forest, k-nearest neighbors, support vector machine, and decision tree) were trained for comparison. The diagnostic accuracy of the two measures and the models was assessed using several evaluation metrics, including the receiver operating characteristic (ROC) curve and the Youden index.
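
The abstract does not describe how the cutoffs were computed beyond naming the ROC curve and the Youden index. The following is a minimal sketch of the general technique, not the authors' pipeline: it scans candidate cutoffs and keeps the one that maximizes J = sensitivity + specificity - 1. The function names and the toy scores are hypothetical and are not data from the study.

```python
from typing import List, Tuple


def sens_spec(scores: List[float], labels: List[int], cutoff: float,
              higher_is_disordered: bool = True) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a single cutoff.

    labels: 1 = voice disorder, 0 = vocally healthy.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        if higher_is_disordered:
            flagged = score >= cutoff
        else:
            flagged = score <= cutoff
        if label == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_cutoff(scores: List[float], labels: List[int],
                  higher_is_disordered: bool = True
                  ) -> Tuple[float, float, float, float]:
    """Scan every observed score as a cutoff and return
    (cutoff, J, sensitivity, specificity) for the largest Youden J."""
    best = None
    for cutoff in sorted(set(scores)):
        sens, spec = sens_spec(scores, labels, cutoff, higher_is_disordered)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, sens, spec)
    return best


# Hypothetical index values on an AVQI-like scale (higher = poorer voice).
toy_scores = [0.8, 1.0, 1.2, 1.6, 2.1,   # vocally healthy
              1.1, 1.9, 2.0, 3.0, 3.5]   # voice disorder
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

cutoff, j, sens, spec = youden_cutoff(toy_scores, toy_labels)
print(f"cutoff={cutoff}, J={j:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

For a measure such as CPPs, where lower values are generally taken to indicate poorer voice quality, the same sketch would be called with `higher_is_disordered=False`.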

Results: Cutoff scores of 1.54 for AVQI-3 (55% sensitivity, 80% specificity) and 14.35 dB for CPPs (65% sensitivity, 78% specificity) were identified for detecting voice disorders. Compared with an average ML sensitivity of 89% and specificity of 55%, CPPs offered a better balance between sensitivity and specificity, outperforming AVQI-3 and nearly matching the average ML performance.
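
One way to read the "balance" claim is to put the three reported sensitivity and specificity pairs on a common scale. The sketch below uses only the percentages quoted in the Results; the choice of summary measures (Youden J, balanced accuracy, and the absolute gap between sensitivity and specificity) is an assumption made here for illustration, and the abstract does not say which summary the authors used.

```python
# Sensitivity and specificity as reported in the abstract (rounded percentages).
reported = {
    "AVQI-3 (cutoff 1.54)": (0.55, 0.80),
    "CPPs (cutoff 14.35 dB)": (0.65, 0.78),
    "ML average": (0.89, 0.55),
}


def youden_j(sens: float, spec: float) -> float:
    """Youden index: J = sensitivity + specificity - 1."""
    return sens + spec - 1.0


def balanced_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity."""
    return (sens + spec) / 2.0


def sens_spec_gap(sens: float, spec: float) -> float:
    """Absolute difference; smaller means a more even trade-off."""
    return abs(sens - spec)


summary = {
    name: {
        "youden_j": round(youden_j(sens, spec), 2),
        "balanced_accuracy": round(balanced_accuracy(sens, spec), 3),
        "gap": round(sens_spec_gap(sens, spec), 2),
    }
    for name, (sens, spec) in reported.items()
}

for name, row in summary.items():
    print(name, row)
```

Under these assumed summaries, CPPs has the smallest sensitivity-specificity gap (0.13) and a Youden J of 0.43, against 0.35 for AVQI-3 and 0.44 for the ML average, which is consistent with the wording of the Results. Because the inputs are rounded percentages, the second decimal place should not be over-interpreted.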

Conclusions: ML shows great potential for supporting voice disorder diagnostics, especially as models become more generalizable and easier to interpret. However, current tools like AVQI-3 and CPPs remain more practical and accessible for clinical use in evaluating voice quality than commonly implemented models. CPPs, in particular, offers distinct advantages for identifying voice disorders, making it a recommended and feasible choice for clinics with limited resources.

Source journal
Folia Phoniatrica et Logopaedica (AUDIOLOGY & SPEECH-LANGUAGE PATHOLOGY; OTORHINOLARYNGOLOGY)
CiteScore: 2.30
Self-citation rate: 10.00%
Articles published: 28
Review time: >12 weeks
About the journal: Published since 1947, "Folia Phoniatrica et Logopaedica" provides a forum for international research on the anatomy, physiology, and pathology of structures of the speech, language, and hearing mechanisms. Original papers published in this journal report new findings on basic function, assessment, management, and test development in communication sciences and disorders, as well as experiments designed to test specific theories of speech, language, and hearing function. Review papers of high quality are also welcomed.