Poster: Vggish Embeddings Based Audio Classifiers to Improve Parkinson's Disease Diagnosis

Sruthi Kurada, Abhinav Kurada
Published in: 2020 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
Publication date: 2020-12-01
DOI: 10.1145/3384420.3431775 (https://doi.org/10.1145/3384420.3431775)
Citation count: 2

Abstract

The absence of highly predictive and readily applicable biomarkers for Parkinson's disease (PD) significantly hinders diagnosis and subsequent monitoring of the condition. However, because up to 90% of PD patients exhibit speech aberrations, patient voice has shown significant promise as a rapid diagnostic measure. Past research on voice-based automated diagnostic tools has relied on expert handcrafted audio feature sets that capture patient articulation, phonation, and prosody. There is limited consensus on the ideal contents of a PD audio diagnostic feature set, and manually selected features may not fully exploit the predictive power of the underlying data. In this study, we demonstrate the benefit of employing VGGish embeddings, a more generalizable and higher-throughput feature extraction strategy, for voice-based PD diagnosis. Our top VGGish-based model achieved 87% accuracy in detecting PD and significantly outperformed models trained on multiple handcrafted feature sets, a mel-frequency cepstral coefficient (MFCC) set, and an ImageNet-pretrained convolutional neural network feature extraction strategy. VGGish models were also highly competitive with clinically determined UPDRS III–18 speech deterioration ratings for PD diagnosis. These results demonstrate the potential of VGGish embeddings for creating fast and accurate voice-based PD classification models.
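The abstract does not say how frame-level embeddings were aggregated or which classifier produced the 87% figure, so the following is only a minimal sketch of the general "embed, pool, classify" pattern it describes. It relies on one known property of VGGish (one 128-dimensional vector per roughly 0.96 s audio frame) and otherwise uses assumptions: mean pooling over frames, a nearest-centroid classifier as a stand-in, and the hypothetical helper names `mean_pool`, `fit_centroids`, `predict`, and `accuracy`. The toy vectors are placeholders, not study data.

```python
from statistics import fmean

# VGGish emits one 128-dimensional embedding per ~0.96 s audio frame.
EMBEDDING_DIM = 128


def mean_pool(frames):
    """Average a variable-length list of frame embeddings into one vector.

    Assumption: mean pooling is used here only as a simple way to get a
    fixed-length recording-level feature; the poster does not specify it.
    """
    if not frames:
        raise ValueError("recording has no frames")
    return [fmean(column) for column in zip(*frames)]


def fit_centroids(vectors, labels):
    """Compute one centroid per class (stand-in for the study's classifier)."""
    grouped = {}
    for vector, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: mean_pool(members) for label, members in grouped.items()}


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], vector))


def accuracy(predicted, actual):
    """Fraction of recordings whose predicted label matches the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    # Placeholder "recordings": constant-valued frames, NOT real embeddings.
    control = [[0.1] * EMBEDDING_DIM for _ in range(5)]  # 5 frames
    patient = [[0.9] * EMBEDDING_DIM for _ in range(8)]  # 8 frames
    features = [mean_pool(control), mean_pool(patient)]
    centroids = fit_centroids(features, ["HC", "PD"])
    guesses = [predict(centroids, f) for f in features]
    print(accuracy(guesses, ["HC", "PD"]))
```

In practice the frame embeddings would come from a pretrained VGGish model applied to each voice recording, and accuracy would be measured on held-out recordings rather than on the training examples as in this toy run.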