Authors: Sruthi Kurada, Abhinav Kurada
Venue: 2020 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
Published: 2020-12-01
DOI: https://doi.org/10.1145/3384420.3431775
Poster: VGGish Embeddings Based Audio Classifiers to Improve Parkinson's Disease Diagnosis
The absence of highly predictive and readily applicable biomarkers for Parkinson's disease (PD) significantly hinders diagnosis and subsequent monitoring of the condition. Because up to 90% of PD patients exhibit speech aberrations, patient voice has shown significant promise as a rapid diagnostic measure. Past research on voice-based automated diagnostic tools has relied on expert-handcrafted audio feature sets that capture patient articulation, phonation, and prosody. There is limited consensus on the ideal contents of a PD audio diagnostic feature set, and manually selected features may not fully exploit the predictive power of the underlying data. In this study, we demonstrate the benefit of employing VGGish embeddings, a more generalizable and higher-throughput feature extraction strategy, for voice-based PD diagnosis. Our top VGGish-based model achieved 87% accuracy for detecting PD and significantly outperformed models trained on multiple handcrafted feature sets, a mel-frequency cepstral coefficient (MFCC) set, and features extracted by an ImageNet-pretrained convolutional neural network. VGGish models were also highly competitive with clinically determined UPDRS III–18 speech deterioration ratings for PD diagnosis. These results demonstrate the potential of VGGish embeddings for creating fast and accurate voice-based PD classification models.
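The abstract does not say how per-frame embeddings were aggregated or which classifier was used, so the following is only a minimal sketch of the general embed-then-classify idea, not the authors' pipeline. It assumes the publicly released VGGish model's output shape (one 128-dimensional vector per roughly 0.96 s of audio), replaces real embeddings with synthetic random frames, mean-pools them into one vector per recording, and uses a nearest-centroid classifier purely as a stand-in. All function names and the class offsets are hypothetical.

```python
import random
from statistics import fmean

# Assumption: the public VGGish release emits one 128-D vector per ~0.96 s of audio.
EMBEDDING_DIM = 128


def mean_pool(frames):
    """Average a list of equal-length vectors into a single vector."""
    if not frames:
        raise ValueError("need at least one vector")
    return [fmean(column) for column in zip(*frames)]


def fit_centroids(vectors, labels):
    """Compute one mean vector (centroid) per class label."""
    grouped = {}
    for vector, label in zip(vectors, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: mean_pool(members) for label, members in grouped.items()}


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], vector))


def synthetic_recording(rng, offset, n_frames=10):
    """Hypothetical placeholder for real VGGish output: Gaussian frames around an offset."""
    return [[rng.gauss(offset, 1.0) for _ in range(EMBEDDING_DIM)]
            for _ in range(n_frames)]


def make_dataset(rng, n_per_class):
    """Build pooled recording vectors for two synthetic classes (offsets are made up)."""
    vectors, labels = [], []
    for label, offset in (("control", 0.0), ("PD", 0.5)):
        for _ in range(n_per_class):
            vectors.append(mean_pool(synthetic_recording(rng, offset)))
            labels.append(label)
    return vectors, labels


rng = random.Random(0)
train_x, train_y = make_dataset(rng, 20)
test_x, test_y = make_dataset(rng, 10)

centroids = fit_centroids(train_x, train_y)
predictions = [predict(centroids, vector) for vector in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here reflects only the artificial separation built into the synthetic data and says nothing about the 87% figure reported above. With real recordings, `synthetic_recording` would be replaced by the actual embedding extractor and the centroid classifier by whichever model is being evaluated.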