Speech task based automatic classification of ALS and Parkinson’s Disease and their severity using log Mel spectrograms

BN Suhas, Jhansi Mallela, Aravind Illa, B. Yamini, A. Nalini, R. Yadav, D. Gope, P. Ghosh
Published in: 2020 International Conference on Signal Processing and Communications (SPCOM), 2020-07-01
DOI: 10.1109/SPCOM50965.2020.9179503 (https://doi.org/10.1109/SPCOM50965.2020.9179503)
Citations: 19

Abstract

We consider the task of speech-based classification of patients with amyotrophic lateral sclerosis (ALS), patients with Parkinson’s disease (PD), and healthy controls (HC). Recent work on convolutional neural networks (CNNs) for image classification raises the possibility of using spectral representations of speech to detect neurological diseases. In this paper, a spectrogram-based approach is used. Feeding overlapping windows to the CNN ensures that temporal aspects are considered by using short signal segments or wide analysis filters. A three-class (ALS, PD, or HC) dysarthria classification is performed. In addition, we perform two severity classification experiments, for ALS (5 classes) and PD (3 classes) respectively. Experiments are conducted on both baseline MFCC data [1] and log Mel spectrograms. Classification results show that, for several audio lengths, models trained on log Mel spectrograms consistently outperform those trained on MFCCs. The ability of the network to accurately classify different classes is evaluated via the area under the receiver operating characteristic curve [2],[3]. The findings from this study could aid in better detection and monitoring of ALS and PD.
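The abstract names two mechanics without giving details: slicing a spectrogram into overlapping windows for the CNN, and scoring each class by the area under the ROC curve. The sketch below illustrates both in plain Python. The window length, hop size, one-vs-rest scoring scheme, and all function names are assumptions made for illustration; the abstract does not state the values or procedure the authors used.

```python
from typing import List, Sequence


def overlapping_windows(frames: Sequence[Sequence[float]],
                        win_len: int, hop: int) -> List[List[Sequence[float]]]:
    """Slice a (time x mel-band) spectrogram into fixed-length windows.

    Consecutive windows share win_len - hop frames when hop < win_len.
    Trailing frames that do not fill a whole window are dropped.
    win_len and hop are hypothetical parameters, not values from the paper.
    """
    if win_len <= 0 or hop <= 0:
        raise ValueError("win_len and hop must be positive")
    return [list(frames[start:start + win_len])
            for start in range(0, len(frames) - win_len + 1, hop)]


def binary_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Returns the fraction of (positive, negative) pairs in which the positive
    example has the higher score; tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(class_scores: Sequence[Sequence[float]],
                    labels: Sequence[int], target: int) -> float:
    """AUC for one class against all others (an assumed multi-class scheme).

    class_scores[i][k] is the model's score for class k on sample i.
    """
    scores = [row[target] for row in class_scores]
    binary = [1 if y == target else 0 for y in labels]
    return binary_auc(scores, binary)


if __name__ == "__main__":
    # Ten dummy frames with a single "mel band" each.
    spectrogram = [[float(t)] for t in range(10)]
    windows = overlapping_windows(spectrogram, win_len=4, hop=2)
    print(len(windows), "windows; first frame of second window:", windows[1][0])

    # Toy 3-class scores (columns could stand for ALS, PD, HC).
    toy_scores = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1],
                  [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]]
    toy_labels = [0, 0, 1, 2]
    print("AUC for class 0:", one_vs_rest_auc(toy_scores, toy_labels, 0))
```

With `hop` smaller than `win_len`, each frame appears in more than one window, which is one way a network that only sees short segments can still cover the temporal context across segment boundaries.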