BN Suhas, Jhansi Mallela, Aravind Illa, B. Yamini, A. Nalini, R. Yadav, D. Gope, P. Ghosh
2020 International Conference on Signal Processing and Communications (SPCOM), published 2020-07-01. DOI: 10.1109/SPCOM50965.2020.9179503 (https://doi.org/10.1109/SPCOM50965.2020.9179503)
Speech task based automatic classification of ALS and Parkinson’s Disease and their severity using log Mel spectrograms
We consider the task of speech-based classification of patients with amyotrophic lateral sclerosis (ALS), patients with Parkinson’s disease (PD), and healthy controls (HC). Recent work on convolutional neural networks (CNNs) for image classification raises the possibility of using spectral representations of speech to detect neurological diseases. In this paper, a spectrogram-based approach is used. Feeding overlapping windows to the CNN ensures that temporal aspects are taken into account, using short signal segments or wide analysis filters. A three-class (ALS, PD or HC) dysarthria classification is performed. In addition, we perform two severity classification experiments, one for ALS (5 classes) and one for PD (3 classes). Experiments are conducted on both baseline MFCC data [1] and log Mel spectrograms. Classification results show that, for several audio lengths, models trained on log Mel spectrograms consistently outperform those trained on MFCCs. The ability of the network to accurately distinguish the classes is evaluated via the area under the receiver operating characteristic curve [2],[3]. The findings from this study could aid in better detection and monitoring of ALS and PD.
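To make the input representation concrete, the following is a minimal sketch of computing a log Mel spectrogram and cutting it into overlapping windows of the kind that could be fed to a CNN. It is not the authors' implementation: the abstract does not give the sampling rate, frame size, hop, number of Mel bands, or window length, so the values below (16 kHz, 25 ms frames, 10 ms hop, 40 bands, 100-frame windows with 50% overlap) and all function names are assumptions chosen for illustration. Only NumPy is used.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style Hz -> Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with centres spaced evenly on the Mel scale.
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=400, hop=160, n_mels=40):
    # Assumed settings: 25 ms frames and 10 ms hop at 16 kHz (hypothetical).
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    # Small constant avoids log(0); output shape is (n_mels, n_frames).
    return np.log(mel + 1e-10).T

def overlapping_windows(spec, win_frames=100, step_frames=50):
    # Slice the spectrogram along time into fixed-size, overlapping segments.
    starts = range(0, spec.shape[1] - win_frames + 1, step_frames)
    return np.stack([spec[:, s:s + win_frames] for s in starts])

# Usage on a synthetic 2-second signal (stand-in for a speech recording).
rng = np.random.default_rng(0)
audio = rng.standard_normal(2 * 16000)
spec = log_mel_spectrogram(audio)
windows = overlapping_windows(spec)
```

Each slice `windows[k]` is a fixed-size time-frequency patch that a CNN could treat like a single-channel image; window predictions for one recording would then need to be aggregated, and how that is done is not stated in the abstract.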