CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC

Q3 Economics, Econometrics and Finance
N. Boualoulou, T. BELHOUSSINE DRISSI, B. Nsiri
{"title":"CNN AND LSTM FOR THE CLASSIFICATION OF PARKINSON'S DISEASE BASED ON THE GTCC AND MFCC","authors":"N. Boualoulou, T. BELHOUSSINE DRISSI, B. Nsiri","doi":"10.35784/acs-2023-11","DOIUrl":null,"url":null,"abstract":"Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on tele diagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that are based on CNN and LSTM and uses two categories of characteristics Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC) obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first step, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second step, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third step by feeding these features into the LSTM and CNN models, which are designed to define sequential information from the extracted features. The experiments are performed using PC-GITA and Sakar datasets and 10-fold cross validation method, the highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN, and for the PC-GITA dataset, the accuracy is reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that the characteristics of GTCC are more appropriate and accurate for the assessment of PD than MFCC.","PeriodicalId":36379,"journal":{"name":"Applied Computer Science","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.35784/acs-2023-11","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Economics, Econometrics and Finance","Score":null,"Total":0}
引用次数: 0

Abstract

Parkinson's disease is a recognizable clinical syndrome with a variety of causes and clinical presentations; it represents a rapidly growing neurodegenerative disorder. Since about 90 percent of Parkinson's disease sufferers have some form of early speech impairment, recent studies on tele diagnosis of Parkinson's disease have focused on the recognition of voice impairments from vowel phonations or the subjects' discourse. In this paper, we present a new approach for Parkinson's disease detection from speech sounds that are based on CNN and LSTM and uses two categories of characteristics Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Cepstral Coefficients (GTCC) obtained from noise-removed speech signals with comparative EMD-DWT and DWT-EMD analysis. The proposed model is divided into three stages. In the first step, noise is removed from the signals using the EMD-DWT and DWT-EMD methods. In the second step, the GTCC and MFCC are extracted from the enhanced audio signals. The classification process is carried out in the third step by feeding these features into the LSTM and CNN models, which are designed to define sequential information from the extracted features. The experiments are performed using PC-GITA and Sakar datasets and 10-fold cross validation method, the highest classification accuracy for the Sakar dataset reached 100% for both EMD-DWT-GTCC-CNN and DWT-EMD-GTCC-CNN, and for the PC-GITA dataset, the accuracy is reached 100% for EMD-DWT-GTCC-CNN and 96.55% for DWT-EMD-GTCC-CNN. The results of this study indicate that the characteristics of GTCC are more appropriate and accurate for the assessment of PD than MFCC.
CNN和LSTM在GTCC和MFCC基础上对帕金森病进行分类
帕金森病是一种可识别的临床综合征,有多种病因和临床表现;它代表了一种快速发展的神经退行性疾病。由于大约90%的帕金森病患者有某种形式的早期言语障碍,最近关于帕金森病远程诊断的研究集中在从元音发音或受试者的话语中识别语音障碍。在本文中,我们提出了一种从语音中检测帕金森氏症的新方法,该方法基于CNN和LSTM,并使用从去除噪声的语音信号中获得的两类特征Mel频率倒谱系数(MFCC)和Gammatone倒谱系数(GTCC),并对EMD-DWT和DWT-EMD进行比较分析。所提出的模型分为三个阶段。在第一步中,使用EMD-DWT和DWT-EMD方法从信号中去除噪声。在第二步骤中,从增强的音频信号中提取GTCC和MFCC。在第三步中,通过将这些特征输入LSTM和CNN模型来执行分类过程,LSTM和NN模型旨在从提取的特征中定义序列信息。实验使用PC-GITA和Sakar数据集和10倍交叉验证方法进行,对于EMD-DWT-EMD-GTC-CNN和DWT-EMD-GTC-CNN,Sakar的最高分类准确率均达到100%,对于PC-GITA数据集,EMD-DWT-GTC-CNN的准确率分别达到100%和96.55%。本研究结果表明,GTCC的特征比MFCC更适合和准确地评估PD。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Applied Computer Science
Applied Computer Science Engineering-Industrial and Manufacturing Engineering
CiteScore
1.50
自引率
0.00%
发文量
0
审稿时长
8 weeks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信