Voice pathology detection using convolutional neural networks with electroglottographic (EGG) and speech signals

Rumana Islam, Esam Abdel-Raheem, Mohammed Tarique
Computer Methods and Programs in Biomedicine Update, vol. 2, Article 100074, 2022. DOI: 10.1016/j.cmpbup.2022.100074
https://www.sciencedirect.com/science/article/pii/S2666990022000258
Citations: 8

Abstract

This paper presents a convolutional neural network (CNN) based automated, noninvasive voice pathology detection system. The proposed system works in two steps: it first discriminates pathological voices from healthy ones, and then classifies the pathological voices into one of three pathologies. Two CNNs are used for these purposes: one works as a binary classifier to identify pathological voices, and the other works as a multiclass classifier to categorize the voice pathologies. This work investigates the effectiveness of electroglottographic (EGG) and speech signals for detecting and classifying pathological voices using sustained vowel ('/a/') samples. EGG signals can be used to assess the vibratory pattern of the vocal folds during voiced sounds, whereas speech signals add spectral color to the EGG signals. Hence, their contributions to pathology identification and categorization differ, as demonstrated in this work. The Saarbrücken Voice Database (SVD) is used in this investigation. The results show that the proposed system identifies pathological voices more accurately with speech signals than with EGG signals (by more than 9%). However, categorizing pathological voices into pathology types is more accurate with EGG signals than with speech signals (by more than 12%). The system's performance with the two signals is compared in terms of clinical and statistical measures, and the results are also compared with those of other related published works.
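
The two-step flow described in the abstract can be sketched as a simple cascade. This is a minimal illustration only, not the authors' implementation: the function name `two_stage_predict`, the placeholder labels in `PATHOLOGY_LABELS`, the 0.5 decision threshold, and the toy stand-in models are all assumptions introduced here, since the abstract names neither the three pathologies nor the network details.

```python
from typing import Callable, Sequence

# Placeholder labels (assumption): the abstract does not name the three pathologies.
PATHOLOGY_LABELS = ("pathology_A", "pathology_B", "pathology_C")

Signal = Sequence[float]


def two_stage_predict(
    signal: Signal,
    detector: Callable[[Signal], float],
    categorizer: Callable[[Signal], Sequence[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> str:
    """Run the binary detector first; categorize only if it flags the voice."""
    # Step 1: binary decision, healthy vs. pathological.
    p_pathological = detector(signal)
    if p_pathological < threshold:
        return "healthy"

    # Step 2: multiclass decision among the three pathology types.
    scores = categorizer(signal)
    if len(scores) != len(PATHOLOGY_LABELS):
        raise ValueError("categorizer must return one score per pathology label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return PATHOLOGY_LABELS[best]


# Toy stand-ins for the two CNNs (hypothetical, for demonstration only).
def toy_detector(signal: Signal) -> float:
    # Mean absolute amplitude, clipped to [0, 1], used as a fake probability.
    return min(1.0, sum(abs(x) for x in signal) / len(signal))


def toy_categorizer(signal: Signal) -> Sequence[float]:
    # Fixed scores; a real model would compute these from the signal.
    return [0.1, 0.7, 0.2]


if __name__ == "__main__":
    print(two_stage_predict([0.0, 0.1], toy_detector, toy_categorizer))
    print(two_stage_predict([0.9, -0.8], toy_detector, toy_categorizer))
```

Because the second stage only sees inputs that the first stage flags, the same cascade can be run once with EGG-derived inputs and once with speech-derived inputs to compare the two signals stage by stage, which is the kind of comparison the abstract reports.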

Source journal: CiteScore 5.90; self-citation rate 0.00%; articles published 0; review time 10 weeks