Objective Evaluation of the Pathological Voice Based on Deep Learning Neural Networks in an Algerian hospital environment
Authors: Mahraz Kabache, M. Guerti
Journal: ENP Engineering Science Journal
Published: 2022-12-28
DOI: https://doi.org/10.53907/enpesj.v2i2.170
Citations: 0
Abstract
In this study, we propose a method based on Recurrent Neural Networks to objectively evaluate the rehabilitation of the pathological voice in an Algerian clinical environment. The voice pathology considered is Unilateral Laryngeal Paralysis. Pathological voices are detected with a deep learning system built on a Long Short-Term Memory (LSTM) neural model. Because the dysphonia studied here essentially affects laryngeal vibration, we select acoustic parameters that capture the instability of the frequency and amplitude of that vibration: jitter and shimmer, noise parameters, and Mel-Frequency Cepstral Coefficients (MFCCs). A pathological voice detection rate of 88.65% indicates that the rehabilitation technique adopted in the Algerian clinical setting produces significant results. Relying exclusively, and excessively, on listening to evaluate the effect of speech rehabilitation in the Algerian hospital environment remains insufficient. It is important to correlate perceptual data with objective detection and classification methods that use relevant acoustic parameters, so that vocal pathology can be assessed effectively and objectively.
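The abstract names jitter and shimmer as measures of frequency and amplitude instability but does not say which variants were computed. As a minimal sketch, assuming the common "local" definitions (mean absolute difference between consecutive glottal cycles, divided by the mean value), the two measures could be computed as follows. The function names and the sample measurements are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute difference between consecutive cycle values,
    divided by the mean value (assumed 'local' perturbation measure)."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def jitter_local(periods_s: Sequence[float]) -> float:
    """Cycle-to-cycle instability of the glottal period (frequency)."""
    return local_perturbation(periods_s)


def shimmer_local(peak_amplitudes: Sequence[float]) -> float:
    """Cycle-to-cycle instability of the peak amplitude."""
    return local_perturbation(peak_amplitudes)


if __name__ == "__main__":
    # Hypothetical measurements for five consecutive glottal cycles.
    periods = [0.0100, 0.0102, 0.0098, 0.0101, 0.0099]  # seconds
    amplitudes = [0.80, 0.78, 0.83, 0.79, 0.81]         # arbitrary units
    print(f"jitter (local):  {100 * jitter_local(periods):.2f} %")
    print(f"shimmer (local): {100 * shimmer_local(amplitudes):.2f} %")
```

In a pipeline like the one described, values such as these would be computed per recording and combined with the noise parameters and MFCCs to form the input to the LSTM classifier. How the authors frame, normalize, and combine the features is not stated in the abstract.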