An Intelligent System for the Diagnosis of Voice Pathology Based on Adversarial Pathological Response (APR) Net Deep Learning Model: An Intelligent System for the Diagnosis of Voice Pathology-Based Deep Learning
Authors: Vikas Mittal, R. Sharma
Journal: Int. J. Softw. Innov., published 2022-01-01
DOI: 10.4018/ijsi.312261 (https://doi.org/10.4018/ijsi.312261)
Citation count: 0
Abstract
This work investigates two glottal flow derivative-based image representations of the input voice signal, used with a deep learning model built on n-dilated (nD) inception layers, to assign optimal labels. The authors propose an nD inception layer-based adversarial pathological response (APR) net deep learning model. The model is trained on the two image databases separately in an adversarial manner, so that a common test image is applied to both networks. On the VOice ICar fEDerico II (VOICED) dataset, the mean accuracies are 96.82%, 96.36%, and 99.35% for the glottal inverse filtering with extended Kalman filter-Morse scalogram (GIFEKF-MS) APRNet, the glottal inverse filtering with extended Kalman filter-spectrogram (GIFEKF-S) APRNet, and the proposed APR fusion net, respectively. On the Saarbrucken voice database (SVD) dataset, the corresponding mean accuracies are 95.67%, 93.27%, and 99.04%.
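The abstract does not say how the outputs of the two networks are combined in the APR fusion net. As a rough illustration only, the sketch below assumes a simple score-level fusion: each network produces class logits for the same test input, the logits are converted to probabilities, and a weighted average decides the label. The function names, the `weight_ms` parameter, the two-class label set, and the example logits are all hypothetical and are not taken from the paper.

```python
import math

# Hypothetical two-class label set; the paper's actual classes may differ.
LABELS = ["healthy", "pathological"]


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(logits_ms, logits_s, weight_ms=0.5):
    """Assumed fusion rule: weighted average of the two probability vectors.

    logits_ms: scores from the scalogram-based network (GIFEKF-MS APRNet)
    logits_s:  scores from the spectrogram-based network (GIFEKF-S APRNet)
    weight_ms: hypothetical weight given to the scalogram branch
    """
    p_ms = softmax(logits_ms)
    p_s = softmax(logits_s)
    fused = [weight_ms * a + (1.0 - weight_ms) * b for a, b in zip(p_ms, p_s)]
    # Pick the class index with the highest fused probability.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Made-up logits for one test input, where the two branches disagree.
label, probs = fuse_predictions([0.2, 1.1], [1.5, 0.3])
print(LABELS[label], [round(p, 3) for p in probs])
```

Averaging probabilities is only one of several plausible fusion rules (majority voting or a learned fusion layer are others); it is used here because it is the simplest to state and check.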