Using ASR Posterior Probability and Acoustic Features for Voice Disorder Classification

Miklós Gábriel Tulics, György Szaszák, K. Mészáros, K. Vicsi
Published in: 2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom)
Publication date: 2020-09-23
DOI: 10.1109/CogInfoCom50765.2020.9237866
Citations: 5

Abstract

Dysphonia can be caused not only by frequent use of the voice but also by many other factors, including environmental noise, environmental pollution and a dry environment. It can serve as an indicator of several diseases, both serious and less serious. A system that models the cognitive decision-making processes of an expert would therefore be of great value in helping physicians make reliable and quick decisions when diagnosing dysphonia. This paper focuses on the front end of such a system. It evaluates acoustic features measured in different phonetic classes, together with ASR posterior probability values, in two classification schemes, an SVM and a DNN, for classifying healthy and disordered voices in Hungarian-speaking patients. When the two feature types are combined, classification accuracy increases to 89%. Although this is better than using only the ‘acoustic’ features as input to the DNN (88%), we did not find a significant impact of using ASR posterior probability values. Based on our results, we conclude that calculating ASR phone posteriors is not worthwhile: it has no significant impact on accuracy, yet it can greatly complicate and slow down a diagnosis support system.
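
The abstract does not say how the acoustic features and the ASR posterior values are combined, so the following is only a minimal sketch of one common option, early fusion by concatenation. The function names, the choice of the mean as a summary statistic, and all numeric values are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean


def summarize_posteriors(frame_posteriors):
    """Collapse per-frame ASR posterior probabilities into a single mean value.

    Assumption: one posterior in [0, 1] per frame; the paper may summarize
    posteriors differently (for example per phonetic class).
    """
    if not frame_posteriors:
        raise ValueError("need at least one frame posterior")
    if any(p < 0.0 or p > 1.0 for p in frame_posteriors):
        raise ValueError("posteriors must lie in [0, 1]")
    return mean(frame_posteriors)


def fuse_features(acoustic_features, frame_posteriors):
    """Early fusion: append the posterior summary to the acoustic feature vector."""
    return list(acoustic_features) + [summarize_posteriors(frame_posteriors)]


# Hypothetical values for one speaker (illustrative only, not from the paper).
acoustic = [0.012, 0.045, 18.3]      # stand-ins for three acoustic measurements
posteriors = [0.5, 0.75, 1.0, 0.75]  # stand-ins for per-frame phone posteriors
fused = fuse_features(acoustic, posteriors)
print(fused)
```

The fused vector would then be the input to either classifier (SVM or DNN). Under this scheme the acoustic-only baseline is simply the same vector without its last element, which makes the two input configurations compared in the abstract easy to evaluate side by side.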