COVID-19 Disease Classification by Cough Records Analysis using Machine Learning

Kien Trang, Hoang An Nguyen, Long TonThat, Hung Ngoc Do, B. Vuong
Published in: 2022 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom)
Publication date: 2022-06-16
DOI: 10.1109/CyberneticsCom55287.2022.9865610

Abstract

The rapid spread of coronavirus disease 2019 (COVID-19) has resulted in more than 6.2 million deaths. Furthermore, patients infected with the latest Omicron variant show mild to almost no symptoms. Thus, a new diagnostic method alongside Reverse Transcription-Polymerase Chain Reaction (RT-PCR) has become a key requirement for successfully detecting infected cases. In this research, KNN, Ensemble and SincNet models are implemented as the main models for diagnostic classification based on cough sound recordings of infected patients. After pre-processing to remove silent ranges from the audio recordings, the cough sounds are augmented and then separated into single-cough samples, and three testing scenarios are generated to deal with the imbalance between the sample classes. Afterward, Mel-frequency information and the Mel spectrogram are extracted as the main features for distinguishing patients with COVID-19 from healthy cases. The AICV115M dataset, consisting of the two classes COVID-19 and NonCOVID-19, is used for performance evaluation. The highest recorded accuracies of the KNN, Ensemble and SincNet models are 92.49%, 90.1% and 85.15%, respectively.
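The following is a minimal sketch of two stages named in the abstract, silence removal with separation into single-cough segments and KNN classification. It is not the authors' implementation. The function names, the frame length, the energy threshold and the two-dimensional toy feature vectors are all assumptions made for illustration, and the Mel-frequency feature extraction is omitted.

```python
import math
from collections import Counter


def frame_energies(signal, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in signal[i:i + frame_len]) / frame_len
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]


def split_on_silence(signal, frame_len=4, threshold=0.01):
    """Return (start, end) sample ranges of runs of frames above the threshold.

    Frame length and threshold are illustrative values, not taken from the paper.
    """
    segments, start = [], None
    energies = frame_energies(signal, frame_len)
    for idx, energy in enumerate(energies):
        if energy >= threshold and start is None:
            start = idx * frame_len               # a sound segment begins
        elif energy < threshold and start is not None:
            segments.append((start, idx * frame_len))  # the segment ends
            start = None
    if start is not None:                          # sound runs to the last frame
        segments.append((start, len(energies) * frame_len))
    return segments


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy recording: silence, a loud burst, silence, a quieter burst, silence.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8 + [0.3, -0.3] * 2 + [0.0] * 4
segments = split_on_silence(signal)

# Toy feature vectors standing in for Mel-based features (hypothetical values).
train_x = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]]
train_y = ["NonCOVID-19", "NonCOVID-19", "COVID-19", "COVID-19"]
label = knn_predict(train_x, train_y, [0.95, 1.0], k=3)
```

With real recordings, each segment returned by `split_on_silence` would be converted to Mel-based features before being passed to the classifier. The class-imbalance scenarios and the augmentation step described in the abstract are not covered by this sketch.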