COVID-19 Disease Classification by Cough Records Analysis using Machine Learning

2022 IEEE International Conference on Cybernetics and Computational Intelligence (CyberneticsCom). Pub Date: 2022-06-16. DOI: 10.1109/CyberneticsCom55287.2022.9865610

Kien Trang, Hoang An Nguyen, Long TonThat, Hung Ngoc Do, B. Vuong



Abstract

The rapid spread of coronavirus disease 2019 (COVID-19) has resulted in more than 6.2 million deaths. Furthermore, patients infected with the latest Omicron variant show mild to almost no symptoms. Thus, the need for a new diagnostic method besides Reverse Transcription-Polymerase Chain Reaction (RT-PCR) has become the most important step toward successfully detecting infected cases. In this research, KNN, Ensemble and SincNet models are implemented as the main models for classification and diagnosis based on cough sound recordings of infected patients. After pre-processing steps that remove silent ranges from the audio recordings, the cough sounds are augmented and then separated into single cough samples, and three testing scenarios are generated to deal with the imbalance between the sample classes. Afterward, Mel-frequency information and Mel spectrograms are extracted as the main features for distinguishing patients with COVID-19 from healthy cases. The AICV115M dataset, consisting of two classes, COVID-19 and NonCOVID-19, is used for performance evaluation. The highest recorded accuracies of the KNN, Ensemble and SincNet models are 92.49%, 90.1% and 85.15%, respectively.
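The abstract does not give implementation details, so the following is only a minimal sketch of two of the steps it names: removing silent ranges so that a recording splits into single cough segments, and classifying a feature vector with KNN. The short-time energy rule, the function names `split_on_silence` and `knn_predict`, the frame length, the threshold, and the toy feature values are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def split_on_silence(samples, frame_len=4, threshold=0.01):
    """Return (start, end) sample indices of runs of non-silent frames.

    A frame counts as non-silent when its mean squared amplitude exceeds
    `threshold`. Both parameters are illustrative assumptions.
    """
    segments = []
    start = None
    n_frames = len(samples) // frame_len
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy > threshold:
            if start is None:
                start = i * frame_len  # a new segment begins here
        elif start is not None:
            segments.append((start, i * frame_len))  # segment ends at silence
            start = None
    if start is not None:
        segments.append((start, n_frames * frame_len))
    return segments


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to `query`
    (Euclidean distance)."""
    neighbours = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Toy signal: silence, one short burst, silence, a longer burst, silence.
signal = (
    [0.0] * 8
    + [0.5, -0.4, 0.6, -0.5]
    + [0.0] * 8
    + [0.3, -0.3, 0.4, -0.2, 0.5, -0.4, 0.3, -0.3]
    + [0.0] * 4
)
segments = split_on_silence(signal)
print(segments)  # [(8, 12), (20, 28)]

# Made-up 2-D feature vectors standing in for Mel-based features.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]]
train_y = ["NonCOVID-19", "NonCOVID-19", "COVID-19", "COVID-19"]
label = knn_predict(train_x, train_y, [0.85, 0.85], k=3)
print(label)
```

In a real pipeline, each segment returned by `split_on_silence` would first be converted to Mel-based features, and those vectors, rather than the toy values above, would be passed to the classifier.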


