Topological data analysis and machine learning for COVID-19 detection in CT scan lung images.

Rabih Assaf, Abbas Rammal, Alban Goupil, Mohammad Kacim, Valeriu Vrabie
BMC Biomedical Engineering, 7(1), 4. Published 2025-04-02. DOI: https://doi.org/10.1186/s42490-025-00089-1. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11963280/pdf/

Abstract

COVID-19 has claimed the lives of thousands over the past years. Although pathogenic laboratory testing is the established diagnostic standard, it suffers from a notable rate of false negatives, so alternative diagnostic approaches are urgently needed. To address the need for accurate, parameter-free methods for identifying COVID-19 in lung images, we introduce an approach that combines topological data analysis with machine learning. The method extracts persistent homology features from lung images, capturing the intrinsic topological properties of the data, and uses these features as inputs to several machine learning classifiers. Our primary objective is to detect COVID-19 with high accuracy while demonstrating the effectiveness of these topological features. The experimental results show that the Random Forest classifier and the Support Vector Machine (SVM) outperform the other models in classifying CT scan lung images: the Random Forest model reaches an accuracy of 97.5%, and the SVM achieves an AUC above 0.99. Two further experiments, one applying the model to the same data with the topological features excluded and one applying the same model with topological features to other data, demonstrate the effectiveness of these features in the classification task.
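To make the idea of persistent homology features concrete, here is a minimal Python sketch of one possible feature extractor. It is an illustration only, not the authors' pipeline: the abstract does not state which filtration, homology dimensions, or vectorization were used. The sketch assumes a sublevel-set filtration of a grayscale image with 4-connectivity, computes only 0-dimensional persistence (connected components) with a union-find structure, and summarizes the diagram with four simple statistics. The function names `sublevel_persistence_0d` and `persistence_features` are hypothetical.

```python
def sublevel_persistence_0d(image):
    """Finite (birth, death) pairs of 0-dimensional persistent homology.

    `image` is a 2D list of gray values. Pixels enter the filtration in
    order of increasing value (sublevel sets), with 4-connectivity.
    The component born at the global minimum never dies and is omitted.
    """
    rows, cols = len(image), len(image[0])
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already in the filtration
    birth = {}   # birth value of each component, stored at its root

    def find(x):
        # Find the root of x, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbor = (r + dr, c + dc)
            if neighbor not in parent:
                continue  # outside the image or not yet in the filtration
            a, b = find((r, c)), find(neighbor)
            if a == b:
                continue
            # Elder rule: the component with the later birth dies here.
            young, old = (a, b) if birth[a] > birth[b] else (b, a)
            if value > birth[young]:
                pairs.append((birth[young], value))  # skip zero-length bars
            parent[young] = old
    return pairs


def persistence_features(pairs):
    """Summarize a diagram as [count, total, max, mean] of bar lifetimes."""
    lifetimes = [death - birth_value for birth_value, death in pairs]
    if not lifetimes:
        return [0, 0.0, 0.0, 0.0]
    total = float(sum(lifetimes))
    return [len(lifetimes), total, float(max(lifetimes)), total / len(lifetimes)]


if __name__ == "__main__":
    # Toy image: four dark corners (local minima) separated by bright pixels.
    toy = [
        [0, 5, 1],
        [5, 5, 5],
        [2, 5, 3],
    ]
    diagram = sublevel_persistence_0d(toy)
    print(sorted(diagram))
    print(persistence_features(diagram))
```

In a full workflow, one such feature vector would be computed per CT image and the resulting table passed to a classifier such as a Random Forest or an SVM. A practical implementation would more likely rely on a dedicated TDA library and also include higher-dimensional features (loops), which this sketch leaves out.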
