Detecting soil-transmitted helminth and Schistosoma mansoni eggs in Kato-Katz stool smear microscopy images: A comprehensive in- and out-of-distribution evaluation of YOLOv7 variants.
Mohammed Aliy Mohammed, Esla Timothy Anzaku, Peter Kenneth Ward, Bruno Levecke, Janarthanan Krishnamoorthy, Wesley De Neve, Sofie Van Hoecke
PLoS Neglected Tropical Diseases, 19(7): e0013234. Published 2025-07-03. DOI: 10.1371/journal.pntd.0013234 (https://doi.org/10.1371/journal.pntd.0013234). Impact factor: 3.4; JCR Q1, Parasitology.
Abstract
Background: Soil-transmitted helminth (STH) and Schistosoma mansoni (S. mansoni) infections remain significant public health concerns in tropical and subtropical regions. Deep Convolutional Neural Networks (DCNNs) have already shown promising accuracy in identifying STH and S. mansoni eggs in in-distribution (ID) settings, where test images come from the same distribution as the training images. However, their performance in real-world, out-of-distribution (OOD) scenarios, characterized by variations in image capture devices and the appearance of previously unseen egg types, has not been thoroughly investigated. Assessing the robustness of DCNNs under these challenging conditions is crucial for ensuring their reliability in field diagnostics.
Methodology: Our study addresses the gap in evaluating DCNNs for identifying STH and S. mansoni eggs by rigorously testing multiple variants of the You Only Look Once (YOLO) version 7 model under two OOD conditions: (i) a dataset shift due to a change in the image capture device, and (ii) a combination of this device change and the presence of two egg types not occurring during training. We adopted a 2 × 3 montage data augmentation strategy to enhance OOD generalization. Additionally, we used the Toolkit for Identifying object Detection Errors (TIDE) and Gradient-weighted Class Activation Mapping (Grad-CAM) to perform a comprehensive analysis of the results.
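The abstract does not describe how the montage augmentation is built, so the following is only a minimal sketch of one plausible reading: six equally sized images are tiled into a grid of 2 rows and 3 columns, and each bounding box is shifted by the offset of its tile. The function name `make_montage`, the box format, and the row/column orientation are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def make_montage(tiles, boxes_per_tile, rows=2, cols=3):
    """Tile rows*cols equally sized images into one montage (hypothetical helper).

    tiles: list of H x W (x C) arrays that all share one shape.
    boxes_per_tile: one list per tile of (class_id, x_min, y_min, x_max, y_max)
        tuples in pixel coordinates of that tile.
    Returns the montage and all boxes expressed in montage pixel coordinates.
    """
    if len(tiles) != rows * cols or len(boxes_per_tile) != rows * cols:
        raise ValueError("need exactly rows * cols tiles and box lists")
    if any(t.shape != tiles[0].shape for t in tiles):
        raise ValueError("all tiles must share one shape")

    h, w = tiles[0].shape[:2]
    # Keep any trailing channel dimension of the input tiles.
    out_shape = (rows * h, cols * w) + tiles[0].shape[2:]
    montage = np.zeros(out_shape, dtype=tiles[0].dtype)

    merged_boxes = []
    for idx, (tile, boxes) in enumerate(zip(tiles, boxes_per_tile)):
        r, c = divmod(idx, cols)       # grid position, filled row by row
        y0, x0 = r * h, c * w          # top-left corner of this tile
        montage[y0:y0 + h, x0:x0 + w] = tile
        for cls, x1, y1, x2, y2 in boxes:
            # Shift the box by the tile offset; its size is unchanged.
            merged_boxes.append((cls, x1 + x0, y1 + y0, x2 + x0, y2 + y0))
    return montage, merged_boxes
```

In a sketch like this, the montage shows the detector smaller objects relative to the frame and more objects per image, which is one reason such tiling might help under a device shift; whether that is the mechanism at work in the study is not stated in the abstract.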
Principal findings: In ID settings, YOLOv7-E6E outperformed other models, achieving an F1-score of 97.47%. For the OOD scenario involving only a change in the image capture device, the 2 × 3 montage strategy significantly enhanced performance, increasing precision by 8%, recall by 14.85%, and mAP@IoU0.5 by 21.36%. However, for the more complex OOD scenario that involves both a change in the capture device and the introduction of two previously unseen egg types, the proposed augmentation technique, while beneficial, did not fully address the generalization challenges across all YOLOv7 variants, highlighting the necessity of testing beyond ID scenarios, on which state-of-the-art models predominantly focus.
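For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall, and F1-score are commonly computed for single-class detections at an IoU threshold of 0.5. The greedy, score-ordered matching and the function names are assumptions chosen for illustration; the abstract does not specify the evaluation protocol the authors used.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(preds, truths, iou_thr=0.5):
    """Greedy single-class matching (an assumed protocol, not the paper's).

    preds: list of (score, box); truths: list of boxes.
    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = 0
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, iou_thr
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, truth)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp    # predictions with no matching ground truth
    fn = len(truths) - tp   # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1
```

F1 is the harmonic mean of precision and recall, so a model only scores highly when it both finds most eggs and raises few false alarms.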
Conclusions/significance: This study underscores the critical importance of utilizing comprehensive test sets and conducting rigorous OOD evaluations when designing machine learning solutions for STH, S. mansoni or any other helminth infections. Understanding the true capabilities of DCNNs in real-world settings depends on such thorough testing. Expanding AI-driven diagnostic assessments to account for the complexities encountered in the field is essential for creating robust tools that can significantly contribute to the global elimination of STH and S. mansoni infections as public health problems by 2030, a goal put forth by the World Health Organization.
About the journal:
PLOS Neglected Tropical Diseases publishes research devoted to the pathology, epidemiology, prevention, treatment and control of the neglected tropical diseases (NTDs), as well as relevant public policy.
The NTDs are defined as a group of poverty-promoting chronic infectious diseases, which primarily occur in rural areas and poor urban areas of low-income and middle-income countries. Their impact on child health and development, pregnancy, and worker productivity, as well as their stigmatizing features, limit economic stability.
All aspects of these diseases are considered, including:
Pathogenesis
Clinical features
Pharmacology and treatment
Diagnosis
Epidemiology
Vector biology
Vaccinology and prevention
Demographic, ecological and social determinants
Public health and policy aspects (including cost-effectiveness analyses).