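Enhancing robustness and generalization in microbiological few-shot detection through synthetic data generation and contrastive learning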

Impact factor: 7.0 | CAS zone 2 (Medicine) | JCR Q1 (Biology)
Nikolas Ebert, Didier Stricker, Oliver Wasenmüller
DOI: 10.1016/j.compbiomed.2025.110141
Journal: Computers in Biology and Medicine, vol. 191, Article 110141
Publication date: 2025-04-19
Publication type: Journal Article
Open access: no
Source: https://www.sciencedirect.com/science/article/pii/S0010482525004925
Citations: 0

Enhancing robustness and generalization in microbiological few-shot detection through synthetic data generation and contrastive learning
In many medical and pharmaceutical processes, continuous hygiene monitoring is crucial, often involving the manual detection of microorganisms in agar dishes by qualified personnel. Although deep learning methods hold promise for automating this task, they frequently face a shortage of training data, a prevalent challenge in colony detection. To overcome this limitation, we propose a novel pipeline that combines generative data augmentation with few-shot detection. Our approach aims to significantly enhance detection performance, even with very limited training data. A main component of our method is a diffusion-based generator model that inpaints synthetic bacterial colonies onto real agar plate backgrounds. This data augmentation technique enhances the diversity of training data, allowing for effective model training with only 25 real images. Our method outperforms common training techniques, demonstrating a +0.45 mAP improvement compared to training from scratch, and a +0.15 mAP advantage over the current state of the art (SOTA) in synthetic data augmentation. Additionally, we integrate a decoupled feature classification strategy, where class-agnostic detection is followed by lightweight classification via a feed-forward network, making it possible to detect and classify colonies with minimal examples. This approach achieves an AP50 score of 0.7 in a few-shot scenario on the AGAR dataset. Our method also demonstrates robustness to various image corruptions, such as noise and blur, supporting its applicability in real-world scenarios. By reducing the need for large labeled datasets, our pipeline offers a scalable, efficient solution for colony detection in hygiene monitoring and biomedical research, with potential for broader applications in fields where rapid detection of new colony types is required.
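The abstract reports results as AP50, that is, average precision where a detection counts as correct when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that metric for a single class on a single image, not the authors' evaluation code. The function names, the greedy matching against the best still-unmatched ground-truth box, and the all-point interpolation are assumptions chosen for illustration; benchmark toolkits differ in these details.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_50(
    detections: List[Tuple[float, Box]], ground_truth: List[Box]
) -> float:
    """AP at IoU >= 0.5 for one class (assumed all-point interpolation)."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []  # 1 = true positive, 0 = false positive, in descending score order
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Pick the best still-unmatched ground-truth box for this detection.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= 0.5:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)
    # Precision and recall after each ranked detection.
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Make precision non-increasing from right to left, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


if __name__ == "__main__":
    # Two made-up colonies and three made-up detections (score, box).
    colonies = [(0, 0, 10, 10), (20, 20, 30, 30)]
    predicted = [
        (0.9, (0, 0, 10, 10)),    # exact hit
        (0.8, (50, 50, 60, 60)),  # no overlap, false positive
        (0.7, (20, 20, 30, 31)),  # slightly too tall, still IoU >= 0.5
    ]
    print(round(average_precision_50(predicted, colonies), 4))
```

In this toy case one false positive is ranked between two correct detections, so the score lands below 1.0 even though every colony is eventually found. A dataset-level figure like the one quoted above would pool detections across all images and, for mAP, average over classes.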
Source journal: Computers in Biology and Medicine
Category: Engineering and Technology, Biomedical Engineering
CiteScore: 11.70
Self-citation rate: 10.40%
Articles published: 1086
Review time: 74 days
About the journal: Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.