Diffusion-augmented nematode dataset improves few-shot classification of nematodes

Impact factor 3.5; Zone 2 (Computer Science); JCR Q2 (Computer Science, Artificial Intelligence)
Ying Zhu, Pengjun Wang, Jiayan Zhuang, Jiangjian Xiao, Jianfeng Gu, Weilun Ren, Xiong Ouyang
Applied Intelligence, 55 (13). Published 2025-08-11. DOI: 10.1007/s10489-025-06783-w
https://link.springer.com/article/10.1007/s10489-025-06783-w
Citations: 0

Abstract

Accurate identification of plant-parasitic nematodes is essential for mitigating crop losses and maintaining agro-ecological balance. While deep learning offers a promising automated solution, it is hindered by data scarcity. To address this, the present study introduces the Diffusion-Augmented Nematode Dataset (DA-Nema), a novel offline data augmentation strategy that leverages diffusion models. DA-Nema combines semantic adaptation, morphological constraints, and style transfer to generate high-fidelity images, thereby enriching nematode datasets. Experimental results reveal that images generated using DA-Nema exhibit the lowest Fréchet Inception Distance (FID) scores, indicating superior visual realism and close alignment with the original data distribution. Expert evaluation of the nematode images corroborated these findings, highlighting DA-Nema’s enhanced visual fidelity and feature discernment. In classification tasks, models trained on DA-Nema-augmented data showed only a 2% reduction in accuracy on balanced datasets, even at a 40% augmentation ratio, compared to models trained solely on authentic data. Under data imbalance conditions, DA-Nema achieved 75.3% accuracy in identifying 18 species of plant-parasitic nematodes that significantly impact crops and ecosystems, an 18.7% improvement over baseline models. These competitive results underscore DA-Nema’s robust capacity for dataset augmentation, effectively addressing the pervasive issue of data scarcity in plant-parasitic nematode identification. Consequently, this advances the state of the art in computational biology. Furthermore, DA-Nema introduces innovative methodologies in semi-supervised learning and automated feature extraction, with the potential to significantly enhance agricultural diagnostics and management practices.
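
The abstract reports FID scores without describing how they were computed. As a rough illustration only, FID is conventionally the Fréchet distance between two Gaussians fitted to image features (usually Inception-v3 pooling activations). The sketch below implements that distance with NumPy on arbitrary feature arrays; the function name `frechet_distance` and the random stand-in features are assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets.

    Each input has shape (n_samples, n_features). In a real FID computation
    the rows would be Inception-v3 features of real and generated images;
    here they can be any feature vectors (hypothetical stand-ins).
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances (up to numerical noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Stand-in features: a "generated" set close to the "real" one, and one far away.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 8))
close = rng.normal(0.1, 1.0, size=(500, 8))
far = rng.normal(1.0, 2.0, size=(500, 8))

print("close:", frechet_distance(real, close))
print("far:  ", frechet_distance(real, far))
```

A lower value means the generated features sit closer to the real ones in both mean and covariance, which is the sense in which the abstract reads the lowest FID as the closest match to the original data distribution.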

Source journal
Applied Intelligence (Engineering and Technology: Computer Science, Artificial Intelligence)
CiteScore: 6.60
Self-citation rate: 20.80%
Articles published: 1361
Review time: 5.9 months
About the journal: With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance. The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.