Diffusion-augmented nematode dataset improves few-shot classification of nematodes
Authors: Ying Zhu, Pengjun Wang, Jiayan Zhuang, Jiangjian Xiao, Jianfeng Gu, Weilun Ren, Xiong Ouyang
Journal: Applied Intelligence, 55(13), published 2025-08-11
Impact factor: 3.5 (JCR Q2, Computer Science, Artificial Intelligence)
DOI: 10.1007/s10489-025-06783-w
URL: https://link.springer.com/article/10.1007/s10489-025-06783-w
Citation count: 0
Abstract
Accurate identification of plant-parasitic nematodes is essential for mitigating crop losses and maintaining agro-ecological balance. While deep learning offers a promising automated solution, it is hindered by data scarcity. To address this, the present study introduces the Diffusion-Augmented Nematode Dataset (DA-Nema), a novel offline data augmentation strategy that leverages diffusion models. DA-Nema combines semantic adaptation, morphological constraints, and style transfer to generate high-fidelity images, thereby enriching nematode datasets. Experimental results show that images generated using DA-Nema achieve the lowest Fréchet Inception Distance (FID) scores, indicating superior visual realism and close alignment with the original data distribution. Expert evaluation of the nematode images corroborated these findings, highlighting DA-Nema’s enhanced visual fidelity and feature discernment. In classification tasks on balanced datasets, models trained on DA-Nema-augmented data showed only a 2% reduction in accuracy, even at a 40% augmentation ratio, compared to models trained solely on authentic data. Under imbalanced-data conditions, DA-Nema achieved 75.3% accuracy in identifying 18 species of plant-parasitic nematodes that significantly impact crops and ecosystems, an 18.7% improvement over baseline models. These competitive results underscore DA-Nema’s robust capacity for dataset augmentation, effectively addressing the pervasive issue of data scarcity in plant-parasitic nematode identification. Consequently, this advances the state of the art in computational biology. Furthermore, DA-Nema introduces innovative methodologies in semi-supervised learning and automated feature extraction, with the potential to significantly enhance agricultural diagnostics and management practices.
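The abstract uses the Fréchet Inception Distance as its measure of realism. As a rough illustration only, and not the authors' evaluation code, the standard definition fits a Gaussian to the feature vectors of real and generated images and computes ||mu_r - mu_g||^2 + Tr(S_r + S_g - 2 (S_r S_g)^(1/2)). The sketch below assumes the features have already been extracted (in practice from an Inception-v3 network, which is not included here); the function name frechet_distance is hypothetical.

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an (n_samples, n_features) array. For a true FID these
    would be Inception-v3 pooling features; this sketch accepts any
    feature vectors (an assumption, not part of the source abstract).
    """
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_gen = np.asarray(feats_gen, dtype=np.float64)

    # Mean and covariance of each feature distribution.
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite inputs (clipping removes numerical noise).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 4))            # stand-in "real" features
    fake = rng.normal(loc=0.5, size=(200, 4))   # stand-in "generated" features
    print(frechet_distance(real, fake))
```

A lower value means the generated features are statistically closer to the real ones, which is the sense in which the abstract reports DA-Nema's FID as the lowest.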
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses solutions to real-life manufacturing, defense, management, government and industrial problems that are too complex to be solved through conventional approaches and that require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.