ECF-DETR: Enhanced Cross-layer Fusion Transformer for Pollen Detection with IoU and Classification Guided Evaluation

IF 5.5 | Zone 2 (Computer Science) | JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Baokai Zu, Xu Li, Yafang Li, Hongyuan Wang, Jianqiang Li
Neurocomputing, Volume 650, Article 130892. Published 2025-07-12. Journal Article.
DOI: 10.1016/j.neucom.2025.130892
URL: https://www.sciencedirect.com/science/article/pii/S0925231225015644
Citations: 0

Abstract

Pollen allergy is one of the most common seasonal diseases, often triggering symptoms that severely affect both physical and mental health. Rapid and accurate pollen detection is therefore important for preventing allergic reactions and protecting public health. However, because the pollen sampling process is complex, captured images often contain impurities such as plant debris and dust. In addition, pollen grains are typically small, irregular in shape, and vary significantly between individuals, which makes it difficult for existing models to extract both global and local features and limits detection performance. The Transformer architecture, with its strong long-range dependency modeling, offers a promising solution to these challenges. This paper introduces a Transformer-based pollen detection framework named ECF-DETR. The method tackles limited training data, high annotation costs, and the mismatch between classification confidence and bounding box precision with two core components: the Enhanced Cross-layer Location Information Fusion (E-CLIF) mechanism and the IoU and Classification Guided Evaluation (IoCE) strategy. E-CLIF adopts a hybrid matching strategy to increase the number of positive samples and fuses multi-layer spatial features to alleviate data scarcity. IoCE jointly considers classification scores and IoU values to mitigate the inconsistency between classification and localization. Extensive experiments on a self-constructed pollen dataset in Beijing show that ECF-DETR achieves an Average Precision (AP) of 78.8%, outperforming the baseline DETR with the Improved deNoising anchOr box (DINO) by 1.0% and the advanced Align-DETR framework by 0.3%. These findings confirm the feasibility and effectiveness of Transformer-based methods for practical pollen detection applications.
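The abstract does not give the IoCE formula, so the following is only an illustrative sketch of what jointly considering classification scores and IoU values can look like. The weighted geometric mean in `joint_quality`, the `alpha` parameter, and the example boxes are all assumptions made for illustration and are not taken from the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def joint_quality(cls_score, iou, alpha=0.5):
    """Hypothetical joint score (an assumption, not the paper's IoCE):
    weighted geometric mean of classification confidence and IoU."""
    return (cls_score ** alpha) * (iou ** (1.0 - alpha))


# Made-up ground-truth box and two made-up predictions.
gt = (10.0, 10.0, 50.0, 50.0)
predictions = [
    {"box": (12.0, 11.0, 49.0, 51.0), "score": 0.60},  # well localized, modest confidence
    {"box": (30.0, 30.0, 70.0, 70.0), "score": 0.90},  # confident, poorly localized
]

# Rank by the joint score instead of by classification confidence alone.
ranked = sorted(
    predictions,
    key=lambda p: joint_quality(p["score"], box_iou(p["box"], gt)),
    reverse=True,
)
```

Ranking by confidence alone would put the poorly localized box first. Under this assumed joint score the well-localized box ranks first, which is the kind of classification-localization mismatch the abstract says IoCE is meant to mitigate.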
Source journal
Neurocomputing (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days
Journal description: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.