Baokai Zu, Xu Li, Yafang Li, Hongyuan Wang, Jianqiang Li
Neurocomputing, Volume 650, Article 130892. Published 2025-07-12. DOI: 10.1016/j.neucom.2025.130892
Impact factor: 5.5. JCR: Q1 (Computer Science, Artificial Intelligence). Region: 2 (Computer Science). Open access: no. Citation count: 0.
Source: https://www.sciencedirect.com/science/article/pii/S0925231225015644 (indexed via Semanticscholar)
ECF-DETR: Enhanced Cross-layer Fusion Transformer for Pollen Detection with IoU and Classification Guided Evaluation
Pollen allergy is one of the most common seasonal diseases, often triggering symptoms that severely affect both the physical and mental health of individuals. Rapid and accurate pollen detection is therefore important for preventing allergic reactions and protecting public health. However, because the pollen sampling process is complex, the captured images often contain impurities such as plant debris and dust. In addition, pollen grains are typically small, irregular in shape, and vary significantly between individuals, which makes it difficult for existing models to extract both global and local features and limits detection performance. The Transformer architecture, with its strong long-range dependency modeling, offers a promising way to address these challenges. This paper introduces a Transformer-based pollen detection framework named ECF-DETR. The method tackles limited training data, high annotation costs, and the mismatch between classification confidence and bounding box precision through two core components: the Enhanced Cross-layer Location Information Fusion (E-CLIF) mechanism and the IoU and Classification Guided Evaluation (IoCE) strategy. E-CLIF adopts a hybrid matching strategy to increase the number of positive samples and fuses multi-layer spatial features to alleviate data scarcity. IoCE jointly considers classification scores and IoU values to mitigate the inconsistency between classification and localization. Extensive experiments on a self-constructed pollen dataset collected in Beijing show that ECF-DETR achieves an Average Precision (AP) of 78.8%, outperforming the baseline DETR with Improved deNoising anchOr boxes (DINO) by 1.0% and the advanced Align-DETR framework by 0.3%. These findings confirm the feasibility and effectiveness of Transformer-based methods for practical pollen detection applications.
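To make the idea of jointly considering classification scores and IoU values concrete, here is a minimal Python sketch. It is not the IoCE formulation from the paper, which the abstract does not specify. The `joint_quality` function, its geometric blend, and the `alpha` parameter are illustrative assumptions. Only the `iou` function follows the standard intersection-over-union definition.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Standard intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def joint_quality(cls_score: float, iou_value: float, alpha: float = 0.5) -> float:
    """Hypothetical joint score: cls_score**alpha * iou_value**(1 - alpha).

    This blend is an assumption for illustration only; the paper's IoCE
    strategy may combine the two signals differently.
    """
    return (cls_score ** alpha) * (iou_value ** (1.0 - alpha))


if __name__ == "__main__":
    gt = (0.0, 0.0, 2.0, 2.0)
    confident_but_loose = (1.0, 1.0, 3.0, 3.0)   # high class score, poor overlap
    modest_but_tight = (0.0, 0.0, 2.0, 1.8)      # lower class score, good overlap
    print(joint_quality(0.9, iou(gt, confident_but_loose)))
    print(joint_quality(0.7, iou(gt, modest_but_tight)))
```

Under this assumed blend, a confident prediction with a poorly placed box scores lower than a less confident prediction with a well-placed box, which is the kind of inconsistency between classification and localization that the abstract describes.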
About the journal:
Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.