Enhancing object detection with large kernel convolution and cross convolution

IF 2.9 | Zone 3, Engineering & Technology | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing | Pub Date: 2025-06-25 | DOI: 10.1016/j.dsp.2025.105433

Yaqian Li, Guoping Liu, Haibin Li, Wenming Zhang, Xiaoyang Shen

Digital Signal Processing, Volume 167, Article 105433. Journal article. https://www.sciencedirect.com/science/article/pii/S1051200425004555

Citations: 0

Abstract

Existing object detection models often struggle to detect small objects because of their limited ability to capture sufficient contextual information. In this paper, we introduce a lightweight object detection model that leverages large kernel convolution with attention (LKA) and a hierarchical feature fusion group (HFFG) to address this issue. The LKA module employs large kernel convolution to capture long-range dependencies and contextual information, combined with depthwise separable convolution to maintain a lightweight design. An incorporated attention mechanism further enables the model to adaptively focus on key areas, thereby improving detection performance for small objects. The HFFG module, which integrates Cross Convolution Blocks, explores and retains structural information across different scales. By effectively extracting structural details, our model exhibits enhanced performance on objects of various sizes. Extensive experiments on the VisDrone2019 and PASCAL VOC datasets demonstrate that our model achieves an mAP of 23.4%, surpassing the baseline YOLOX-s model by 1.5%. These results not only validate the effectiveness of the model but also demonstrate its robustness and generalization capability.
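The claim that depthwise separable convolution keeps a large-kernel design lightweight can be illustrated with a simple weight count. The sketch below is not taken from the paper: the channel count and kernel sizes are hypothetical values chosen for illustration, biases are ignored, and the function names are made up for this example. It compares a standard k x k convolution with a depthwise k x k convolution followed by a 1 x 1 pointwise convolution.

```python
def conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution (bias ignored)."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 64  # hypothetical channel count, not a value from the paper
    for k in (3, 7, 21):  # hypothetical kernel sizes, not values from the paper
        standard = conv_params(k, channels, channels)
        separable = depthwise_separable_params(k, channels, channels)
        print(f"k={k}: standard={standard}, separable={separable}, "
              f"ratio={standard / separable:.1f}x")
```

Under these assumptions the standard convolution's weight count grows with k² · C_in · C_out, while the separable variant grows with k² · C_in + C_in · C_out, so the gap widens as the kernel gets larger.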


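Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days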

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy