Towards more accurate object detection via encoding reinforcement and multi-channel enhancement
Authors: Weina Wang, Shuangyong Li, Huxidan Jumahong
Journal: Applied Intelligence, 55(2) (Impact Factor 3.4; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-12-23
DOI: 10.1007/s10489-024-06200-8
URL: https://link.springer.com/article/10.1007/s10489-024-06200-8
Citations: 0
Abstract
Existing object detection networks typically use small-kernel convolutions, which extract sufficient features for recognizing targets but have small receptive fields and capture long-range dependencies poorly. This paper proposes an object detection network built on large-kernel convolutions and multiple channels. First, an encoding reinforcement module based on large-kernel convolutions is designed to enlarge the receptive field and improve global feature extraction. Then, a channel enhancement module is constructed to strengthen the learning of structural information. Both modules are designed to be lightweight. Finally, the WIoU loss function is introduced to make the model more robust on poor-quality datasets. In the experiments, the proposed model achieves the best performance while using a parameter count or computational complexity similar to that of existing CNN-based lightweight models.
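The receptive-field argument in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' module: the kernel size 13, the channel width 64, and the use of a depthwise (grouped) convolution as the "lightweight" variant are all assumptions chosen for illustration, and the helper names `receptive_field` and `conv_params` are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of a stack of convolutions.

    layers: list of (kernel_size, stride) tuples, ordered input to output.
    Uses the standard recurrence rf += (k - 1) * jump, jump *= stride.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


def conv_params(c_in, c_out, kernel_size, groups=1):
    """Weight count of a square 2D convolution (bias ignored)."""
    return (c_in // groups) * c_out * kernel_size * kernel_size


# Three stacked 3x3 convolutions versus one 13x13 convolution (assumed size).
small_stack_rf = receptive_field([(3, 1)] * 3)
large_kernel_rf = receptive_field([(13, 1)])

# Cost of the large kernel at an assumed width of 64 channels:
# dense versus depthwise (groups == channels), one possible lightweight design.
dense_13 = conv_params(64, 64, 13)
depthwise_13 = conv_params(64, 64, 13, groups=64)
dense_3 = conv_params(64, 64, 3)

print(small_stack_rf, large_kernel_rf)   # 7 13
print(dense_13, depthwise_13, dense_3)   # 692224 10816 36864
```

Under these assumptions, a single large kernel covers a wider window than several stacked small kernels, and grouping the convolution keeps its weight count below that of a dense 3x3 layer, which is one way a large-kernel module could remain lightweight.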
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.