Qing-Huang Song, Boyuan Wang, Yuandong Ma, Mengjie Hu, Chun Liu
Title: DL-YOLOX: Real-time object detection via adjustable dilated enhancement for autonomous driving scene
Journal: Transactions of the Institute of Measurement and Control (JCR Q3, Automation & Control Systems; impact factor 1.7)
Published: 2024-04-22
DOI: https://doi.org/10.1177/01423312241239020
Citations: 0
Abstract
In autonomous driving, object detection poses several challenges, particularly the accurate identification of small and salient objects. This paper introduces DL-YOLOX (Dilated Enhancement YOLOX), which flexibly uses dilated convolution to enhance features and thereby improve the detection of small and salient objects. A large receptive field covers a wider area and carries more contextual information, which favors the detection of large targets, whereas a small receptive field captures local details and is better suited to small targets. To strengthen the representation of objects across scales, we propose Dilated Adaptive Feature Fusion (DAFF), which adaptively fuses features with different receptive fields. This fusion mechanism yields a more complete representation of objects and improves detection accuracy across object sizes. In addition, we address the loss of small objects during feature propagation by introducing the Stack Dilated Module (SDM), which mitigates this loss and improves detection performance. To further improve small object detection, we replace the conventional Intersection over Union (IoU) metric with the Normalized Gaussian Wasserstein Distance (NWD), a distance metric that measures small objects more effectively and thus raises the precision of our algorithm. To evaluate the robustness and generalization of the proposed method, we conduct extensive experiments on two benchmark datasets, MS COCO 2017 and BDD100K. The results confirm significant improvements in multi-scale object detection and demonstrate the real-time capability of our approach. The performance across these datasets demonstrates the potential of DL-YOLOX to advance object detection in autonomous driving.
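The two ideas the abstract relies on, receptive-field growth under dilation and NWD as a replacement for IoU, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the DAFF or SDM architecture, so the code only uses the standard dilated-kernel arithmetic and the commonly cited NWD definition, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian. The function names and the normalizing constant `c` are assumptions made for illustration; in practice `c` is dataset-dependent.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h)


def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def stacked_receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field of stride-1 convolutions applied one after another."""
    rf = 1
    for d in dilations:
        rf += effective_kernel_size(kernel_size, d) - 1
    return rf


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two center-format boxes."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(a: Box, b: Box, c: float = 12.8) -> float:
    """Normalized Gaussian Wasserstein Distance between two boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). `c` is an assumed placeholder constant.
    """
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)


if __name__ == "__main__":
    # A 3x3 kernel widens its span as the dilation rate grows.
    for d in (1, 2, 3):
        print("dilation", d, "-> effective kernel", effective_kernel_size(3, d))
    print("stacked receptive field:", stacked_receptive_field(3, [1, 2, 3]))

    # Two tiny 4x4 boxes whose centers are 5 px apart do not overlap.
    pred, target = (10.0, 10.0, 4.0, 4.0), (15.0, 10.0, 4.0, 4.0)
    print("IoU:", iou(pred, target))
    print("NWD:", nwd(pred, target))
```

For the two tiny boxes in the usage section, IoU is zero because they do not overlap, so it gives no indication of how close the prediction is. NWD still decays smoothly with the center offset, which is the usual argument for preferring it on small objects.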
About the journal:
Transactions of the Institute of Measurement and Control is a fully peer-reviewed international journal. The journal covers all areas of applications in instrumentation and control. Its scope encompasses cutting-edge research and development, education and industrial applications.