MP-YOLO: multidimensional feature fusion based layer adaptive pruning YOLO for dense vehicle object detection algorithm
Wanzhen Zhou, Junjie Wang, Xi Meng, Jianxia Wang, Yufei Song, Zhiguo Liu
Journal of Visual Communication and Image Representation, volume 112, Article 104560 (impact factor 3.1; JCR Q2, Computer Science, Information Systems)
Publication date: 2025-08-18
DOI: 10.1016/j.jvcir.2025.104560
URL: https://www.sciencedirect.com/science/article/pii/S1047320325001749
Citations: 0
Abstract
In recent years, artificial intelligence technology has been applied in the research and development of autonomous vehicles. However, the high energy consumption of artificial intelligence models and the high precision required of object detection in autonomous driving have stalled the development of autonomous vehicles. To alleviate these problems, we optimize YOLOv8 and propose a lightweight vehicle object detection algorithm, MP-YOLO (Multidimensional feature fusion and layer adaptive pruning YOLO), which adapts to edge devices with limited storage while meeting the requirements for detection accuracy. Firstly, two multi-scale feature fusion modules, MSFB and HFF, are proposed to merge features of different dimensions, enhancing the model's feature learning capability. Secondly, a detection head at a scale of 160×160 is added to improve small object detection. Thirdly, the WIoU loss function replaces the original CIoU loss function in YOLOv8 to address the high overlap among road objects. Lastly, the Layer Adaptive Sparsity for Magnitude-based Pruning (LAMP) method is used to significantly reduce model size. MP-YOLO was tested on the recent autonomous driving dataset DAIR-V2X. The results show that it outperforms the original model, with improvements of 4.7 % in AP50 and 4.2 % in AP, while the model size decreases from 6 MB to 2.2 MB. It is superior to other classical detection models in both size and accuracy, and meets the requirements for deployment on edge devices. The source code is available at https://github.com/Wang-jj-zs/MP-YOLO/tree/master.
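The abstract names LAMP but does not say how it is applied to YOLOv8. As a rough illustration only, the sketch below shows the general LAMP idea in its unstructured, per-weight form: within each layer, a weight's score is its squared magnitude divided by the sum of squared magnitudes of all weights in that layer that are at least as large, and the lowest-scoring weights are then pruned globally across layers. The function names `lamp_scores` and `global_lamp_masks`, the toy layers, and the per-weight granularity are assumptions for illustration, not the authors' implementation.

```python
from typing import Dict, List


def lamp_scores(weights: List[float]) -> List[float]:
    """Return the LAMP score of each weight in one layer, in input order."""
    # Indices sorted by ascending squared magnitude.
    order = sorted(range(len(weights)), key=lambda i: weights[i] ** 2)
    scores = [0.0] * len(weights)
    suffix_sum = 0.0
    # Walk from the largest weight down so that suffix_sum is the sum of
    # squares of the current weight and every weight larger than it.
    for i in reversed(order):
        suffix_sum += weights[i] ** 2
        scores[i] = weights[i] ** 2 / suffix_sum if suffix_sum > 0 else 0.0
    return scores


def global_lamp_masks(
    layers: Dict[str, List[float]], sparsity: float
) -> Dict[str, List[bool]]:
    """Build keep-masks that drop the `sparsity` fraction of all weights
    with the lowest LAMP scores, compared across layers."""
    scored = [
        (score, name, idx)
        for name, w in layers.items()
        for idx, score in enumerate(lamp_scores(w))
    ]
    scored.sort(key=lambda t: t[0])
    n_prune = int(len(scored) * sparsity)
    masks = {name: [True] * len(w) for name, w in layers.items()}
    for _, name, idx in scored[:n_prune]:
        masks[name][idx] = False
    return masks


if __name__ == "__main__":
    # Hypothetical toy layers with very different weight scales.
    toy = {"conv_a": [0.1, 0.2, 1.0], "conv_b": [10.0, 20.0]}
    print(global_lamp_masks(toy, sparsity=0.4))
```

Because the largest weight in every layer always receives a score of 1, no layer is emptied before the others, which is what makes the resulting per-layer sparsity adaptive rather than uniform.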
About the journal:
The Journal of Visual Communication and Image Representation publishes papers on state-of-the-art visual communication and image representation, with emphasis on novel technologies and theoretical work in this multidisciplinary area of pure and applied research. The field of visual communication and image representation is considered in its broadest sense and covers both digital and analog aspects as well as processing and communication in biological visual systems.