Dongliang Ma, Fang Zhao, Ye Li, Xin Qu, Xin Jiang, Hao Wu, Xi Chen, Min Liu
Applied Soft Computing, volume 180, Article 113389. Published 2025-06-10. Journal article. Impact factor: 7.2; JCR Q1 (Computer Science, Artificial Intelligence).
DOI: 10.1016/j.asoc.2025.113389
URL: https://www.sciencedirect.com/science/article/pii/S1568494625007008
A scalable monocular 3D detector with Superpixel Feature Pyramid Network
Monocular 3D object detection plays a pivotal role in vehicle perception systems. Current methods frequently struggle to extract scene-level semantic information effectively, and few monocular 3D detectors are tailored to embedded devices with differing computing power. This paper introduces MonoYolo, a scalable detector designed to be practical and efficient under varying resource constraints. In particular, we design a Superpixel Feature Pyramid Network (SFPN) that automatically groups pixels with similar attributes. Experimental results on the KITTI and nuScenes datasets show that the large MonoYolo models compare favorably with leading monocular detectors, while the lightweight model maintains real-time detection. In addition, the proposed SFPN integrates into existing image-only 3D detectors as a plug-and-play module that improves monocular 3D object detection performance.
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research in the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. The website is therefore updated continuously with new articles, and the publication time is short.