MonoGhost: Lightweight Monocular GhostNet 3D Object Properties Estimation for Autonomous Driving

IF 2.9 Q2 ROBOTICS
Robotics Pub Date : 2023-11-17 DOI:10.3390/robotics12060155
Ahmed El-Dawy, A. El-Zawawi, Mohamed El-Habrouk
Citation count: 0

Abstract

Effective environmental perception is critical for autonomous driving; thus, the perception system requires collecting 3D information of the surrounding objects, such as their dimensions, locations, and orientation in space. Recently, deep learning has been widely used in perception systems that convert image features from a camera into semantic information. This paper presents the MonoGhost network, a lightweight Monocular GhostNet deep learning technique for full 3D object properties estimation from a single-frame monocular image. Unlike other techniques, the proposed MonoGhost network first estimates relatively reliable 3D object properties by relying on an efficient feature extractor. The proposed MonoGhost network estimates the orientation of the 3D object as well as its 3D dimensions, resulting in reasonably small dimension-estimation errors compared with other networks. These estimates, combined with the translation projection constraints imposed by the 2D detection coordinates, allow for the prediction of a robust and dependable Bird's Eye View bounding box. The experimental results show that the proposed MonoGhost network outperforms other state-of-the-art networks on the Bird's Eye View benchmark of the KITTI dataset, scoring 16.73% on the moderate class and 15.01% on the hard class while meeting real-time requirements.
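The abstract's final step — recovering the object's translation from the estimated dimensions and orientation plus the 2D detection box — can be sketched as a small linear least-squares problem: each edge of the 2D box pins the projection of one 3D box corner. The sketch below is a hypothetical illustration of this general projection-constraint idea, not the paper's actual implementation; in particular, the `corner_ids` correspondence between box edges and corners is assumed known here, whereas real solvers enumerate or predict it.

```python
import numpy as np

def solve_translation(K, dims, yaw, box2d, corner_ids):
    """Recover translation t by least squares from 2D box edge constraints.

    K          -- 3x3 camera intrinsics
    dims       -- (h, w, l) estimated object dimensions
    yaw        -- estimated rotation about the camera y-axis
    box2d      -- (u_min, v_min, u_max, v_max) 2D detection box
    corner_ids -- indices of the corners assumed to touch the
                  (u_min, u_max, v_min, v_max) edges, respectively
    """
    h, w, l = dims
    # 8 corners of the 3D box in object coordinates, centered at the origin
    x = np.array([ l,  l,  l,  l, -l, -l, -l, -l]) / 2
    y = np.array([ h,  h, -h, -h,  h,  h, -h, -h]) / 2
    z = np.array([ w, -w,  w, -w,  w, -w,  w, -w]) / 2
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    rc = R @ np.stack([x, y, z])  # rotated corners, still centered at origin

    u_min, v_min, u_max, v_max = box2d
    # Projection of corner X: homog = K @ (X + t); u = homog[0]/homog[2].
    # "u * homog[2] = homog[0]" is linear in t, giving one row per edge.
    A, b = [], []
    for val, row, cid in [(u_min, 0, corner_ids[0]), (u_max, 0, corner_ids[1]),
                          (v_min, 1, corner_ids[2]), (v_max, 1, corner_ids[3])]:
        X = rc[:, cid]
        A.append(val * K[2] - K[row])
        b.append(K[row] @ X - val * (K[2] @ X))
    t, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return t
```

Four constraints over three unknowns make the system overdetermined; when the assumed edge-to-corner correspondences are correct, the least-squares solution recovers the translation exactly. Full solvers of this kind try many candidate correspondences and keep the one with the lowest reprojection residual.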
Source Journal
Robotics
Robotics, Mathematics - Control and Optimization
CiteScore: 6.70
Self-citation rate: 8.10%
Articles per year: 114
Review time: 11 weeks
Aims and Scope: Robotics publishes original papers, technical reports, case studies, review papers and tutorials in all aspects of robotics. Special Issues devoted to important topics in advanced robotics will be published from time to time. It particularly welcomes emerging methodologies and techniques which bridge theoretical studies and applications and have significant potential for real-world applications. It provides a forum for information exchange between professionals, academicians and engineers working in the area of robotics, helping them to disseminate research findings and to learn from each other's work. Suitable topics include, but are not limited to:
- intelligent robotics, mechatronics, and biomimetics
- novel and biologically-inspired robotics
- modelling, identification and control of robotic systems
- biomedical, rehabilitation and surgical robotics
- exoskeletons, prosthetics and artificial organs
- AI, neural networks and fuzzy logic in robotics
- multimodality human-machine interaction
- wireless sensor networks for robot navigation
- multi-sensor data fusion and SLAM