Multi‐object road waste detection and classification based on binocular vision

The Journal of Engineering | Pub Date: 2024-05-01 | DOI: 10.1049/tje2.12389

He Guo, Lumin Chen



Abstract

A road multi‐object detection algorithm is one of the core algorithms of an intelligent road cleaning robot that relies on machine vision. Most existing object detection algorithms analyse every region of the image before calculating the category and location of each object. On a road surface, however, the background changes little and the number of objects is small, so analysing the whole image produces many invalid calculations. Performing targeted local analysis instead of analysing all image regions would improve detection efficiency. This paper therefore proposes a multi‐object detection method that uses a binocular camera and a convolutional neural network (CNN) to reduce invalid calculations during detection and improve detection efficiency. In the developed method, the image pair acquired by the binocular camera is stereo matched and equalized, while linear regression and coordinate transformation eliminate the angle of the camera pair relative to the road surface. The coordinates of the regions of interest (ROIs) are then calculated in the left image, and the features within each ROI are extracted from the corresponding CNN feature map. Next, ROI pooling resizes the extracted feature maps of different sizes to a common size, and the resized maps are input to the fully connected layers, which output the results. The proposed binocular network and Faster R‐CNN (VGG16) are trained and tested on a dataset of 1000 road waste images. The experimental results show that, compared with Faster R‐CNN (VGG16), the developed binocular network improves detection accuracy and speed by 28.56% and 78.39%, respectively, providing a reliable basis for a machine vision‐based intelligent road cleaning robot.
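
The pipeline summarised in the abstract (road-angle compensation by linear regression, ROI localisation, ROI pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that a dense disparity map from stereo matching is already available, that the disparity of a flat road varies linearly with the image row, and that objects appear as pixels whose disparity exceeds the fitted road model. The function names, the row-median fit, the threshold, and the single bounding box are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_road_disparity(disparity):
    """Least-squares fit of d = a * row + b to the per-row median disparity.

    Assumption (not from the paper): most pixels in each row belong to the
    flat road, so the row median follows the road and ignores small objects.
    """
    rows = np.arange(disparity.shape[0])
    row_median = np.median(disparity, axis=1)
    a, b = np.polyfit(rows, row_median, deg=1)
    return a, b


def find_roi(disparity, threshold=1.0):
    """Bounding box (x1, y1, x2, y2) of pixels that stand out from the road.

    Returns None when no pixel exceeds the threshold. A fuller system would
    label connected components to obtain one box per object.
    """
    a, b = fit_road_disparity(disparity)
    rows = np.arange(disparity.shape[0])[:, None]
    residual = disparity - (a * rows + b)  # road pixels end up near zero
    ys, xs = np.nonzero(residual > threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool a non-empty ROI of a (C, H, W) feature map to (C, oh, ow)."""
    x1, y1, x2, y2 = roi
    oh, ow = output_size
    region = feature_map[:, y1:y2, x1:x2]
    # Bin edges split the region as evenly as integer indices allow.
    y_edges = np.linspace(0, region.shape[1], oh + 1).astype(int)
    x_edges = np.linspace(0, region.shape[2], ow + 1).astype(int)
    pooled = np.empty((feature_map.shape[0], oh, ow), dtype=feature_map.dtype)
    for i in range(oh):
        for j in range(ow):
            # Keep at least one cell per bin when the ROI is small.
            ys = slice(y_edges[i], max(y_edges[i + 1], y_edges[i] + 1))
            xs = slice(x_edges[j], max(x_edges[j + 1], x_edges[j] + 1))
            pooled[:, i, j] = region[:, ys, xs].max(axis=(1, 2))
    return pooled


if __name__ == "__main__":
    # Synthetic flat road (disparity grows with the row) plus one raised object.
    disparity = 2.0 * np.arange(8)[:, None] + 5.0 + np.zeros((8, 10))
    disparity[3:5, 4:7] += 4.0
    roi = find_roi(disparity)
    print(roi)  # (4, 3, 7, 5)

    # Stand-in for a CNN feature map at the same resolution as the image.
    features = np.arange(2 * 8 * 10, dtype=float).reshape(2, 8, 10)
    print(roi_max_pool(features, roi).shape)  # (2, 2, 2)
```

In a real network the ROI coordinates would first be scaled by the stride of the CNN backbone before indexing the feature map; the sketch skips that step by using a feature map with the same resolution as the disparity map.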



