{"title":"3D object detection using improved PointRCNN","authors":"Kazuki Fukitani, Ishiyama Shin, Huimin Lu, Shuo Yang, Tohru Kamiya, Yoshihisa Nakatoh, Seiichi Serikawa","doi":"10.1016/j.cogr.2022.12.001","DOIUrl":null,"url":null,"abstract":"<div><p>Recently, two-dimensional object detection (2D object detection) has been introduced in numerous applications such as building exterior diagnosis, crime prevention and surveillance, and medical fields. However, the distance (depth) information is not enough for indoor robot navigation, robot grasping, autonomous running, and so on, with conventional object detection. Therefore, in order to improve the accuracy of 3D object detection, this paper proposes an improvement of Point RCNN, which is a segmentation-based method using RPNs and has performed well in 3D detection benchmarks on the KITTI dataset commonly used in recognition tasks for automatic driving. The proposed improvement is to improve the network in the first stage of generating 3D box candidates in order to solve the problem of frequent false positives. Specifically, we added a Squeeze and Excitation (SE) Block to the network of pointnet++ that performs feature extraction in the first stage and changed the activation function from ReLU to Mish. Experiments were conducted on the KITTI dataset, which is commonly used in research aimed at automated driving, and an accurate comparison was conducted using AP. 
The proposed method outperforms the conventional method by several percent on all three difficulty levels.</p></div>","PeriodicalId":100288,"journal":{"name":"Cognitive Robotics","volume":"2 ","pages":"Pages 242-254"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2667241322000222/pdfft?md5=976fa9833e04a5bb9d3751cbbe165535&pid=1-s2.0-S2667241322000222-main.pdf","citationCount":"0","resultStr":"{\"title\":\"3D object detection using improved PointRCNN\",\"authors\":\"Kazuki Fukitani, Ishiyama Shin, Huimin Lu, Shuo Yang, Tohru Kamiya, Yoshihisa Nakatoh, Seiichi Serikawa\",\"doi\":\"10.1016/j.cogr.2022.12.001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Recently, two-dimensional object detection (2D object detection) has been introduced in numerous applications such as building exterior diagnosis, crime prevention and surveillance, and medical fields. However, the distance (depth) information is not enough for indoor robot navigation, robot grasping, autonomous running, and so on, with conventional object detection. Therefore, in order to improve the accuracy of 3D object detection, this paper proposes an improvement of Point RCNN, which is a segmentation-based method using RPNs and has performed well in 3D detection benchmarks on the KITTI dataset commonly used in recognition tasks for automatic driving. The proposed improvement is to improve the network in the first stage of generating 3D box candidates in order to solve the problem of frequent false positives. Specifically, we added a Squeeze and Excitation (SE) Block to the network of pointnet++ that performs feature extraction in the first stage and changed the activation function from ReLU to Mish. 
Experiments were conducted on the KITTI dataset, which is commonly used in research aimed at automated driving, and an accurate comparison was conducted using AP. The proposed method outperforms the conventional method by several percent on all three difficulty levels.</p></div>\",\"PeriodicalId\":100288,\"journal\":{\"name\":\"Cognitive Robotics\",\"volume\":\"2 \",\"pages\":\"Pages 242-254\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2667241322000222/pdfft?md5=976fa9833e04a5bb9d3751cbbe165535&pid=1-s2.0-S2667241322000222-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cognitive Robotics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2667241322000222\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Robotics","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667241322000222","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Two-dimensional (2D) object detection has recently been adopted in numerous applications, including building exterior diagnosis, crime prevention and surveillance, and medicine. However, conventional 2D object detection does not provide sufficient distance (depth) information for tasks such as indoor robot navigation, robot grasping, and autonomous driving. To improve the accuracy of 3D object detection, this paper proposes an improved PointRCNN. PointRCNN is a segmentation-based method that uses region proposal networks (RPNs) and has performed well on the 3D detection benchmark of the KITTI dataset, which is widely used for recognition tasks in autonomous driving. The improvement targets the first-stage network, which generates 3D box candidates, in order to reduce its frequent false positives. Specifically, we add a Squeeze-and-Excitation (SE) block to the PointNet++ network that performs feature extraction in the first stage, and we change the activation function from ReLU to Mish. Experiments were conducted on the KITTI dataset, and accuracy was compared using average precision (AP). The proposed method outperforms the conventional method by several percent at all three difficulty levels.
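The two changes named in the abstract, the Mish activation and the SE block, can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names (`mish`, `se_block`), the list-of-lists feature layout (N points by C channels), the omission of biases, and the use of Mish inside the excitation step are all assumptions made here. The abstract does not say where in PointNet++ the SE block is inserted or how it is configured.

```python
import math


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    # Numerically stable softplus, log(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return x * math.tanh(softplus)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(w, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


def se_block(features, w1, w2):
    """Squeeze-and-Excitation over per-point features.

    features: N points, each a list of C channel values.
    w1: (C // r) x C reduction weights (hypothetical, biases omitted).
    w2: C x (C // r) expansion weights (hypothetical, biases omitted).
    """
    n = len(features)
    c = len(features[0])
    # Squeeze: average each channel over all points.
    z = [sum(p[ch] for p in features) / n for ch in range(c)]
    # Excitation: bottleneck MLP, then a sigmoid gate per channel.
    # Assumption: Mish is used here where the original SE block uses ReLU.
    hidden = [mish(h) for h in matvec(w1, z)]
    gate = [sigmoid(s) for s in matvec(w2, hidden)]
    # Recalibrate: scale every point's channels by the shared gate.
    return [[p[ch] * gate[ch] for ch in range(c)] for p in features]


# Toy usage: 2 points, 4 channels, reduction ratio r = 2.
points = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
w1 = [[0.1, -0.2, 0.3, 0.0], [0.0, 0.5, -0.1, 0.2]]
w2 = [[0.3, -0.1], [0.0, 0.2], [-0.4, 0.1], [0.2, 0.2]]
print(se_block(points, w1, w2))
```

Because the gate is a sigmoid, every channel is scaled by a factor between 0 and 1 that depends on the global (all-point) statistics, which is what lets the block emphasize or suppress whole feature channels.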