A Grasp Pose Detection Scheme with an End-to-End CNN Regression Approach
Hu Cheng, M. Meng. 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), published 2018-12-01. DOI: 10.1109/ROBIO.2018.8665219
In this paper, we propose a solution to the problem of grasp pose detection using a convolutional neural network (CNN) trained and tested on the Cornell Grasp Dataset. We treat the task as a regression problem, so that the network directly outputs the location, rotation and size of the grasp in an RGB or RGB-D image. A novel loss is defined for backpropagation that makes the network select the grasp closest to the ground truth. This loss prevents the predicted grasp from falling at the average location of the multiple ground-truth grasps. We train the network in two cascaded steps so that it learns to predict the locations and the rotations of the grasp, respectively. Because predicting the rotation is relatively difficult for objects with irregular shapes, the weight of the grasp-angle loss is increased during the second step by multiplying it by a scale factor. The proposed training process is simple and the pipeline is clean, as the model is trained end to end. We achieve a grasp prediction accuracy of 90.4% in our experiments. In addition, we propose a jointly trained network that generates a large number of grasp candidates and classifies each as good or not good, for multiple grasp prediction.
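The abstract does not give the loss formula, so the following is only a minimal sketch of one way a "closest ground truth" regression loss with a scaled angle term could look. The grasp layout `(x, y, theta, w, h)`, the function name `closest_grasp_loss`, the squared-error form, the 180-degree angle wrap, and the `angle_scale` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def closest_grasp_loss(pred, gts, angle_scale=1.0):
    """Squared-error loss against the annotated grasp nearest to `pred`.

    pred: shape (5,), assumed layout (x, y, theta, w, h), theta in radians.
    gts:  shape (K, 5), the K annotated grasps for the same image.
    angle_scale: multiplier on the angle term (assumed to be raised
                 in the second training step).
    Returns (loss, index of the selected ground-truth grasp).
    """
    pred = np.asarray(pred, dtype=float)
    gts = np.asarray(gts, dtype=float)
    diff = gts - pred  # shape (K, 5), broadcast over the K annotations

    # Assumption: a parallel-jaw grasp is unchanged by a 180-degree
    # rotation, so wrap the angle difference into [-pi/2, pi/2).
    diff[:, 2] = (diff[:, 2] + np.pi / 2) % np.pi - np.pi / 2

    # Per-component weights; only the angle term is rescaled.
    weights = np.array([1.0, 1.0, angle_scale, 1.0, 1.0])
    per_gt = np.sum(weights * diff ** 2, axis=1)  # one loss per annotation

    # Penalize only the nearest annotation instead of averaging over all,
    # so the prediction is not pulled toward the mean of the annotations.
    best = int(np.argmin(per_gt))
    return per_gt[best], best
```

Under these assumptions, calling `closest_grasp_loss(pred, gts)` in a first step and `closest_grasp_loss(pred, gts, angle_scale=s)` with some `s > 1` in a second step mirrors the two-step schedule described above. Taking the minimum rather than the mean matters because the average of two valid grasps on opposite sides of an object can lie where no valid grasp exists.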