{"title":"Deep Learning based Visual Object Recognition for Manipulator Grasps","authors":"Min-Fan Ricky Lee, Fu-Yao Hsu, Hoang-Phuong Doan, Quang-Duy To, Yavier Kristanto","doi":"10.1109/MESA55290.2022.10004478","DOIUrl":null,"url":null,"abstract":"The visual object recognition using a CCD camera for manipulator grasps suffers from uncertainty (e.g., illumination, viewpoint, occlusion, and appearance). The conventional machine learning approach for classification requires a definite feature extraction before the model learning. The accuracy of feature extraction is affected in the presence of those uncertainties. A deep-learning-based approach for a manipulator is proposed (YOLOv4 framework, 167 layers) for the classification of various brands of condoms. This neural network architecture is improved based on the PRNet V3 and CSPNet which reduces computation without affecting the convergence of loss function during the learning. Three primary metrics (accuracy, precision, and recall) are used to evaluate the proposed model's prediction. The testing scenario includes the variation of working distance and viewpoint between the CCD camera and the object. The experiment results show the proposed Yolo v4 outperforms the other architectures (Yolo v3, Retina Net, ResNet-50, and ResNet-l0l).","PeriodicalId":410029,"journal":{"name":"2022 18th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA)","volume":"337 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 18th IEEE/ASME International Conference on Mechatronic and Embedded Systems and Applications (MESA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MESA55290.2022.10004478","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Visual object recognition using a CCD camera for manipulator grasps suffers from uncertainty (e.g., illumination, viewpoint, occlusion, and appearance). Conventional machine learning approaches to classification require explicit feature extraction before model learning, and the accuracy of that feature extraction degrades in the presence of these uncertainties. A deep-learning-based approach for a manipulator is proposed (YOLOv4 framework, 167 layers) for classifying various brands of condoms. The network architecture is improved based on PRNet V3 and CSPNet, which reduces computation without affecting the convergence of the loss function during learning. Three primary metrics (accuracy, precision, and recall) are used to evaluate the proposed model's predictions. The testing scenario includes variation of the working distance and viewpoint between the CCD camera and the object. The experimental results show that the proposed YOLOv4 outperforms the other architectures (YOLOv3, RetinaNet, ResNet-50, and ResNet-101).
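To illustrate how a CSPNet-style design reduces computation, the sketch below (not the authors' code; layer sizes, block counts, and the SiLU activation are illustrative assumptions) shows the cross-stage-partial idea in PyTorch: the feature map is split into two paths, only one path passes through the heavy residual stack, and the two paths are re-merged by concatenation.

```python
# Minimal sketch of a cross-stage-partial (CSP) block, assuming PyTorch.
# All dimensions and block counts are placeholders, not taken from the paper.
import torch
import torch.nn as nn

class ConvBNAct(nn.Module):
    """Convolution + batch norm + activation, the basic YOLO-style unit."""
    def __init__(self, in_ch, out_ch, k=1, s=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, s, padding=k // 2, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()  # YOLOv4 uses Mish; SiLU is a close stand-in here

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class ResidualUnit(nn.Module):
    """1x1 -> 3x3 bottleneck with an identity shortcut."""
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(ConvBNAct(ch, ch, 1), ConvBNAct(ch, ch, 3))

    def forward(self, x):
        return x + self.block(x)

class CSPBlock(nn.Module):
    """Route half the channels around the residual stack, then fuse."""
    def __init__(self, in_ch, out_ch, n_residual=2):
        super().__init__()
        mid = out_ch // 2
        self.part1 = ConvBNAct(in_ch, mid, 1)   # bypass path (cheap)
        self.part2 = ConvBNAct(in_ch, mid, 1)   # processed path
        self.blocks = nn.Sequential(*[ResidualUnit(mid) for _ in range(n_residual)])
        self.merge = ConvBNAct(2 * mid, out_ch, 1)  # fuse both paths

    def forward(self, x):
        return self.merge(torch.cat([self.part1(x), self.blocks(self.part2(x))], dim=1))

if __name__ == "__main__":
    feats = torch.randn(1, 64, 104, 104)            # a mid-network feature map
    print(CSPBlock(64, 128)(feats).shape)           # -> torch.Size([1, 128, 104, 104])
```

Because the residual units operate on only half the channels, the per-block cost roughly halves while the concatenation preserves the untouched features, which is the mechanism behind the claimed computation savings.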
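For reference, the three reported metrics can be computed per class from prediction/label pairs as in the short sketch below; the class names and label lists are hypothetical placeholders, not data from the paper.

```python
# Accuracy over all samples; precision and recall for one class treated as positive.
def per_class_metrics(y_true, y_pred, positive_class):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive_class and p == positive_class)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive_class and p == positive_class)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive_class and p != positive_class)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0   # of predicted positives, how many are right
    recall = tp / (tp + fn) if tp + fn else 0.0      # of true positives, how many are found
    return accuracy, precision, recall

# Hypothetical labels for a three-brand classification run
y_true = ["brand_a", "brand_a", "brand_b", "brand_c", "brand_b"]
y_pred = ["brand_a", "brand_b", "brand_b", "brand_c", "brand_b"]
print(per_class_metrics(y_true, y_pred, "brand_a"))  # accuracy, precision, recall for brand_a
```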