Auto-Fit: A Human-Machine Collaboration Feature for Fitting Bounding Box Annotations

Meygen D. Cruz, J. Keh, Maverick Rivera, N. Velasco, John Anthony C. Jose, E. Sybingco, E. Dadios, Wira Madria, Angelimarie Miguel

2020 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment, and Management (HNICEM), December 3, 2020. DOI: 10.1109/HNICEM51456.2020.9400067
Large, high-quality annotated datasets are essential for training deep learning models, but they are expensive and time-consuming to create. A large share of annotation time goes into adjusting bounding boxes to fit the desired object. In this paper, we propose facilitating human-machine collaboration through an Auto-Fit feature that automatically tightens an initial bounding box around the object being annotated. The challenge lies in making this feature class-agnostic, so that it can be used regardless of the type of object being annotated. This is achieved by using computer vision algorithms to extract the desired object as a foreground mask, determine the coordinates of its extremities, and redraw the bounding box from these new coordinates. The best results were achieved with the GrabCut algorithm, which attained an accuracy of 84.69% on small boxes. The PyTorch implementation of ResNet-101 pre-trained on the COCO train2017 dataset is also used as a foreground extractor in one iteration of the implementation, to provide a baseline comparison between a computer vision-based solution and one based on a standalone object detection model. The latter attained an accuracy of 83.04% on small boxes, showing that the computer vision-based solution can surpass the accuracy of a standalone object detection model.
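The abstract describes a three-step loop: extract the object inside the annotator's loose box as a foreground mask, find the mask's extremities, and redraw the box from those coordinates. Below is a minimal sketch of the GrabCut variant using OpenCV; the function name auto_fit and the iteration count of 5 are illustrative assumptions, not details taken from the paper.

```python
import cv2
import numpy as np

def auto_fit(image, box):
    """Tighten a loose (x, y, w, h) box around the object it contains."""
    mask = np.zeros(image.shape[:2], np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)  # GrabCut's GMM scratch space
    fgd_model = np.zeros((1, 65), np.float64)

    # Seed GrabCut with the annotator's initial box; everything outside
    # the box is treated as definite background.
    cv2.grabCut(image, mask, box, bgd_model, fgd_model, 5,
                cv2.GC_INIT_WITH_RECT)

    # Keep definite and probable foreground as the object mask.
    fg = ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)

    ys, xs = np.nonzero(fg)
    if xs.size == 0:  # no foreground found; fall back to the input box
        return box

    # Redraw the box from the mask's extremities.
    x0, x1 = int(xs.min()), int(xs.max())
    y0, y1 = int(ys.min()), int(ys.max())
    return (x0, y0, x1 - x0 + 1, y1 - y0 + 1)

img = cv2.imread("sample.jpg")            # any annotated image (hypothetical path)
print(auto_fit(img, (40, 30, 200, 160)))  # loose box in, tight box out
```

Because GrabCut separates foreground from background purely from the color statistics seeded by the initial box, it needs no class labels, which is what makes such a feature class-agnostic.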