{"title":"TRC-YOLO: A real-time detection method for lightweight targets based on mobile devices","authors":"Guanbo Wang, H. Ding, Zhijun Yang, Bo Li, Yihao Wang, Liyong Bao","doi":"10.1049/cvi2.12072","DOIUrl":"https://doi.org/10.1049/cvi2.12072","url":null,"abstract":"","PeriodicalId":301341,"journal":{"name":"IET Comput. Vis.","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133836561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BDC: Bounding-Box Deep Calibration for High Performance Face Detection","authors":"Shi Luo, Xiong-fei Li, Xiaoli Zhang","doi":"10.1049/cvi2.12122","DOIUrl":"https://doi.org/10.1049/cvi2.12122","url":null,"abstract":"The Fundamental Research Funds for the Central Universities, JLU; The Graduate Innovation Fund of Jilin University; The ‘Thirteenth Five‐Year Plan’ Scientific Research Planning Project of the Education Department of Jilin Province, Grant/Award Numbers: JKH20200678KJ, JJKH20200997KJ; The National Key Research and Development Project of China, Grant/Award Number: 2019YFC0409105; The National Natural Science Foundation of China, Grant/Award Number: 61801190; The Industrial Technology Research and Development Funds of Jilin Province, Grant/Award Number: 2019C054‐3 Abstract Modern convolutional neural network (CNN)‐based face detectors have made tremendous strides thanks to large annotated datasets. However, misaligned results with high detection confidence but low localization accuracy restrict further improvement of detection performance. In this paper, the authors first predict high‐confidence detection results on the training set itself. Surprisingly, a considerable proportion of them exhibit the same misalignment problem. The authors then carefully examine these cases and point out that annotation misalignment is the main reason. A comprehensive discussion is then given on the rationale for replacing annotated bounding‐boxes with predicted ones. Finally, the authors propose a novel Bounding‐Box Deep Calibration (BDC) method to reasonably replace misaligned annotations with model‐predicted bounding‐boxes and offer calibrated annotations for the training set. Extensive experiments on multiple detectors and two popular benchmark datasets show the effectiveness of BDC in improving models' precision and recall, without adding extra inference time or memory consumption. Our simple and effective method provides a general strategy for improving face detection, especially for light‐weight detectors in real‐time situations.","PeriodicalId":301341,"journal":{"name":"IET Comput. Vis.","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114144104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
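The replacement step that the BDC abstract describes, swapping a misaligned annotation for a confident, well-overlapping model prediction, can be sketched as follows. The function names, confidence threshold, and IoU threshold are illustrative choices of ours, not values taken from the paper:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def calibrate(annotations, predictions, scores, conf_th=0.9, iou_th=0.5):
    """Replace each annotated box with the first high-confidence predicted
    box that overlaps it enough; otherwise keep the original annotation."""
    out = []
    for ann in annotations:
        best = ann
        for box, s in zip(predictions, scores):
            if s >= conf_th and iou(ann, box) >= iou_th:
                best = box
                break
        out.append(best)
    return out
```

A full implementation would also have to handle many-to-one matches and unmatched predictions; this sketch only conveys the calibration idea.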
{"title":"End-to-end global to local convolutional neural network learning for hand pose recovery in depth data","authors":"Meysam Madadi, Sergio Escalera, Xavier Baró, Jordi Gonzàlez","doi":"10.1049/cvi2.12064","DOIUrl":"https://doi.org/10.1049/cvi2.12064","url":null,"abstract":"","PeriodicalId":301341,"journal":{"name":"IET Comput. Vis.","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-08-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120248542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contour Loss for Instance Segmentation via k-step Distance Transformation Image","authors":"Xiaolong Guo, Xiaosong Lan, Kunfeng Wang, Shuxiao Li","doi":"10.1049/cvi2.12114","DOIUrl":"https://doi.org/10.1049/cvi2.12114","url":null,"abstract":"Instance segmentation aims to locate targets in an image and segment each target area at the pixel level, one of the most important tasks in computer vision. Mask R-CNN is a classic instance segmentation method, but we find that its predicted masks are unclear and inaccurate near contours. To cope with this problem, we draw on the idea of contour matching based on the distance transformation image and propose a novel loss function, called contour loss. Contour loss is designed to specifically optimize the contour parts of the predicted masks and thus yields more accurate instance segmentation. To allow the proposed contour loss to be jointly trained under modern neural network frameworks, we design a differentiable k-step distance transformation image calculation module, which can approximately compute truncated distance transformation images of the predicted mask and the corresponding ground-truth mask online. The proposed contour loss can be integrated into existing instance segmentation methods such as Mask R-CNN and combined with their original loss functions without modifying the inference network structure, and thus has strong versatility. Experimental results on COCO show that contour loss is effective and can further improve instance segmentation performance.","PeriodicalId":301341,"journal":{"name":"IET Comput. Vis.","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124964834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
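The k-truncated distance transformation image that the contour-loss abstract relies on can be illustrated with a small sketch: each background pixel gets its 4-connected step distance to the mask, clipped at k. The paper's module is a differentiable approximation built from network operations; this plain NumPy version, with a function name of our choosing, only shows what the quantity is:

```python
import numpy as np

def k_step_distance_transform(mask, k):
    """4-connected step distance from each pixel to the mask, truncated at k.

    mask: 2-D array, nonzero inside the object; foreground pixels get 0,
    background pixels farther than k steps keep the value k.
    """
    dist = np.where(mask > 0, 0, k).astype(float)
    cur = mask > 0
    for step in range(1, k):
        # grow the region by one 4-neighbourhood dilation step
        grown = cur.copy()
        grown[1:, :] |= cur[:-1, :]
        grown[:-1, :] |= cur[1:, :]
        grown[:, 1:] |= cur[:, :-1]
        grown[:, :-1] |= cur[:, 1:]
        dist[grown & ~cur] = step  # pixels first reached at this step
        cur = grown
    return dist
```

The contour loss then compares such truncated transforms of the predicted and ground-truth masks, so that errors near the contour are penalized explicitly.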
{"title":"Robustness of ToF and stereo fusion for high-accuracy depth map","authors":"Zhenshan Bao, Bowen Li, Wen-bo Zhang","doi":"10.12783/dtcse/ammms2018/27315","DOIUrl":"https://doi.org/10.12783/dtcse/ammms2018/27315","url":null,"abstract":"Depth maps are used in many applications such as robotic navigation, autonomous driving, video production and 3D reconstruction. Currently, active depth cameras and passive stereo vision systems are the two main technical means of obtaining depth information, but each system alone has its own limitations. In this paper, a Time-of-Flight (ToF) and stereo fusion framework is proposed to overcome the limitations of these two systems. The scheme contains \"the prior fusion stage\" and \"the post fusion stage\". In \"the prior fusion stage\", depth information from the ToF camera is used to design an energy function that boosts the stereo matching of the passive system. In \"the post fusion stage\", a weighting function is designed according to the depth map from stereo matching and the credibility map of the ToF depth map, and adaptive weighted depth fusion is then performed. Experimental results clearly show that our method obtains high-precision, high-resolution depth maps with better robustness than other fusion approaches.","PeriodicalId":301341,"journal":{"name":"IET Comput. Vis.","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131844100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
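The post-fusion stage described in the abstract, a per-pixel weighted combination of the two depth maps steered by the ToF credibility map, can be sketched minimally. The paper's actual weighting function is more elaborate and is not reproduced here; the function name and the plain convex-combination form are our own illustrative assumptions:

```python
import numpy as np

def fuse_depth(tof_depth, stereo_depth, tof_credibility):
    """Per-pixel weighted fusion of a ToF depth map with a stereo depth map.

    tof_credibility in [0, 1] plays the role of the ToF credibility map:
    where ToF is credible the fused value follows the ToF reading, and
    elsewhere it falls back on the stereo estimate.
    """
    w = np.clip(tof_credibility, 0.0, 1.0)
    return w * tof_depth + (1.0 - w) * stereo_depth
```

For example, a pixel where ToF reads 2 m and stereo reads 4 m with credibility 0.5 fuses to 3 m, while full credibility keeps the ToF value unchanged.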