2022 19th Conference on Robots and Vision (CRV)最新文献_第3页

Understanding the impact of image and input resolution on deep digital pathology patch classifiers 了解图像和输入分辨率对深度数字病理贴片分类器的影响

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-04-29 DOI: 10.1109/CRV55824.2022.00028

Eu Wern Teh, Graham W. Taylor

引用次数: 0

A Simple Method to Boost Human Pose Estimation Accuracy by Correcting the Joint Regressor for the Human3.6m Dataset 基于Human3.6m数据集的联合回归校正提高人体姿态估计精度的简单方法

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-04-29 DOI: 10.1109/CRV55824.2022.00009

Eric Hedlin, Helge Rhodin, K. M. Yi

引用次数: 3

CellDefectNet: A Machine-designed Attention Condenser Network for Electroluminescence-based Photovoltaic Cell Defect Inspection CellDefectNet:一种基于电致发光的光伏电池缺陷检测的机器设计关注聚光镜网络

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-04-25 DOI: 10.1109/CRV55824.2022.00036

Carol Xu, M. Famouri, Gautam Bathla, Saeejith Nair, M. Shafiee, Alexander Wong

{"title":"CellDefectNet: A Machine-designed Attention Condenser Network for Electroluminescence-based Photovoltaic Cell Defect Inspection","authors":"Carol Xu, M. Famouri, Gautam Bathla, Saeejith Nair, M. Shafiee, Alexander Wong","doi":"10.1109/CRV55824.2022.00036","DOIUrl":"https://doi.org/10.1109/CRV55824.2022.00036","url":null,"abstract":"Photovoltaic cells are electronic devices that convert light energy to electricity, forming the backbone of solar energy harvesting systems. An essential step in the manufacturing process for photovoltaic cells is visual quality inspection using electroluminescence imaging to identify defects such as cracks, finger interruptions, and broken cells. A big challenge faced by industry in photovoltaic cell visual inspection is the fact that it is currently done manually by human inspectors, which is extremely time consuming, laborious, and prone to human error. While deep learning approaches holds great potential to automating this inspection, the hardware resource-constrained manufac-turing scenario makes it challenging for deploying complex deep neural network architectures. In this work, we introduce CellDefectNet, a highly efficient attention condenser network designed via machine-driven design exploration specifically for electroluminesence-based photovoltaic cell defect detection on the edge. We demonstrate the efficacy of CellDetectNet on a benchmark dataset comprising of a diversity of photovoltaic cells captured using electroluminescence imagery, achieving an accuracy of $sim 86.3%$ while possessing just 410K parameters $(sim 13times$ lower than EfficientNet-B0, respectively) and $sim 115mathrm{M}$ FLOPs $(sim 12times$ lower than EfficientNet-B0) and $sim 13times$ faster on an ARM Cortex A-72 embedded processor when compared to EfficientNet-B0.","PeriodicalId":131142,"journal":{"name":"2022 19th Conference on Robots and Vision (CRV)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124906812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Improving tracking with a tracklet associator 使用tracklet关联器改进跟踪

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-04-22 DOI: 10.1109/CRV55824.2022.00030

R'emi Nahon, Guillaume-Alexandre Bilodeau, G. Pesant

{"title":"Improving tracking with a tracklet associator","authors":"R'emi Nahon, Guillaume-Alexandre Bilodeau, G. Pesant","doi":"10.1109/CRV55824.2022.00030","DOIUrl":"https://doi.org/10.1109/CRV55824.2022.00030","url":null,"abstract":"Multiple object tracking (MOT) is a task in computer vision that aims to detect the position of various objects in videos and to associate them to a unique identity. We propose an approach based on Constraint Programming $(CP)$ whose goal is to be grafted to any existing tracker in order to improve its object association results. We developed a modular algorithm divided into three independent phases. The first phase consists in recovering the tracklets pro-vided by a base tracker and to cut them at the places where uncertain associations are spotted, for exam-ple, when tracklets overlap, which may cause identity switches. In the second phase, we associate the previ-ously constructed tracklets using a Belief Propagation Constraint Programming algorithm, where we pro-pose various constraints that assign scores to each of the tracklets based on multiple characteristics, such as their dynamics or the distance between them in time and space. Finally, the third phase is a rudimen-tary interpolation model to fill in the remaining holes in the trajectories we built. Experiments show that our model leads to improvements in the results for all three of the state-of-the-art trackers on which we tested it (3 to 4 points gained on HOTA and IDF1).","PeriodicalId":131142,"journal":{"name":"2022 19th Conference on Robots and Vision (CRV)","volume":"08 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133056200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Adaptive Memory Management for Video Object Segmentation 视频对象分割的自适应内存管理

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-04-13 DOI: 10.1109/CRV55824.2022.00018

Ali Pourganjalikhan, Charalambos (Charis) Poullis

{"title":"Adaptive Memory Management for Video Object Segmentation","authors":"Ali Pourganjalikhan, Charalambos (Charis) Poullis","doi":"10.1109/CRV55824.2022.00018","DOIUrl":"https://doi.org/10.1109/CRV55824.2022.00018","url":null,"abstract":"Matching-based networks have achieved state-of-the-art performance for video object segmentation (VOS) tasks by storing every-k frames in an external memory bank for future inference. Storing the intermediate frames' predictions provides the network with richer cues for segmenting an object in the current frame. However, the size of the memory bank gradually increases with the length of the video, which slows down inference speed and makes it impractical to handle arbitrary length videos. This paper proposes an adaptive memory bank strategy for matching-based networks for semi-supervised video object segmentation (VOS) that can handle videos of arbitrary length by discarding obsolete features. Features are indexed based on their importance in the segmentation of the objects in previous frames. Based on the index, we discard unimportant features to accommodate new features. We present our experiments on DAVIS 2016, DAVIS 2017, and Youtube-VOS that demonstrate that our method outperforms state-of-the-art that employ first-and-latest strategy with fixed-sized memory banks and achieves comparable performance to the every-k strategy with increasing-sized memory banks. Furthermore, experiments show that our method increases inference speed by up to 80% over the every-k and 35% over first-and-latest strategies.","PeriodicalId":131142,"journal":{"name":"2022 19th Conference on Robots and Vision (CRV)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131397955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Monocular Robot Navigation with Self-Supervised Pretrained Vision Transformers 基于自监督预训练视觉变压器的单目机器人导航

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2022-03-07 DOI: 10.1109/CRV55824.2022.00033

Miguel A. Saavedra-Ruiz, Sacha Morin, L. Paull

引用次数: 2

Attention based Occlusion Removal for Hybrid Telepresence Systems 基于注意力的混合网真系统遮挡去除

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2021-12-02 DOI: 10.1109/CRV55824.2022.00029

Surabhi Gupta, Ashwath Shetty, Avinash Sharma

引用次数: 2

M2A: Motion Aware Attention for Accurate Video Action Recognition M2A:精确视频动作识别的动作感知注意力

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2021-11-18 DOI: 10.1109/CRV55824.2022.00019

Brennan Gebotys, Alexander Wong, David A Clausi

{"title":"M2A: Motion Aware Attention for Accurate Video Action Recognition","authors":"Brennan Gebotys, Alexander Wong, David A Clausi","doi":"10.1109/CRV55824.2022.00019","DOIUrl":"https://doi.org/10.1109/CRV55824.2022.00019","url":null,"abstract":"Advancements in attention mechanisms have led to significant performance improvements in a variety of areas in machine learning due to its ability to enable the dynamic modeling of temporal sequences. A particular area in computer vision that is likely to benefit greatly from the incorporation of attention mechanisms in video action recognition. However, much of the current research's focus on attention mechanisms have been on spatial and temporal attention, which are unable to take advantage of the inherent motion found in videos. Motivated by this, we develop a new attention mechanism called Motion Aware Attention (M2A) that explicitly incorporates motion characteris-tics. More specifically, M2A extracts motion information between consecutive frames and utilizes attention to focus on the motion patterns found across frames to accurately recognize actions in videos. The proposed M2A mechanism is simple to implement and can be easily incorporated into any neural network backbone architecture. We show that incorporating motion mechanisms with attention mechanisms using the proposed M2A mechanism can lead to a $+15%$ to $+26%$ improvement in top-1 accuracy across different backbone architectures, with only a small in-crease in computational complexity. We further compared the performance of M2A with other state-of-the-art motion and at-tention mechanisms on the Something-Something V1 video action recognition benchmark. Experimental results showed that M2A can lead to further improvements when combined with other temporal mechanisms and that it outperforms other motion-only or attention-only mechanisms by as much as $+60%$ in top-1 accuracy for specific classes in the benchmark. We make our code available at: https://github.com/gebob19/M2A.","PeriodicalId":131142,"journal":{"name":"2022 19th Conference on Robots and Vision (CRV)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125263242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Temporal Convolutions for Multi-Step Quadrotor Motion Prediction 多步四旋翼运动预测的时间卷积

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2021-10-08 DOI: 10.1109/CRV55824.2022.00013

Sam Looper, Steven L. Waslander

引用次数: 1

ROS-X-Habitat: Bridging the ROS Ecosystem with Embodied AI ROS- x - habitat:连接ROS生态系统与嵌入式人工智能

2022 19th Conference on Robots and Vision (CRV) Pub Date : 2021-09-16 DOI: 10.1109/CRV55824.2022.00012

Guanxiong Chen, Haoyu Yang, Ian M. Mitchell

引用次数: 3