{"title":"MART: Motion-Aware Recurrent Neural Network for Robust Visual Tracking","authors":"Heng Fan, Haibin Ling","doi":"10.1109/WACV48630.2021.00061","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00061","url":null,"abstract":"We introduce MART, Motion-Aware Recurrent neural network (MA-RNN) for Tracking, by modeling robust long-term spatial-temporal representation. In particular, we propose a simple, yet effective context-aware displacement attention (CADA) module to capture target motion in videos. By seamlessly integrating CADA into RNN, the proposed MA-RNN can spatially align and aggregate temporal information guided by motion from frame to frame, leading to more effective representation that benefits a tracker from motion when handling occlusion, deformation, viewpoint change etc. Moreover, to deal with scale change, we present a monotonic bounding box regression (mBBR) approach that iteratively predicts regression offsets for target object under the guidance of intersection-over-union (IoU) score, guaranteeing non-decreasing accuracy. In extensive experiments on five benchmarks, including GOT-10k, LaSOT, TC-128, OTB-15 and VOT-19, our tracker MART consistently achieves state-of-the-art results and runs in real-time.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128446941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervoxel Attention Graphs for Long-Range Video Modeling","authors":"Yang Wang, Gedas Bertasius, Tae-Hyun Oh, A. Gupta, Minh Hoai, L. Torresani","doi":"10.1109/WACV48630.2021.00020","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00020","url":null,"abstract":"A significant challenge in video understanding is posed by the high dimensionality of the input, which induces large computational cost and high memory footprints. Deep convolutional models operating on video apply pooling and striding to reduce feature dimensionality and to increase the receptive field. However, despite these strategies, modern approaches cannot effectively leverage spatiotemporal structure over long temporal extents. In this paper we introduce an approach that reduces a video of 10 seconds to a sparse graph of only 160 feature nodes such that efficient inference in this graph produces state-of-the-art accuracy on challenging action recognition datasets. The nodes of our graph are semantic supervoxels that capture the spatiotemporal structure of objects and motion cues in the video, while edges between nodes encode spatiotemporal relations and feature similarity. We demonstrate that a shallow network that interleaves graph convolution and graph pooling on this compact representation implements an effective mechanism of relational reasoning yielding strong recognition results on both Charades and Something-Something.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126765784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Style Consistent Image Generation for Nuclei Instance Segmentation","authors":"Xuan Gong, Shuyan Chen, Baochang Zhang, D. Doermann","doi":"10.1109/WACV48630.2021.00404","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00404","url":null,"abstract":"In medical image analysis, one limitation of the application of machine learning is the insufficient amount of data with detailed annotation, due primarily to high cost. Another impediment is the domain gap observed between images from different organs and different collections. The differences are even more challenging for the nuclei instance segmentation, where images have significant nuclei stain distribution variations and complex pleomorphisms (sizes and shapes). In this work, we generate style consistent histopathology images for nuclei instance segmentation. We set up a novel instance segmentation framework that integrates a generator and discriminator into the segmentation pipeline with adversarial training to generalize nuclei instances and texture patterns. A segmentation net detects and segments both real nuclei and synthetic nuclei and provides feedback so that the generator can synthesize images that can boost the segmentation performance. Experimental results on three public nuclei datasets indicate that our proposed method outperforms previous nuclei segmentation methods.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"184 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123188854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self Supervision for Attention Networks","authors":"Badri N. Patro, Vinay P. Namboodiri","doi":"10.1109/WACV48630.2021.00077","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00077","url":null,"abstract":"In recent years, the attention mechanism has become a fairly popular concept and has proven to be successful in many machine learning applications. However, deep learning models do not employ supervision for these attention mechanisms which can improve the model’s performance significantly. Therefore, in this paper, we tackle this limitation and propose a novel method to improve the attention mechanism by inducing \"self-supervision\". We devise a technique to generate desirable attention maps for any model that utilizes an attention module. This is achieved by examining the model’s output for different regions sampled from the input and obtaining the attention probability distributions that enhance the proficiency of the model. The attention distributions thus obtained are used for supervision. We rely on the fact, that attenuation of the unimportant parts, allows a model to attend to more salient regions, thus strengthening the prediction accuracy. The quantitative and qualitative results published in this paper show that this method successfully improves the attention mechanism as well as the model’s accuracy. In addition to the task of Visual Question Answering(VQA), we also show results on the task of Image classification and Text classification to prove that our method can be generalized to any vision and language model that uses an attention module.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"17 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114024705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intra-class Part Swapping for Fine-Grained Image Classification","authors":"Lianbo Zhang, Shaoli Huang, Wei Liu","doi":"10.1109/WACV48630.2021.00325","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00325","url":null,"abstract":"Recent works such as Mixup and CutMix have demonstrated the effectiveness of augmenting training data for deep models. These methods generate new data by generally blending random image contents and mixing their labels proportionally. However, this strategy tends to produce unreasonable training samples for fine-grained recognition, leading to limited improvement. This is because mixing random image contents may potentially produce images containing destructed object structures. Further, as the category differences mainly reside in small part regions, mixing labels proportionally to the number of mixed pixels might result in label noisy problem. To augment more reasonable training data, we propose Intra-class Part Swapping (InPS) that produces new data by performing attention-guided content swapping on input pairs from the same class. Compared with previous approaches, InPS avoids introducing noisy labels and ensures a likely holistic structure of objects in generated images. We demonstrate InPS outperforms the most recent augmentation approaches in both fine-grained recognition and weakly object localization. Further, by simply incorporating the mid-level feature learning, our proposed method achieves state-of-the-art performance in the literature while maintaining the simplicity and inference efficiency. Our code is publicly available†.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114290146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ChartOCR: Data Extraction from Charts Images via a Deep Hybrid Framework","authors":"Junyu Luo, Zekun Li, Jinpeng Wang, Chin-Yew Lin","doi":"10.1109/WACV48630.2021.00196","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00196","url":null,"abstract":"Chart images are commonly used for data visualization. Automatically reading the chart values is a key step for chart content understanding. Charts have a lot of variations in style (e.g. bar chart, line chart, pie chart and etc.), which makes pure rule-based data extraction methods difficult to handle. However, it is also improper to directly apply end- to-end deep learning solutions since these methods usually deal with specific types of charts. In this paper, we propose an unified method ChartOCR to extract data from various types of charts. We show that by combing deep framework and rule-based methods, we can achieve a satisfying generalization ability and obtain accurate and semantic-rich intermediate results. Our method extracts the key points that define the chart components. By adjusting the prior rules, the framework can be applied to different chart types. Experiments show that our method achieves state-of-the- art performance with fast processing speed on two public datasets. Besides, we also introduce and evaluate on a large dataset ExcelChart400K for training deep models on chart images. The code and the dataset are publicly available at https://github.com/soap117/DeepRule.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115919473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Defect-GAN: High-Fidelity Defect Synthesis for Automated Defect Inspection","authors":"Gongjie Zhang, Kaiwen Cui, Tzu-Yi Hung, Shijian Lu","doi":"10.1109/WACV48630.2021.00257","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00257","url":null,"abstract":"Automated defect inspection is critical for effective and efficient maintenance, repair, and operations in advanced manufacturing. On the other hand, automated defect inspection is often constrained by the lack of defect samples, especially when we adopt deep neural networks for this task. This paper presents Defect-GAN, an automated defect synthesis network that generates realistic and diverse defect samples for training accurate and robust defect inspection networks. Defect-GAN learns through defacement and restoration processes, where the defacement generates defects on normal surface images while the restoration removes defects to generate normal images. It employs a novel compositional layer-based architecture for generating realistic defects within various image backgrounds with different textures and appearances. It can also mimic the stochastic variations of defects and offer flexible control over the locations and categories of the generated defects within the image background. Extensive experiments show that Defect-GAN is capable of synthesizing various defects with superior diversity and fidelity. In addition, the synthesized defect samples demonstrate their effectiveness in training better defect inspection networks.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"09 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122121894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RODNet: Radar Object Detection using Cross-Modal Supervision","authors":"Yizhou Wang, Zhongyu Jiang, Xiangyu Gao, Jenq-Neng Hwang, Guanbin Xing, Hui Liu","doi":"10.1109/WACV48630.2021.00055","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00055","url":null,"abstract":"Radar is usually more robust than the camera in severe driving scenarios, e.g., weak/strong lighting and bad weather. However, unlike RGB images captured by a camera, the semantic information from the radar signals is noticeably difficult to extract. In this paper, we propose a deep radar object detection network (RODNet), to effectively detect objects purely from the carefully processed radar frequency data in the format of range-azimuth frequency heatmaps (RAMaps). Three different 3D autoencoder based architectures are introduced to predict object confidence distribution from each snippet of the input RAMaps. The final detection results are then calculated using our post-processing method, called location-based non-maximum suppression (L-NMS). Instead of using burdensome human-labeled ground truth, we train the RODNet using the annotations generated automatically by a novel 3D localization method using a camera-radar fusion (CRF) strategy. To train and evaluate our method, we build a new dataset – CRUW, containing synchronized videos and RAMaps in various driving scenarios. After intensive experiments, our RODNet shows favorable object detection performance without the presence of the camera.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128573981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Facial Expression Recognition in the Wild via Deep Attentive Center Loss","authors":"A. Farzaneh, Xiaojun Qi","doi":"10.1109/WACV48630.2021.00245","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00245","url":null,"abstract":"Learning discriminative features for Facial Expression Recognition (FER) in the wild using Convolutional Neural Networks (CNNs) is a non-trivial task due to the significant intra-class variations and inter-class similarities. Deep Metric Learning (DML) approaches such as center loss and its variants jointly optimized with softmax loss have been adopted in many FER methods to enhance the discriminative power of learned features in the embedding space. However, equally supervising all features with the metric learning method might include irrelevant features and ultimately degrade the generalization ability of the learning algorithm. We propose a Deep Attentive Center Loss (DACL) method to adaptively select a subset of significant feature elements for enhanced discrimination. The proposed DACL integrates an attention mechanism to estimate attention weights correlated with feature importance using the intermediate spatial feature maps in CNN as context. The estimated weights accommodate the sparse formulation of center loss to selectively achieve intra-class compactness and inter-class separation for the relevant information in the embedding space. An extensive study on two widely used wild FER datasets demonstrates the superiority of the proposed DACL method compared to state-of-the-art methods.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116198547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Foreground color prediction through inverse compositing","authors":"Sebastian Lutz, A. Smolic","doi":"10.1109/WACV48630.2021.00165","DOIUrl":"https://doi.org/10.1109/WACV48630.2021.00165","url":null,"abstract":"In natural image matting, the goal is to estimate the opacity of the foreground object in the image. This opacity controls the way the foreground and background is blended in transparent regions. In recent years, advances in deep learning have led to many natural image matting algorithms that have achieved outstanding performance in a fully automatic manner. However, most of these algorithms only predict the alpha matte from the image, which is not sufficient to create high-quality compositions. Further, it is not possible to manually interact with these algorithms in any way except by directly changing their input or output. We propose a novel recurrent neural network that can be used as a post-processing method to recover the foreground and background colors of an image, given an initial alpha estimation. Our method outperforms the state-of-the-art in color estimation for natural image matting and show that the recurrent nature of our method allows users to easily change candidate solutions that lead to superior color estimations.","PeriodicalId":236300,"journal":{"name":"2021 IEEE Winter Conference on Applications of Computer Vision (WACV)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121797571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}