2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW): Latest Publications

Single Patch Based 3D High-Fidelity Mask Face Anti-Spoofing
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00099
Samuel Huang, Wen-Huang Cheng, Robert Cheng
Abstract: Face anti-spoofing is rapidly increasing in importance as facial recognition systems have become common in the financial and security fields. Among all kinds of attacks, 3D high-fidelity masks are especially hard to defend against. Recently, CASIA introduced a large-scale dataset, CASIA-SURF HiFiMask, which comprises 54,600 videos recorded from 75 subjects wearing 225 high-fidelity masks. In this paper, we design a lightweight network with single-patch input on the basis of CDCN++ and supervise it with focal loss. The proposed method achieves an Average Classification Error Rate (ACER) of 3.215 on Protocol 3 of the CASIA-SURF HiFiMask dataset and ranks as the third-best model in the Chalearn 3D High-Fidelity Mask Face Presentation Attack Detection Challenge at ICCV 2021.
Citations: 2
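The abstract above mentions supervising a lightweight CDCN++-based patch classifier with focal loss. Below is a minimal PyTorch sketch of binary focal loss; the class-balancing weight `alpha` and focusing parameter `gamma` use the common defaults and are assumptions (the paper's exact settings are not given here), and the backbone itself is not reproduced.

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Focal loss for binary live-vs-spoof classification.

    logits:  raw scores of shape (N,); targets: 0/1 labels of shape (N,).
    alpha and gamma follow the usual defaults from Lin et al.; the paper's
    actual hyperparameters may differ.
    """
    probs = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets.float(), reduction="none")
    p_t = probs * targets + (1 - probs) * (1 - targets)      # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)  # class-balancing weight
    loss = alpha_t * (1 - p_t) ** gamma * ce                 # down-weight easy examples
    return loss.mean()

# toy usage
logits = torch.randn(8)
labels = torch.randint(0, 2, (8,))
print(binary_focal_loss(logits, labels))
```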
Manipulating Image Style Transformation via Latent-Space SVM
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00218
Qiudan Wang
Abstract: Deep Neural Networks have proven to be the go-to approach for modeling data distributions in a latent space, especially in Neural Style Transfer (NST), which casts a specific style extracted from a source image onto another target image by calibrating the style and content information in a latent space. While existing methods focus on different ways to extract features that more precisely describe style or content information to improve existing NST pipelines, the latent space of the NST model has not been well explored. In this paper, we show that different half-spaces in the latent space are actually associated with particular styles of a network's generated images. The corresponding constraints of these half-spaces can be computed using linear classifiers, e.g., a Support Vector Machine (SVM). Leveraging this understanding of the relation between half-spaces in the latent space and output style, we propose Linear Modification for Latent Representations (LMLR), a method that effectively increases or decreases the level of stylization in the output image for any given NST model. We empirically evaluate our method on several state-of-the-art NST models and show that LMLR can manipulate the level of stylization in the output image.
Citations: 0
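The LMLR idea described above (separating latent codes by style level with a linear classifier, then shifting codes along the hyperplane normal) can be illustrated with a small sketch. The code below is a generic latent-space SVM edit, not the paper's implementation; the latent dimensionality, step size `alpha`, and the way style labels are obtained are all assumptions.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Assume we have latent codes (e.g., from an NST model's bottleneck) that were
# labeled as "weakly stylized" (0) or "strongly stylized" (1) by some criterion.
latents = rng.normal(size=(200, 64))
labels = (latents[:, 0] + 0.3 * rng.normal(size=200) > 0).astype(int)

# Fit a linear SVM; its weight vector is the normal of the separating hyperplane.
svm = LinearSVC(C=1.0, max_iter=10000).fit(latents, labels)
direction = svm.coef_[0] / np.linalg.norm(svm.coef_[0])

def edit_style(z, alpha):
    """Move a latent code along the style direction.

    Positive alpha pushes toward the 'strongly stylized' half-space,
    negative alpha pushes away from it.
    """
    return z + alpha * direction

z = latents[0]
print(svm.decision_function([z, edit_style(z, +2.0), edit_style(z, -2.0)]))
```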
Cross-modal Relational Reasoning Network for Visual Question Answering
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00441
Hongyu Chen, Ruifang Liu, Bo Peng
Abstract: Visual Question Answering (VQA) is a challenging task that requires a cross-modal understanding of images and questions, with relational reasoning leading to the correct answer. To bridge the semantic gap between these two modalities, previous works focus on word-region alignments over all possible pairs without paying more attention to the corresponding word and object. Treating all pairs equally, without considering relation consistency, hinders the model's performance. In this paper, to align relation-consistent pairs and improve the interpretability of VQA systems, we propose a Cross-modal Relational Reasoning Network (CRRN) that masks the inconsistent attention map and highlights the full latent alignments of corresponding word-region pairs. Specifically, we present two relational masks for inter-modal and intra-modal highlighting, inferring the more and less important words in sentences or regions in images. The attention interrelationship of consistent pairs can be enhanced by shifting the learning focus through masking of the unaligned relations. We then propose two novel losses, ℒ_CMAM and ℒ_SMAM, with explicit supervision to capture the fine-grained interplay between vision and language. We conduct thorough experiments that demonstrate the effectiveness of the approach, reaching a competitive 61.74% on the GQA benchmark.
Citations: 3
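The relational masking described above amounts to suppressing attention scores for word-region pairs judged inconsistent. The sketch below shows generic masked cross-attention between word and region features; it is illustrative only, and the mask here is a placeholder (the paper derives its masks from relation consistency, which is not reproduced).

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(words, regions, mask):
    """Attend words over image regions, suppressing masked pairs.

    words:   (num_words, d) question token features
    regions: (num_regions, d) image region features
    mask:    (num_words, num_regions) with 1 for pairs to keep, 0 to suppress
    """
    d = words.size(-1)
    scores = words @ regions.t() / d ** 0.5                  # raw word-region affinities
    scores = scores.masked_fill(mask == 0, float("-inf"))    # drop inconsistent pairs
    attn = F.softmax(scores, dim=-1)                         # normalized attention per word
    return attn @ regions                                    # region-aware word features

words = torch.randn(6, 32)
regions = torch.randn(10, 32)
mask = torch.ones(6, 10)
mask[0, 5:] = 0  # pretend the last regions are inconsistent with word 0
print(masked_cross_attention(words, regions, mask).shape)  # torch.Size([6, 32])
```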
DriPE: A Dataset for Human Pose Estimation in Real-World Driving Settings
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00321
Romain Guesdon, C. Crispim, L. Tougne
Abstract: The task of 2D human pose estimation has seen a significant gain in performance with the advent of deep learning. This task aims to estimate the body keypoints of people in an image or a video. However, real-life applications of such methods bring new challenges that are under-represented in general-context datasets. For instance, driver status monitoring in consumer road vehicles introduces new difficulties, like self- and background body-part occlusions, varying illumination conditions, cramped view angles, etc. These monitoring conditions are currently absent from general-purpose datasets. This paper proposes two main contributions. First, we introduce DriPE (Driver Pose Estimation), a new dataset to foster the development and evaluation of methods for human pose estimation of drivers in consumer vehicles. This is the first publicly available dataset depicting drivers in real scenes. It contains 10k images of 19 different driver subjects, manually annotated with human body keypoints and an object bounding box. Second, we propose a new keypoint-based metric for human pose estimation. This metric highlights the limitations of current metrics for HPE evaluation and of current deep neural networks for pose estimation, on both general and driving-related datasets.
Citations: 10
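As a point of reference for keypoint-based metrics like the one proposed above, the sketch below computes the standard PCK (Percentage of Correct Keypoints). This is not the paper's new metric, only the common baseline such a metric would be compared against; the bounding-box normalization and the 0.2 threshold are assumptions.

```python
import numpy as np

def pck(pred, gt, visible, bbox_size, thresh=0.2):
    """Percentage of Correct Keypoints, normalized by bounding-box size.

    pred, gt:   (num_keypoints, 2) predicted / ground-truth (x, y) positions
    visible:    (num_keypoints,) boolean mask of annotated keypoints
    bbox_size:  scalar used to normalize distances (e.g., max of box width/height)
    A keypoint counts as correct if its error is below thresh * bbox_size.
    """
    dists = np.linalg.norm(pred - gt, axis=1)
    correct = (dists < thresh * bbox_size) & visible
    return correct.sum() / max(visible.sum(), 1)

gt = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
pred = gt + np.array([[1.0, 1.0], [8.0, 0.0], [0.5, -0.5]])
print(pck(pred, gt, visible=np.array([True, True, True]), bbox_size=50.0))
```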
SketchyDepth: from Scene Sketches to RGB-D Images
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00274
G. Berardi, Samuele Salti, L. D. Stefano
Abstract: Sketch-based content generation is a creative and fun activity, suited to both casual and professional users, with many different applications. Today it is possible to generate the geometry and appearance of a single object by sketching it, yet only the appearance can be synthesized from a sketch of a whole scene. In this paper we propose the first method to generate both the depth map and the image of a whole scene from a sketch. We demonstrate how generating geometrical information as a depth map is beneficial from a twofold perspective. On the one hand, it improves the quality of the image synthesized from the sketch. On the other, it unlocks depth-enabled creative effects like bokeh, fog, light variation, 3D photos and many others, which help enhance the final output in a controlled way. We validate our method by showing that generating depth maps directly from sketches produces better qualitative results than alternative methods, i.e., running MiDaS after image generation. Finally we introduce depth sketching, a depth manipulation technique to further condition image generation without the need for additional annotation or training.
Citations: 0
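One of the depth-enabled effects mentioned above, bokeh, can be approximated by blending a sharp image with a blurred copy according to each pixel's distance from a chosen focal depth. The snippet below is a simplified illustration of that idea, not the paper's rendering pipeline; the blur strength and focal plane are arbitrary.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fake_bokeh(image, depth, focal_depth, sigma=5.0):
    """Depth-dependent blur: pixels far from focal_depth take the blurred image.

    image: (H, W, 3) float array in [0, 1]; depth: (H, W) float array.
    """
    blurred = np.stack([gaussian_filter(image[..., c], sigma) for c in range(3)], axis=-1)
    # Blend weight grows with distance from the focal plane (clipped to [0, 1]).
    weight = np.clip(np.abs(depth - focal_depth) / (depth.max() - depth.min() + 1e-8), 0, 1)
    return image * (1 - weight[..., None]) + blurred * weight[..., None]

h, w = 64, 64
image = np.random.rand(h, w, 3)
depth = np.tile(np.linspace(0.0, 10.0, w), (h, 1))  # depth increasing left to right
out = fake_bokeh(image, depth, focal_depth=2.0)
print(out.shape, out.min() >= 0, out.max() <= 1)
```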
Where Did I See It? Object Instance Re-Identification with Attention
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00038
Vaibhav Bansal, G. Foresti, N. Martinel
Abstract: Existing methods dealing with object instance re-identification (OIRe-ID) look for the best visual-feature match of a target object within a set of frames. Due to the nature of the problem, relying only on the visual appearance of object instances is likely to produce many false matches when there are multiple objects with similar appearance, or multiple instances of the same object class, present in the scene. We focus on a rigid scene setup and, to limit the negative effects of the aforementioned cases, we propose to exploit the background information. We believe this is particularly helpful in a rigid environment with many reoccurring identical object models, since it provides rich context information. We introduce an attention-based mechanism into the existing Mask R-CNN architecture so that we learn to encode the important and distinct information in the background jointly with the foreground features relevant to rigid real-world scenarios. To evaluate the proposed approach, we run compelling experiments on the ScanNet dataset. Results demonstrate that we significantly outperform several baselines and state-of-the-art methods.
Citations: 4
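The attention-based mechanism referenced above is not specified in detail in the abstract; below is a generic spatial-attention block of the kind commonly attached to CNN feature maps such as those produced inside Mask R-CNN. It is a schematic stand-in, not the authors' module.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Reweights a feature map with a learned per-location attention mask."""

    def __init__(self, in_channels):
        super().__init__()
        # 1x1 conv collapses channels to a single attention logit per location.
        self.score = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, x):
        attn = torch.sigmoid(self.score(x))   # (N, 1, H, W) in [0, 1]
        return x * attn, attn                 # attended features + the mask itself

features = torch.randn(2, 256, 32, 32)        # e.g., one backbone feature level
attended, mask = SpatialAttention(256)(features)
print(attended.shape, mask.shape)
```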
Real-Time Cell Counting in Unlabeled Microscopy Images
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00083
Yuang Zhu, Zhao Chen, Yuxin Zheng, Qinghua Zhang, Xuan Wang
Abstract: Deep learning is widely applied to cell counting in microscopy images. However, most existing cell counting models are fully supervised and trained offline. They adopt the usual training-testing framework, in which the models are trained in advance to infer the numbers of cells in test images. They require large amounts of manually labeled data for training but lack the ability to adapt to newly collected unlabeled images that are fed to processing systems dynamically. To solve these problems, we propose a novel framework for real-time (RT) cell counting with density maps (DM). It is a semi-supervised system that enables training on upcoming unlabeled images while simultaneously predicting their cell counts. It is also flexible enough to allow almost any cell counting model to be embedded within it. With a reliable and automatic training-set renewal mechanism, it ensures counting accuracy while optimizing the models with both historical data and new images. To deal with cell variability and image complexity, we propose a Semi-supervised Graph-Based Network (SGN) for use within the RT counting framework. It leverages a count-sensitive measurement to construct dynamic graphs of DM patches. With the graph constraint, it regularizes an encoder-decoder to represent underlying data structures and gain robustness for cell counting. We have implemented SGN along with several baseline networks and state-of-the-art methods within the RT counting framework. Experimental results validate the effectiveness and robustness of SGN. They also demonstrate the feasibility, efficacy and generalizability of the proposed framework for cell counting in unlabeled images.
Citations: 0
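Density-map-based counting, which the framework above builds on, represents each annotated cell as a small Gaussian blob; the integral of the map equals the cell count, so a network trained to regress the map yields a count by summation. The sketch below builds such a ground-truth density map; the Gaussian width is an assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def make_density_map(points, shape, sigma=4.0):
    """Ground-truth density map: one Gaussian per annotated cell center.

    points: iterable of (row, col) cell centers; shape: (H, W) of the image.
    The map sums (approximately) to the number of cells.
    """
    dot_map = np.zeros(shape, dtype=np.float64)
    for r, c in points:
        dot_map[int(r), int(c)] += 1.0
    return gaussian_filter(dot_map, sigma)

cells = [(20, 30), (50, 80), (90, 90)]
density = make_density_map(cells, shape=(128, 128))
print("estimated count:", density.sum())  # close to 3 (blobs near borders lose a little mass)
```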
DeepDraper: Fast and Accurate 3D Garment Draping over a 3D Human Body
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00163
Lokender Tiwari, B. Bhowmick
Abstract: Draping a 3D human mesh has garnered broad interest due to its wide applicability in virtual try-on, animation, etc. The 3D garment deformations produced by existing methods are often inconsistent with the body shape, pose, and measurements. This paper proposes a single unified learning-based framework (DeepDraper) to predict garment deformation as a function of body shape, pose, measurements, and garment style. We train DeepDraper with coupled geometric and multi-view perceptual losses. Unlike existing methods, we additionally model garment deformations as a function of standard body measurements, which a buyer or a designer generally uses to buy or design perfectly fitting clothes. As a result, DeepDraper significantly outperforms state-of-the-art deep network-based approaches in terms of fit and realism, and generalizes well to unseen garment styles. In addition, DeepDraper is about 10 times smaller and about 23 times faster than the closest state-of-the-art method (TailorNet), which favors its use in real-time applications with less computational power. Despite being trained on the static poses of the TailorNet [32] dataset, DeepDraper generalizes well to unseen body shapes, poses, and garment styles, and produces temporally coherent garment deformations on pose sequences even from the unseen AMASS [25] dataset.
Citations: 1
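The framework above regresses garment deformation from body shape, pose, measurements, and style. A common way to set this up is a small MLP that predicts per-vertex displacements added to a garment template mesh; the sketch below illustrates that general pattern with arbitrary dimensions and is not DeepDraper's actual architecture or losses.

```python
import torch
import torch.nn as nn

class GarmentDeformer(nn.Module):
    """Toy regressor: (shape, pose, measurements, style) -> per-vertex offsets."""

    def __init__(self, num_vertices, shape_dim=10, pose_dim=72, meas_dim=8, style_dim=4):
        super().__init__()
        in_dim = shape_dim + pose_dim + meas_dim + style_dim
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, num_vertices * 3),
        )

    def forward(self, shape, pose, measurements, style, template):
        # template: (num_vertices, 3) garment mesh in a canonical pose.
        feats = torch.cat([shape, pose, measurements, style], dim=-1)
        offsets = self.mlp(feats).view(-1, template.shape[0], 3)
        return template + offsets             # draped garment vertices

num_v = 500
model = GarmentDeformer(num_v)
template = torch.randn(num_v, 3)
out = model(torch.randn(1, 10), torch.randn(1, 72), torch.randn(1, 8), torch.randn(1, 4), template)
print(out.shape)  # torch.Size([1, 500, 3])
```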
SS-SFDA: Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00339
D. Kothandaraman, Rohan Chandra, Dinesh Manocha
Abstract: We present a novel approach for unsupervised road segmentation in adverse weather conditions such as rain or fog. This includes a new algorithm for source-free domain adaptation (SFDA) using self-supervised learning. Moreover, our approach uses several techniques to address various challenges in SFDA and improve performance, including online generation of pseudo-labels and self-attention, as well as curriculum learning, entropy minimization and model distillation. We have evaluated the performance on 6 datasets corresponding to real and synthetic adverse weather conditions. Our method outperforms all prior works on unsupervised road segmentation and SFDA by at least 10.26%, and improves the training time by 18−180×. Moreover, our self-supervised algorithm achieves mIoU accuracy similar to prior supervised methods.
Citations: 21
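Two of the ingredients named above, online pseudo-labeling and entropy minimization, can be sketched as simple losses on a segmentation network's per-pixel predictions. The snippet below shows generic versions of both; the confidence threshold and loss weighting are assumptions, and the paper's full curriculum and distillation steps are not shown.

```python
import torch
import torch.nn.functional as F

def self_training_losses(logits, conf_thresh=0.9, entropy_weight=0.1):
    """Pseudo-label cross-entropy on confident pixels + entropy minimization.

    logits: (N, C, H, W) raw segmentation scores on unlabeled target images.
    """
    probs = F.softmax(logits, dim=1)
    conf, pseudo = probs.max(dim=1)                      # per-pixel confidence and label
    mask = conf > conf_thresh                            # keep only confident pixels
    ce = F.cross_entropy(logits, pseudo, reduction="none")
    pseudo_loss = (ce * mask).sum() / mask.sum().clamp(min=1)
    entropy = -(probs * torch.log(probs + 1e-8)).sum(dim=1).mean()
    return pseudo_loss + entropy_weight * entropy

logits = torch.randn(2, 2, 64, 64)  # e.g., road vs. non-road
print(self_training_losses(logits))
```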
PatchAugment: Local Neighborhood Augmentation in Point Cloud Classification
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date: 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00240
Shivanand Venkanna Sheshappanavar, Vinit Veerendraveer, C. Kambhamettu
Abstract: Recent deep neural network models trained on smaller and less diverse datasets use data augmentation to alleviate limitations such as overfitting, reduced robustness, and lower generalization. Methods using 3D datasets are among the most common users of data augmentation techniques such as random point drop, scaling, translation, rotation, and jittering. However, these data augmentation techniques are fixed and are often applied to the entire object, ignoring the object's local geometry. Different local neighborhoods on the object surface hold different amounts of geometric complexity. Applying the same data augmentation techniques at the object level is less effective in augmenting local neighborhoods with complex structures. This paper presents PatchAugment, a data augmentation framework that applies different augmentation techniques to local neighborhoods. Our experimental studies on the PointNet++ and DGCNN models demonstrate the effectiveness of PatchAugment on the task of 3D point cloud classification. We evaluated our technique against these models using four benchmark datasets: ModelNet40 (synthetic), ModelNet10 (synthetic), SHREC'16 (synthetic) and ScanObjectNN (real-world).
Citations: 10
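The local-neighborhood augmentation idea above can be illustrated by querying a neighborhood around a sampled center point and jittering only those points, leaving the rest of the cloud untouched. The sketch below is a generic illustration; the neighborhood size, jitter magnitude, and choice of augmentation are assumptions, not the paper's settings.

```python
import numpy as np
from scipy.spatial import cKDTree

def augment_local_patch(points, center_idx, k=32, jitter_std=0.01, rng=None):
    """Jitter only the k-nearest-neighbor patch around one point.

    points: (N, 3) point cloud; center_idx: index of the patch center.
    Returns a copy of the cloud with noise added inside the selected patch.
    """
    if rng is None:
        rng = np.random.default_rng()
    tree = cKDTree(points)
    _, idx = tree.query(points[center_idx], k=k)      # indices of the local neighborhood
    augmented = points.copy()
    augmented[idx] += rng.normal(scale=jitter_std, size=(k, 3))
    return augmented

cloud = np.random.rand(1024, 3)
out = augment_local_patch(cloud, center_idx=0)
print(np.abs(out - cloud).max())  # only the 32 patch points moved
```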