2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW): Latest Publications

Plots to Previews: Towards Automatic Movie Preview Retrieval using Publicly Available Meta-data
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00359
Bhagyashree Gaikwad, Ankit Sontakke, Manasi S. Patwardhan, N. Pedanekar, S. Karande
{"title":"Plots to Previews: Towards Automatic Movie Preview Retrieval using Publicly Available Meta-data","authors":"Bhagyashree Gaikwad, Ankit Sontakke, Manasi S. Patwardhan, N. Pedanekar, S. Karande","doi":"10.1109/ICCVW54120.2021.00359","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00359","url":null,"abstract":"‘Preview’, a concept popularized by Netflix, is a contiguous scene of a movie or a TV show highlighting its story, characters, and tone, thus helping viewers to make quick viewing decisions. To create previews, one needs scenelevel semantic annotations related to the story, characters, and tone. Soliciting such annotations is an involved exercise and these are expensive to generate automatically. Instead, we aim at creating previews by availing readily available scene meta-data, while avoiding dependency on semantic scene-level annotations. We hypothesize that movie scenes that best match publicly available IMDb plot summaries can make good previews. We use 51 movies from the MovieGraph dataset, and find that a match of the plot summaries with scene dialogues, available through subtitles, is adequate to create usable movie previews, without the need for other semantic annotations. We validate the hypothesis by comparing ratings for scenes selected by the proposed method to those for scenes selected randomly, obtained from regular viewers as well as an expert. We report that even with this ‘minimalist’ approach, we can select at least one good preview scene for 26 out of 51 movies, as agreed upon by a critical expert judgment. Error analysis of the scenes indicates that features related to the plot structure might be needed to further improve the results.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"110 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115697050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
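The paper's matching pipeline is not published in the abstract, but the core idea, scoring scenes by textual similarity between subtitle dialogue and the IMDb plot summary, can be sketched minimally. The TF-IDF/cosine similarity choice and function names below are illustrative assumptions, not the authors' actual method.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_scenes_by_plot_match(plot_summary, scene_dialogues, top_k=3):
    """Rank scenes by TF-IDF cosine similarity between the plot summary
    and each scene's subtitle text (a hypothesized matching scheme)."""
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the scenes plus the plot so both share one vocabulary.
    matrix = vectorizer.fit_transform(scene_dialogues + [plot_summary])
    scene_vecs, plot_vec = matrix[:-1], matrix[-1]
    scores = cosine_similarity(scene_vecs, plot_vec).ravel()
    return np.argsort(scores)[::-1][:top_k], scores

# Toy example: the highest-scoring scene becomes the preview candidate.
scenes = ["the heist goes wrong and the crew scatters",
          "a quiet breakfast before the storm"]
plot = "a crew plans a heist that goes badly wrong"
ranking, scores = rank_scenes_by_plot_match(plot, scenes)
```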
Balanced Masked and Standard Face Recognition
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00174
Delong Qi, Kangli Hu, Weijun Tan, Qi Yao, Jingfeng Liu
{"title":"Balanced Masked and Standard Face Recognition","authors":"Delong Qi, Kangli Hu, Weijun Tan, Qi Yao, Jingfeng Liu","doi":"10.1109/ICCVW54120.2021.00174","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00174","url":null,"abstract":"We present the improved network architecture, data augmentation, and training strategies for the Webface track and Insightface/Glint360K track of the masked face recognition challenge of ICCV2021. One of the key goals is to have a balanced performance of masked and standard face recognition. In order to prevent the overfitting for the masked face recognition, we control the total number of masked faces by not more than 10% of the total face recognition in the training dataset. We propose a few key changes to the face recognition network including a new stem unit, drop block, face detection and alignment using YOLO5Face, feature concatenation, a cycle cosine learning rate etc. With this strategy, we achieve good and balanced performance for both masked and standard face recognition.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"616 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123070133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
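A minimal sketch of how the 10% masked-face cap might be enforced when assembling the training set; the random subsampling scheme and function names are assumptions, not the authors' code.

```python
import random

def cap_masked_fraction(standard_faces, masked_faces,
                        max_masked_frac=0.10, seed=0):
    """Subsample masked-face images so they make up at most
    `max_masked_frac` of the combined training set."""
    rng = random.Random(seed)
    # Solve masked <= frac * (standard + masked) for the masked count.
    cap = int(max_masked_frac * len(standard_faces) / (1.0 - max_masked_frac))
    kept_masked = rng.sample(masked_faces, min(cap, len(masked_faces)))
    return standard_faces + kept_masked

# With 90,000 standard faces, at most 10,000 masked faces are kept,
# giving exactly a 10% masked share of the combined set.
```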
Predictive Coding with Topographic Variational Autoencoders
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00127
Thomas Anderson Keller, M. Welling
{"title":"Predictive Coding with Topographic Variational Autoencoders","authors":"Thomas Anderson Keller, M. Welling","doi":"10.1109/ICCVW54120.2021.00127","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00127","url":null,"abstract":"Predictive coding is a model of visual processing which suggests that the brain is a generative model of input, with prediction error serving as a signal for both learning and attention. In this work, we show how the equivariant capsules learned by a Topographic Variational Autoen-coder can be extended to fit within the predictive coding framework by treating the slow rolling of capsule activations as the forward prediction operator. We demonstrate quantitatively that such an extension leads to improved sequence modeling compared with both topographic and non-topographic baselines, and that the resulting forward predictions are qualitatively more coherent with the provided partial input transformations.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123181060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
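As an illustration of the forward prediction operator, the sketch below rolls capsule activations one step along the topographic dimension, mirroring the "slow roll" described in the abstract; the tensor shapes and the choice of rolled axis are assumptions, not the paper's implementation.

```python
import torch

def forward_predict(capsule_activations, shift=1):
    """Roll activations one step along the topographic (capsule) axis,
    treating the learned slow roll as the forward prediction operator.
    Assumed shape: (batch, n_capsules, capsule_dim)."""
    return torch.roll(capsule_activations, shifts=shift, dims=-1)

z_t = torch.randn(8, 16, 32)     # toy capsule activations at time t
z_next = torch.randn(8, 16, 32)  # toy stand-in for activations at t+1
prediction_error = z_next - forward_predict(z_t)  # PC-style error signal
```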
The 2nd Challenge on Remote Physiological Signal Sensing (RePSS)
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00273
Xiaobai Li, Haomiao Sun, Zhaodong Sun, Hu Han, A. Dantcheva, S. Shan, Guoying Zhao
{"title":"The 2nd Challenge on Remote Physiological Signal Sensing (RePSS)","authors":"Xiaobai Li, Haomiao Sun, Zhaodong Sun, Hu Han, A. Dantcheva, S. Shan, Guoying Zhao","doi":"10.1109/ICCVW54120.2021.00273","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00273","url":null,"abstract":"Remote measurement of physiological signals from videos is an emerging topic. The topic draws great interest, but the lack of publicly available benchmark databases and a fair validation platform are hindering its further development. The RePSS Challenge is organized as an annual event for this concern. Here the 2nd RePSS is organized in conjunction with ICCV 2021. The 2nd RePSS contains two competition tracks. Track 1 is to measure inter-beat-intervals (IBI) from facial videos, which requires accurate measurement of each individual pulse peak. Track 2 is about respiration measurement from facial videos, as respiration is another important physiological index related to both health and emotional status. One new dataset is built and shared for Track 2. This paper presents an overview of the challenge, including data, protocol, results, and discussion. We highlighted the top-ranked solutions to provide insight for researchers, and we also outline future directions for this topic and this challenge.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"2011 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114668350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
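For readers new to the Track 1 quantity, the snippet below shows how inter-beat intervals can be derived from a recovered pulse signal via peak detection; the sampling rate, refractory period, and use of scipy.signal.find_peaks are illustrative assumptions, not the challenge's official evaluation code.

```python
import numpy as np
from scipy.signal import find_peaks

def inter_beat_intervals(pulse_signal, fs=30.0):
    """Compute inter-beat intervals (seconds) from a recovered rPPG
    pulse signal sampled at `fs` Hz by locating individual pulse peaks."""
    # Enforce a refractory period of ~0.33 s (max ~180 bpm) between peaks.
    peaks, _ = find_peaks(pulse_signal, distance=int(0.33 * fs))
    return np.diff(peaks) / fs

# Toy 10 s signal at 30 fps with a 1.2 Hz (72 bpm) pulse:
t = np.arange(0, 10, 1 / 30.0)
ibis = inter_beat_intervals(np.sin(2 * np.pi * 1.2 * t))  # ~0.83 s each
```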
PP-NAS: Searching for Plug-and-Play Blocks on Convolutional Neural Network
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00045
Biluo Shen, Anqi Xiao, Jie Tian, Z. Hu
{"title":"PP-NAS: Searching for Plug-and-Play Blocks on Convolutional Neural Network","authors":"Biluo Shen, Anqi Xiao, Jie Tian, Z. Hu","doi":"10.1109/ICCVW54120.2021.00045","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00045","url":null,"abstract":"Multi-scale features are of great importance in modern convolutional neural networks and show consistent performance gains on many vision tasks. Therefore, many plug-and-play blocks are introduced to upgrade existing convolutional neural networks for stronger multi-scale representation ability. However, the design of plug-and-play blocks is getting more complex and these manually designed blocks are not optimal. In this work, we propose PP-NAS to develop plug-and-play blocks based on neural architecture search. Specifically, we design a new search space and develop the corresponding search algorithm. Extensive experiments on CIFAR10, CIFAR100, and ImageNet show that PP-NAS can find a series of novel blocks that outperform manually designed ones. Transfer learning results on representative computer vision tasks including object detection and semantic segmentation further verify the superiority of the PP-NAS over the state-of-the-art CNNs (e.g., ResNet, Res2Net). Our code will be made avaliable at https://github.com/sbl1996/PP-NAS.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116662497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
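The abstract does not detail the search algorithm; as a hedged illustration of how a block search space can be made differentiable, the sketch below uses a DARTS-style mixed operation with learnable architecture weights, which may differ substantially from PP-NAS's actual search.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Weighted sum over candidate ops, DARTS-style: architecture
    weights `alpha` are learned jointly with the op parameters, and the
    strongest op is kept after the search."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),   # local context
            nn.Conv2d(channels, channels, 5, padding=2),   # larger receptive field
            nn.AvgPool2d(3, stride=1, padding=1),          # parameter-free op
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

block = MixedOp(channels=16)
y = block(torch.randn(2, 16, 32, 32))  # spatial size preserved in and out
```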
Efficient Uncertainty Estimation in Semantic Segmentation via Distillation
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00343
Christopher J. Holder, M. Shafique
{"title":"Efficient Uncertainty Estimation in Semantic Segmentation via Distillation","authors":"Christopher J. Holder, M. Shafique","doi":"10.1109/ICCVW54120.2021.00343","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00343","url":null,"abstract":"Deep neural networks typically make predictions with little regard for the probability that a prediction might be incorrect. Attempts to address this often involve input data undergoing multiple forward passes, either of multiple models or of multiple configurations of a single model, and consensus among outputs is used as a measure of confidence. This can be computationally expensive, as the time taken to process a single input sample increases linearly with the number of output samples being generated, an important consideration in real-time scenarios such as autonomous driving, and so we propose Uncertainty Distillation as a more efficient method for quantifying prediction uncertainty. Inspired by the concept of Knowledge Distillation, whereby the performance of a compact model is improved by training it to mimic the outputs of a larger model, we train a compact model to mimic the output distribution of a large ensemble of models, such that for each output there is a prediction and a predicted level of uncertainty for that prediction. We apply Uncertainty Distillation in the context of a semantic segmentation task for autonomous vehicle scene understanding and demonstrate a capability to reliably predict pixelwise uncertainty over the resultant class probability map. We also show that the aggregate pixel uncertainty across an image can be used as a metric for reliable detection of out-of-distribution data.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117278273","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
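A minimal sketch of one plausible distillation loss for this setup: the student matches the ensemble's mean class distribution and regresses the entropy of that mean as its per-pixel uncertainty target. The exact loss used in the paper may differ; the entropy target, tensor shapes, and head layout are assumptions.

```python
import torch
import torch.nn.functional as F

def distill_uncertainty_loss(student_logits, student_logvar, ensemble_probs):
    """Train one compact model to mimic an ensemble in a single pass.

    student_logits: (B, C, H, W)    student class scores
    student_logvar: (B, 1, H, W)    student's predicted log-uncertainty
    ensemble_probs: (B, K, C, H, W) softmax outputs of K ensemble members
    """
    mean_probs = ensemble_probs.mean(dim=1)                  # (B, C, H, W)
    # Match the ensemble's mean class distribution (KL divergence).
    kl = F.kl_div(F.log_softmax(student_logits, dim=1),
                  mean_probs, reduction="batchmean")
    # Regress the entropy of the ensemble mean as the uncertainty target.
    target_unc = -(mean_probs * mean_probs.clamp_min(1e-8).log()).sum(
        dim=1, keepdim=True)                                 # (B, 1, H, W)
    unc_loss = F.mse_loss(student_logvar.exp(), target_unc)
    return kl + unc_loss
```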
Synthetic Temporal Anomaly Guided End-to-End Video Anomaly Detection
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00028
M. Astrid, M. Zaheer, Seung-Ik Lee
{"title":"Synthetic Temporal Anomaly Guided End-to-End Video Anomaly Detection","authors":"M. Astrid, M. Zaheer, Seung-Ik Lee","doi":"10.1109/ICCVW54120.2021.00028","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00028","url":null,"abstract":"Due to the limited availability of anomaly examples, video anomaly detection is often seen as one-class classification (OCC) problem. A popular way to tackle this problem is by utilizing an autoencoder (AE) trained only on normal data. At test time, the AE is then expected to reconstruct the normal input well while reconstructing the anomalies poorly. However, several studies show that, even with normal data only training, AEs can often start reconstructing anomalies as well which depletes their anomaly detection performance. To mitigate this, we propose a temporal pseudo anomaly synthesizer that generates fake-anomalies using only normal data. An AE is then trained to maximize the reconstruction loss on pseudo anomalies while minimizing this loss on normal data. This way, the AE is encouraged to produce distinguishable reconstructions for normal and anomalous frames. Extensive experiments and analysis on three challenging video anomaly datasets demonstrate the effectiveness of our approach to improve the basic AEs in achieving superiority against several existing state-of-the-art models.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127472412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
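One plausible instantiation of the idea: synthesize temporal pseudo anomalies by skipping frames (producing unnaturally fast motion) and train with a hinge that pushes the pseudo-anomaly reconstruction error up to a margin. The skip-frame synthesizer and the margin formulation below are assumptions for illustration, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def make_temporal_pseudo_anomaly(clip, skip=3):
    """Build a pseudo-anomalous clip from a normal one by keeping every
    `skip`-th frame, which looks like unnaturally fast motion.
    clip: (B, T, C, H, W) normal frames."""
    return clip[:, ::skip]

def anomaly_guided_loss(ae, normal_clip, pseudo_clip, margin=1.0):
    """Minimize reconstruction error on normal clips while raising the
    error on pseudo anomalies up to `margin` (hinge), so the AE stops
    reconstructing anomalies well."""
    loss_normal = F.mse_loss(ae(normal_clip), normal_clip)
    loss_pseudo = F.mse_loss(ae(pseudo_clip), pseudo_clip)
    return loss_normal + F.relu(margin - loss_pseudo)

ae = torch.nn.Identity()  # placeholder; a real video AE goes here
clip = torch.rand(2, 16, 3, 64, 64)
loss = anomaly_guided_loss(ae, clip, make_temporal_pseudo_anomaly(clip))
```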
Progressive Unsupervised Deep Transfer Learning for Forest Mapping in Satellite Image
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00089
Nouman Ahmed, Sudipan Saha, M. Shahzad, M. Fraz, Xiao Xiang Zhu
{"title":"Progressive Unsupervised Deep Transfer Learning for Forest Mapping in Satellite Image","authors":"Nouman Ahmed, Sudipan Saha, M. Shahzad, M. Fraz, Xiao Xiang Zhu","doi":"10.1109/ICCVW54120.2021.00089","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00089","url":null,"abstract":"Automated forest mapping is important to understand our forests that play a key role in ecological system. However, efforts towards forest mapping is impeded by difficulty to collect labeled forest images that show large intraclass variation. Recently unsupervised learning has shown promising capability when exploiting limited labeled data. Motivated by this, we propose a progressive unsupervised deep transfer learning method for forest mapping. The proposed method exploits a pre-trained model that is subsequently fine-tuned over the target forest domain. We propose two different fine-tuning mechanism, one works in a totally unsupervised setting by jointly learning the parameters of CNN and the k-means based cluster assignments of the resulting features and the other one works in a semi-supervised setting by exploiting the extracted knearest neighbor based pseudo labels. The proposed progressive scheme is evaluated on publicly available EuroSAT dataset using the relevant base model trained on BigEarthNet labels. The results show that the proposed method greatly improves the forest regions classification accuracy as compared to the unsupervised baseline, nearly approaching the supervised classification approach.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125054679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
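The semi-supervised variant can be sketched with an off-the-shelf k-nearest-neighbor classifier over frozen CNN features; the feature dimensionality, the value of k, and the helper names below are illustrative assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_pseudo_labels(feats_labeled, labels, feats_unlabeled, k=5):
    """Propagate labels from the few labeled target samples to the
    unlabeled ones via k-nearest neighbors in the pre-trained CNN's
    feature space; the CNN is then fine-tuned on these pseudo labels."""
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(feats_labeled, labels)
    return knn.predict(feats_unlabeled)

# Toy 64-d features: 20 labeled and 100 unlabeled target-domain samples.
rng = np.random.default_rng(0)
pseudo = knn_pseudo_labels(rng.normal(size=(20, 64)),
                           rng.integers(0, 2, size=20),
                           rng.normal(size=(100, 64)))
```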
UAC: An Uncertainty-Aware Face Clustering Algorithm
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00388
Biplob K. Debnath, G. Coviello, Yi Yang, S. Chakradhar
{"title":"UAC: An Uncertainty-Aware Face Clustering Algorithm","authors":"Biplob K. Debnath, G. Coviello, Yi Yang, S. Chakradhar","doi":"10.1109/ICCVW54120.2021.00388","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00388","url":null,"abstract":"We investigate ways to leverage uncertainty in face images to improve the quality of the face clusters. We observe that popular clustering algorithms do not produce better quality clusters when clustering probabilistic face representations that implicitly model uncertainty – these algorithms predict up to 9.6X more clusters than the ground truth for the IJB-A benchmark. We empirically analyze the causes for this unexpected behavior and identify excessive false-positives and false-negatives (when comparing face-pairs) as the main reasons for poor quality clustering. Based on this insight, we propose an uncertainty-aware clustering algorithm, UAC, which explicitly leverages uncertainty information during clustering to decide when a pair of faces are similar or when a predicted cluster should be discarded. UAC considers (a) uncertainty of faces in face-pairs, (b) bins face-pairs into different categories based on an uncertainty threshold, (c) intelligently varies the similarity threshold during clustering to reduce false-negatives and false-positives, and (d) discards predicted clusters that exhibit a high measure of uncertainty. Extensive experimental results on several popular benchmarks and comparisons with state-of-the-art clustering methods show that UAC produces significantly better clusters by leveraging uncertainty in face images – predicted number of clusters is up to 0.18X more of the ground truth for the IJB-A benchmark.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123428204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
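A toy sketch of the uncertainty-binned matching rule in steps (b) and (c): raise the required similarity for uncertain face pairs to cut false positives. All thresholds and the two-bin scheme are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def pair_is_same_identity(f1, f2, u1, u2, unc_threshold=0.5,
                          sim_low=0.45, sim_high=0.65):
    """Decide whether two face embeddings match, demanding a higher
    cosine similarity when either face is uncertain.
    f1, f2: embedding vectors; u1, u2: per-face uncertainty scores."""
    sim = f1 @ f2 / (np.linalg.norm(f1) * np.linalg.norm(f2))
    # Bin the pair by uncertainty and pick the matching threshold.
    threshold = sim_high if max(u1, u2) > unc_threshold else sim_low
    return sim >= threshold

rng = np.random.default_rng(0)
a, b = rng.normal(size=64), rng.normal(size=64)
match = pair_is_same_identity(a, b, u1=0.7, u2=0.2)  # uncertain pair: strict bin
```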
A System for Fusing Color and Near-Infrared Images in Radiance Domain
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) | Pub Date: 2021-10-01 | DOI: 10.1109/ICCVW54120.2021.00229
K. Ng, Jinglin Shen, C. Ho
{"title":"A System for Fusing Color and Near-Infrared Images in Radiance Domain","authors":"K. Ng, Jinglin Shen, C. Ho","doi":"10.1109/ICCVW54120.2021.00229","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00229","url":null,"abstract":"We designed and demonstrated a system that fused color and near-infrared (NIR) images in the radiance domain. The system is designed to enhance image quality captured in outdoor environments, especially in hazy weather conditions. Previous dehazing methods based on RGB-NIR fusion exist but have rarely addressed the issue of color fidelity and potential see-through effect of fusing with NIR image. The proposed system can dehaze and enhance image details while maintaining the color fidelity and protect privacy. By working in the radiance domain, the system could handle large brightness differences among the color and NIR images and achieve High Dynamic Range (HDR). We proposed two methods to correct the fusion color: linear scalings when raw images were used and color swapping with base-detail image decomposition in the presence of nonlinearity in the ISP pipeline. The system also had two clothing see-through prevention mechanisms to avoid ethical issue arising from the see-through effect of NIR image.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123558905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
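A simplified, single-scale sketch of base-detail fusion: keep the color image's low-frequency base (preserving color fidelity) and inject only the NIR image's high-frequency detail, where haze is weaker. The Gaussian filter and its parameters are assumptions; the actual system operates on radiance-domain HDR data, with the color-swapping and see-through prevention steps omitted here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fuse_rgb_nir(rgb, nir, sigma=4.0):
    """Fuse a color image with a NIR image via base-detail decomposition.
    rgb: (H, W, 3) float in [0, 1]; nir: (H, W) float in [0, 1]."""
    # High frequencies of the NIR image carry the haze-free detail.
    nir_detail = nir - gaussian_filter(nir, sigma)
    # Keep the RGB base and add the NIR detail to every channel.
    fused = rgb + nir_detail[..., None]
    return np.clip(fused, 0.0, 1.0)

rng = np.random.default_rng(0)
out = fuse_rgb_nir(rng.random((128, 128, 3)), rng.random((128, 128)))
```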