2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW): Latest Publications

A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00431
A. Athar, Sabarinath Mahadevan, Aljosa Osep, L. Leal-Taixé, B. Leibe
{"title":"A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings","authors":"A. Athar, Sabarinath Mahadevan, Aljosa Osep, L. Leal-Taixé, B. Leibe","doi":"10.1109/ICCVW54120.2021.00431","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00431","url":null,"abstract":"The task of Video Instance Segmentation (VIS) involves segmenting, tracking and classifying all object instances present in a given video clip. Occluded VIS is a more challenging extension of this task which involves longer video sequences where objects undergo significant occlusions over time. Most existing approaches to VIS involve multiple networks which separately handle segmenting, tracking and classifying object instances, and potentially a set of heuristics to combine the individual network outputs. By contrast, we employ just one, single-stage network without any heuristics or post-processing for the end-to-end task. Our approach is called ’STEm-Seg’, which is a bottom-up method for Segmenting object instances in videos using Spatio-Temporal Embeddings. We achieve 3rd place in the Occluded VIS challenge with an mAP score of 21.6% on the test set.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131870683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
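To make the bottom-up idea concrete, here is a minimal sketch of clustering per-pixel spatio-temporal embeddings into instance tubes. The greedy seed-and-gather loop, the fixed bandwidth, and the seediness threshold are illustrative assumptions, not the authors' exact formulation (STEm-Seg learns per-instance variances rather than using a single fixed margin).

```python
# Hypothetical illustration of bottom-up clustering in a spatio-temporal
# embedding space, loosely inspired by the STEm-Seg idea. Shapes, the fixed
# bandwidth, and the seediness threshold are assumptions for illustration.
import numpy as np

def cluster_embeddings(emb, seediness, bandwidth=1.0, seed_thresh=0.5):
    """Greedily group space-time pixels into instance tubes.

    emb:        (T, H, W, D) per-pixel embedding vectors for a video clip
    seediness:  (T, H, W) score of how likely a pixel is an instance center
    Returns a (T, H, W) int array of instance ids (0 = background).
    """
    T, H, W, D = emb.shape
    flat_emb = emb.reshape(-1, D)
    flat_seed = seediness.reshape(-1).copy()
    labels = np.zeros(flat_emb.shape[0], dtype=np.int32)
    instance_id = 0
    while True:
        center_idx = int(flat_seed.argmax())
        if flat_seed[center_idx] < seed_thresh:
            break  # no confident instance centers left
        center = flat_emb[center_idx]
        # unclaimed pixels within the margin join this instance
        dist2 = ((flat_emb - center) ** 2).sum(axis=1)
        member = (dist2 < bandwidth ** 2) & (labels == 0)
        instance_id += 1
        labels[member] = instance_id
        flat_seed[member] = 0.0  # claimed pixels can no longer seed
    return labels.reshape(T, H, W)
```

Because pixels from all frames live in one embedding space, an instance's mask and its track fall out of a single clustering step, which is what removes the need for a separate tracking network.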
Interactive Labeling for Human Pose Estimation in Surveillance Videos
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00190
Mickael Cormier, Fabian Röpke, T. Golda, J. Beyerer
{"title":"Interactive Labeling for Human Pose Estimation in Surveillance Videos","authors":"Mickael Cormier, Fabian Röpke, T. Golda, J. Beyerer","doi":"10.1109/ICCVW54120.2021.00190","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00190","url":null,"abstract":"Automatically detecting and estimating the movement of persons in real-world uncooperative scenarios is very challenging in great part due to limited and unreliably annotated data. For instance annotating a single human body pose for activity recognition requires 40-60 seconds in complex sequences, leading to long-winded and costly annotation processes. Therefore increasing the sizes of annotated datasets through crowdsourcing or automated annotation is often used at a great financial costs, without reliable validation processes and inadequate annotation tools greatly impacting the annotation quality. In this work we combine multiple techniques into a single web-based general-purpose annotation application. Pre-trained machine learning models enable annotators to interactively detect pedestrians, re-identify them throughout the sequence, estimate their poses, and correct annotation suggestions in the same interface. Annotations are then inter- and extrapolated between frames. The application is evaluated through several user studies and the results are extensively analyzed. Experiments demonstrate a 55% reduction in annotation time for less complex scenarios while simultaneously decreasing perceived annotator workload.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129079565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 7
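The "inter- and extrapolated between frames" step in the abstract can be illustrated with plain linear interpolation of joint coordinates between two manually corrected keyframes. Treat this as a minimal sketch: the application may well use a more sophisticated scheme.

```python
# A minimal sketch of keyframe interpolation for pose annotations.
# Linear blending per joint is an assumption; the paper's exact scheme
# may differ. Handles queries lying between two annotated keyframes.
import numpy as np

def interpolate_pose(kf_frames, kf_poses, query_frame):
    """Linearly interpolate a body pose between two annotated keyframes.

    kf_frames: sorted list of annotated frame indices, e.g. [10, 40]
    kf_poses:  dict frame_index -> (J, 2) array of joint coordinates
    """
    prev = max(f for f in kf_frames if f <= query_frame)
    nxt = min(f for f in kf_frames if f >= query_frame)
    if prev == nxt:
        return kf_poses[prev]
    t = (query_frame - prev) / (nxt - prev)  # blend weight in [0, 1]
    return (1 - t) * kf_poses[prev] + t * kf_poses[nxt]

# Example: joints annotated at frames 10 and 40, queried at frame 25.
poses = {10: np.zeros((17, 2)), 40: np.full((17, 2), 30.0)}
print(interpolate_pose([10, 40], poses, 25))  # halfway: all joints at 15.0
```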
Supporting Reference Imagery for Digital Drawing
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00276
Josh Holinaty, Alec Jacobson, Fanny Chevalier
{"title":"Supporting Reference Imagery for Digital Drawing","authors":"Josh Holinaty, Alec Jacobson, Fanny Chevalier","doi":"10.1109/ICCVW54120.2021.00276","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00276","url":null,"abstract":"There is little understanding in the challenges artists face when using reference imagery while creating drawings digitally. How can this part of the creative process be better supported during the act of drawing? We conduct formative interviews with artists and reveal many adopt ad hoc strategies when integrating reference into their workflows. Interview results inform the design of a novel sketching interface in form of a technology probe to capture how artists use and access reference imagery, while also addressing opportunities to better support the use of reference, such as just-in-time presentation of imagery, automatic transparency to assist tracing, and features to mitigate design fixation. To capture how reference is used, we tasked artists to complete a series of digital drawings using our probe, with each task having particular reference needs. Artists were quick to adopt and appreciate the novel solutions provided by our probe, and we identified common strategies that can be exploited to support reference imagery in future creative tools.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"501 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134220492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
Finite Aperture Stereo: 3D Reconstruction of Macro-Scale Scenes
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00280
M. Bailey, A. Hilton, Jean-Yves Guillemaut
{"title":"Finite Aperture Stereo: 3D Reconstruction of Macro-Scale Scenes","authors":"M. Bailey, A. Hilton, Jean-Yves Guillemaut","doi":"10.1109/ICCVW54120.2021.00280","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00280","url":null,"abstract":"While the accuracy of multi-view stereo (MVS) has continued to advance, its performance reconstructing challenging scenes from images with a limited depth of field is generally poor. Typical implementations assume a pinhole camera model, and therefore treat defocused regions as a source of outlier. In this paper, we address these limitations by instead modelling the camera as a thick lens. Doing so allows us to exploit the complementary nature of stereo and defocus information, and overcome constraints imposed by traditional MVS methods. Using our novel reconstruction framework, we recover complete 3D models of complex macro-scale scenes. Our approach demonstrates robustness to view-dependent materials, and outperforms state-of-the-art MVS and depth from defocus across a range of real and synthetic datasets.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"212 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133425099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 1
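The defocus cue the paper exploits follows from lens geometry. As a simplified reference point, the thin-lens circle-of-confusion relation below shows how blur diameter encodes distance from the focal plane; the paper's thick-lens model generalizes this, so the function here is only a textbook approximation, not the authors' formulation.

```python
# Thin-lens circle-of-confusion: a simplified stand-in for the thick-lens
# model in the paper (assumption: principal-plane separation omitted).
def circle_of_confusion(f, N, s_focus, s_obj):
    """Blur circle diameter on the sensor, in the same units as f.

    f:       focal length
    N:       f-number (aperture diameter = f / N)
    s_focus: distance the lens is focused at
    s_obj:   actual object distance
    """
    A = f / N  # aperture diameter
    return A * f * abs(s_obj - s_focus) / (s_obj * (s_focus - f))

# A 50 mm f/2.8 lens focused at 1 m: an object at 0.8 m yields a blur
# circle of roughly 0.24 mm, far larger than one near the focal plane.
print(circle_of_confusion(0.050, 2.8, 1.0, 0.8))  # ~2.35e-4 m
```

The key observation is that blur grows monotonically with distance from the focal plane, so defocus complements stereo disparity exactly where a shallow depth of field degrades conventional matching.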
Multi-weather city: Adverse weather stacking for autonomous driving
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00325
Valentina Mușat, I. Fursa, P. Newman, Fabio Cuzzolin, Andrew Bradley
{"title":"Multi-weather city: Adverse weather stacking for autonomous driving","authors":"Valentina Mus, I. Fursa, P. Newman, Fabio Cuzzolin, Andrew Bradley","doi":"10.1109/ICCVW54120.2021.00325","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00325","url":null,"abstract":"Autonomous vehicles make use of sensors to perceive the world around them, with heavy reliance on vision-based sensors such as RGB cameras. Unfortunately, since these sensors are affected by adverse weather, perception pipelines require extensive training on visual data under harsh conditions in order to improve the robustness of downstream tasks - data that is difficult and expensive to acquire. Based on GAN and CycleGAN architectures, we propose an overall (modular) architecture for constructing datasets, which allows one to add, swap out and combine components in order to generate images with diverse weather conditions. Starting from a single dataset with ground-truth, we generate 7 versions of the same data in diverse weather, and propose an extension to augment the generated conditions, thus resulting in a total of 14 adverse weather conditions, requiring a single ground truth. We test the quality of the generated conditions both in terms of perceptual quality and suitability for training downstream tasks, using real world, out-of-distribution adverse weather extracted from various datasets. We show improvements in both object detection and instance segmentation across all conditions, in many cases exceeding 10 percentage points increase in AP, and provide the materials and instructions needed to re-construct the multi-weather dataset, based upon the original Cityscapes dataset.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133923924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 20
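A rough sketch of what "stacking" generated conditions could look like in code: independently trained image-to-image generators chained so that, for example, rain and night compose into a rainy-night variant. The generator loading and checkpoint names below are hypothetical; only the chaining idea is taken from the abstract.

```python
# Illustrative sketch of weather stacking via chained image-to-image
# generators (e.g. CycleGAN-style). Checkpoint names are hypothetical.
import torch
import torch.nn as nn

def stack_weather(generators: "list[nn.Module]",
                  clear_image: torch.Tensor) -> torch.Tensor:
    """Apply a sequence of weather generators (e.g. rain, then night)."""
    x = clear_image
    with torch.no_grad():
        for g in generators:
            x = g(x)  # each generator maps a 3xHxW image to a restyled image
    return x

# Hypothetical usage: 7 single conditions plus stacked combinations yield
# the 14 adverse variants, all sharing one set of ground-truth labels.
# rain_g, night_g = load_generator("rain.pt"), load_generator("night.pt")
# rainy_night = stack_weather([rain_g, night_g], clear_image)
```

Because every variant is derived from the same source frames, the original Cityscapes annotations remain valid across all 14 conditions, which is what makes the approach cheap relative to collecting real adverse-weather data.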
Convolutional Filter Approximation Using Fractional Calculus
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00047
J. Zamora-Esquivel, Jesus Adan Cruz Vargas, A. Rhodes, L. Nachman, Narayan Sundararajan
{"title":"Convolutional Filter Approximation Using Fractional Calculus","authors":"J. Zamora-Esquivel, Jesus Adan Cruz Vargas, A. Rhodes, L. Nachman, Narayan Sundararajan","doi":"10.1109/ICCVW54120.2021.00047","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00047","url":null,"abstract":"We introduce a generalized fractional convolutional filter (FF) with the flexibility to behave as any novel, customized, or well-known filter (e.g. Gaussian, Sobel, and Laplacian). Our method can be trained using only five parameters – regardless of the kernel size. Furthermore, these kernels can be used in place of traditional kernels in any CNN topology. We demonstrate a nominal 5X parameter compression per kernel as compared to a traditional (5 × 5) convolutional kernel, and in the generalized case, a compression from N × N to 6 trainable parameters per kernel. We furthermore achieve 43X compression for 3D convolutional filters compared with conventional (7 × 7 × 7) 3D filters. Using fractional filters, we set a new MNIST record for the fewest number of parameters required to achieve above 99% classification accuracy with only 3, 750 trainable parameters. In addition to providing a generalizable method for CNN model compression, FFs present a compelling use case for the compression of CNNs that require large kernel sizes (e.g. medical imaging, semantic segmentation).","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133775724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
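One plausible reading of a parametric "fractional" kernel, sketched under explicit assumptions: build an N × N kernel from a handful of scalars by applying a fractional-order |ω|^α frequency weighting to a Gaussian, so that α interpolates continuously between smoothing and derivative-like behavior. This is a guess at the general mechanism for illustration, not the paper's actual construction.

```python
# ASSUMPTION: a fractional Gaussian derivative built in the Fourier domain,
# standing in for the paper's fractional-calculus parameterization.
import numpy as np

def fractional_gaussian_kernel(size=5, sigma=1.0, alpha=1.0,
                               theta=0.0, gain=1.0):
    """Build a size x size kernel from a few scalars: a fractional-order
    derivative of a Gaussian along direction theta, via |w|^alpha
    frequency weighting."""
    ax = np.fft.fftfreq(size) * 2 * np.pi
    wx, wy = np.meshgrid(ax, ax, indexing="ij")
    w = wx * np.cos(theta) + wy * np.sin(theta)  # directional frequency
    gauss_hat = np.exp(-0.5 * (sigma ** 2) * (wx ** 2 + wy ** 2))
    frac_hat = (np.abs(w) ** alpha) * gauss_hat  # fractional differentiation
    kernel = np.fft.fftshift(np.real(np.fft.ifft2(frac_hat)))
    return gain * kernel / (np.abs(kernel).sum() + 1e-8)

# alpha = 0 recovers a smoothing (Gaussian-like) kernel; alpha near 1 gives a
# Sobel-like derivative; non-integer alpha interpolates between the two.
print(fractional_gaussian_kernel(size=5, alpha=0.5).round(3))
```

Whatever the exact construction, the compression argument is the same: the kernel is a function of a fixed number of trainable scalars, so its parameter count no longer grows with N × N.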
Identification and Measurement of Individual Roots in Minirhizotron Images of Dense Root Systems
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00153
Alexander Gillert, Bo Peters, U. V. Lukas, J. Kreyling
{"title":"Identification and Measurement of Individual Roots in Minirhizotron Images of Dense Root Systems","authors":"Alexander Gillert, Bo Peters, U. V. Lukas, J. Kreyling","doi":"10.1109/ICCVW54120.2021.00153","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00153","url":null,"abstract":"Semantic segmentation networks are prone to oversegmentation in areas where objects are tightly clustered. In minirhizotron images with densely packed plant root systems this can lead to a failure to separate individual roots, thereby skewing the root length and width measurements.We propose to deal with this problem by adding additional output heads to the segmentation model, one of which is used with a ridge detection algorithm as an intermediate step and a second one that directly estimates root width. With this method we are able to improve detection and width measurements in densely packed roots systems without negative effects on sparse root systems.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115973104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 3
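The extra-output-heads design can be sketched as a single shared encoder with three prediction heads. The toy encoder and layer sizes below are placeholders rather than the paper's network; only the head layout follows the abstract.

```python
# Sketch of the multi-head idea: one shared encoder, separate heads for
# semantic segmentation, ridge (centerline) detection, and per-pixel
# root-width regression. Layer sizes are illustrative placeholders.
import torch
import torch.nn as nn

class MultiHeadRootNet(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.encoder = nn.Sequential(  # stand-in for a real backbone
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        self.seg_head = nn.Conv2d(channels, 2, 1)    # root / background
        self.ridge_head = nn.Conv2d(channels, 1, 1)  # centerline response
        self.width_head = nn.Conv2d(channels, 1, 1)  # width in pixels

    def forward(self, x):
        feat = self.encoder(x)
        return {
            "segmentation": self.seg_head(feat),
            "ridge": torch.sigmoid(self.ridge_head(feat)),
            "width": torch.relu(self.width_head(feat)),  # non-negative widths
        }
```

The ridge map then feeds a classical ridge-detection step that splits touching roots before length and width are measured per instance, which is what keeps clustered roots from merging into one blob.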
Optical Braille Recognition Using Object Detection Neural Network
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00200
Ilya G. Ovodov
{"title":"Optical Braille Recognition Using Object Detection Neural Network","authors":"Ilya G. Ovodov","doi":"10.1109/ICCVW54120.2021.00200","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00200","url":null,"abstract":"Optical Braille recognition methods generally rely heavily on a Braille text’s geometric structure. They run into problems if this structure is distorted. Thus, they find it difficult to cope with images of book pages taken with a smartphone.We propose an optical Braille recognition method that uses an object detection convolutional neural network to detect whole Braille characters at once. The proposed algorithm is robust to deformations and perspective distortions of a Braille page displayed on an image. The algorithm is suitable for recognizing braille texts captured with a smartphone camera in domestic conditions. It can handle curved pages and images with perspective distortion. The proposed algorithm shows high performance and accuracy compared to existing methods.Additionally, we produced a new dataset containing 240 photos of Braille texts with annotation for each Braille letter. Both the proposed algorithm and the dataset are available at GitHub.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116904154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 4
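Once a detector emits one box per Braille cell, decoding is mechanical: a 6-dot pattern maps directly onto the Unicode Braille block at U+2800, where bit i encodes dot i+1. The detector output format assumed below, a list of raised dot numbers per detection, is illustrative; the Unicode mapping itself is standard.

```python
# Decode a detected Braille cell (given as raised dot numbers 1-6 in the
# standard Braille numbering) to its Unicode character. The detection
# format is an assumption; the U+2800 bitmask layout is per the standard.
def dots_to_char(dots):
    mask = 0
    for d in dots:
        mask |= 1 << (d - 1)  # Unicode assigns bit i to dot i+1
    return chr(0x2800 + mask)

# A detected cell with dots 1, 3 and 4 raised:
print(dots_to_char([1, 3, 4]))  # '⠍', the letter 'm' in English Braille
```

Classifying whole cells rather than individual dots is what frees the method from reconstructing a rigid dot grid, so curved and perspective-distorted pages remain decodable.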
A New Deep Learning Engine for CoralNet
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00412
Qimin Chen, Oscar Beijbom, Stephen Chan, J. Bouwmeester, D. Kriegman
{"title":"A New Deep Learning Engine for CoralNet","authors":"Qimin Chen, Oscar Beijbom, Stephen Chan, J. Bouwmeester, D. Kriegman","doi":"10.1109/ICCVW54120.2021.00412","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00412","url":null,"abstract":"CoralNet is a cloud-based website and platform for manual, semi-automatic and automatic analysis of coral reef images. Users access CoralNet through optimized web-based workflows for common tasks, and other systems can interface through API’s. Today, marine scientists are widely using CoralNet, and nearly 3,000 registered users have up-loaded 1,741,855 images from 2,040 distinct sources with over 65 million annotations. CoralNet is hosted on AWS, is free for users, and the code is open source 1. In January 2021, we released CoralNet 1.0 which has a new machine learning engine. This paper provides an overview of that engine, and the process of choosing the particular architecture, its training, and a comparison to some of the most promising architectures. In a nutshell, CoralNet 1.0 uses transfer learning with an EfficientNet-B0 backbone that is trained on 16M labelled patches from benthic images and a hierarchical Multi-layer Perceptron classifier that is trained on source-specific labelled data. When evaluated on a hold-out test set of 26 sources, the error rate of CoralNet 1.0 was 18.4% (relative) lower than CoralNet Beta.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116168248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 17
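The described pipeline, fixed EfficientNet-B0 features plus a source-specific classifier, can be approximated as below. The ImageNet weights, patch size, and MLP shape are stand-in assumptions: per the abstract, the production backbone is trained on 16M benthic patches, not ImageNet.

```python
# Rough sketch of the CoralNet 1.0 pipeline shape: a frozen EfficientNet-B0
# feature extractor plus a small MLP trained per source. Weight choice,
# patch size, and MLP layout are assumptions based only on the abstract.
import torch
import torchvision.models as models
from sklearn.neural_network import MLPClassifier

backbone = models.efficientnet_b0(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()  # keep the 1280-d pooled features
backbone.eval()

@torch.no_grad()
def patch_features(patches):
    """patches: (N, 3, 224, 224) float tensor of crops around point labels."""
    return backbone(patches).numpy()

# Train the source-specific classifier on labelled patch features:
# feats = patch_features(train_patches)
# clf = MLPClassifier(hidden_layer_sizes=(200, 100)).fit(feats, train_labels)
# pred = clf.predict(patch_features(test_patches))
```

Splitting the system this way lets one expensive backbone serve all 2,040 sources, while each source pays only for a small classifier fit to its own label set.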
Visual interpretability analysis of Deep CNNs using an Adaptive Threshold method on Diabetic Retinopathy images
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW) Pub Date : 2021-10-01 DOI: 10.1109/ICCVW54120.2021.00058
George Ioannou, Tasos Papagiannis, Thanos Tagaris, Georgios Alexandridis, A. Stafylopatis
{"title":"Visual interpretability analysis of Deep CNNs using an Adaptive Threshold method on Diabetic Retinopathy images","authors":"George Ioannou, Tasos Papagiannis, Thanos Tagaris, Georgios Alexandridis, A. Stafylopatis","doi":"10.1109/ICCVW54120.2021.00058","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00058","url":null,"abstract":"Deep neural networks have been dominating the field of computer vision, achieving exceptional performance on object detection and pattern recognition. However, despite the highly accurate predictions of these models, the continuous increase in depth and complexity comes at the cost of interpretability, making the task of explaining the reasoning behind these predictions very challenging. In this paper, an analysis of state-of-the-art approaches towards the direction of interpreting the networks’ representations, is carried out over two Diabetic Retinopathy image datasets, IDRiD and DDR. Furthermore, these techniques are compared in the task of image segmentation of the same datasets. This is to discover which method can produce the better attention maps that can solve the problem of segmentation without actually training the network for the specific task. To accomplish that we propose an adaptive threshold method that transforms the attention masks in a more suitable representation for segmentation. Experiments over multiple architectures were conducted to ensure the robustness of the results.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115391457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cited by: 2
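A minimal sketch of turning an attention map into a segmentation mask with a per-image threshold follows. The mean-plus-k-std rule is an assumption standing in for the paper's adaptive threshold method; the point is only that the cutoff is derived from each map's own statistics rather than fixed globally.

```python
# ASSUMPTION: mean + k*std as the adaptive cutoff; the paper's exact rule
# may differ. Input is any saliency/attention map, e.g. from Grad-CAM.
import numpy as np

def attention_to_mask(attention, k=1.0):
    """attention: (H, W) float map; returns a boolean segmentation mask."""
    a = attention.astype(float)
    a = (a - a.min()) / (a.max() - a.min() + 1e-8)  # normalize to [0, 1]
    thresh = a.mean() + k * a.std()  # adapts to each image's activation spread
    return a >= thresh

# Masks produced this way can be scored against lesion ground truth with
# IoU to rank interpretability methods on the segmentation task.
```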