A Single-Stage, Bottom-up Approach for Occluded VIS using Spatio-temporal Embeddings
A. Athar, Sabarinath Mahadevan, Aljosa Osep, L. Leal-Taixé, B. Leibe
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00431 (https://doi.org/10.1109/ICCVW54120.2021.00431)

Abstract: The task of Video Instance Segmentation (VIS) involves segmenting, tracking and classifying all object instances present in a given video clip. Occluded VIS is a more challenging extension of this task, involving longer video sequences in which objects undergo significant occlusions over time. Most existing approaches to VIS involve multiple networks which separately handle segmenting, tracking and classifying object instances, potentially combined with a set of heuristics to merge the individual network outputs. By contrast, we employ just one single-stage network, without any heuristics or post-processing, for the end-to-end task. Our approach, called 'STEm-Seg', is a bottom-up method for Segmenting object instances in videos using Spatio-Temporal Embeddings. We achieve 3rd place in the Occluded VIS challenge with an mAP score of 21.6% on the test set.
Interactive Labeling for Human Pose Estimation in Surveillance Videos
Mickael Cormier, Fabian Röpke, T. Golda, J. Beyerer
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00190 (https://doi.org/10.1109/ICCVW54120.2021.00190)

Abstract: Automatically detecting and estimating the movement of persons in real-world, uncooperative scenarios is very challenging, in great part due to limited and unreliably annotated data. For instance, annotating a single human body pose for activity recognition requires 40-60 seconds in complex sequences, leading to long-winded and costly annotation processes. Increasing the size of annotated datasets through crowdsourcing or automated annotation therefore often comes at great financial cost, while unreliable validation processes and inadequate annotation tools greatly impact annotation quality. In this work we combine multiple techniques into a single web-based, general-purpose annotation application. Pre-trained machine learning models enable annotators to interactively detect pedestrians, re-identify them throughout the sequence, estimate their poses, and correct annotation suggestions in the same interface. Annotations are then inter- and extrapolated between frames. The application is evaluated through several user studies and the results are extensively analyzed. Experiments demonstrate a 55% reduction in annotation time for less complex scenarios while simultaneously decreasing perceived annotator workload.
{"title":"Supporting Reference Imagery for Digital Drawing","authors":"Josh Holinaty, Alec Jacobson, Fanny Chevalier","doi":"10.1109/ICCVW54120.2021.00276","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00276","url":null,"abstract":"There is little understanding in the challenges artists face when using reference imagery while creating drawings digitally. How can this part of the creative process be better supported during the act of drawing? We conduct formative interviews with artists and reveal many adopt ad hoc strategies when integrating reference into their workflows. Interview results inform the design of a novel sketching interface in form of a technology probe to capture how artists use and access reference imagery, while also addressing opportunities to better support the use of reference, such as just-in-time presentation of imagery, automatic transparency to assist tracing, and features to mitigate design fixation. To capture how reference is used, we tasked artists to complete a series of digital drawings using our probe, with each task having particular reference needs. Artists were quick to adopt and appreciate the novel solutions provided by our probe, and we identified common strategies that can be exploited to support reference imagery in future creative tools.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"501 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134220492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite Aperture Stereo: 3D Reconstruction of Macro-Scale Scenes","authors":"M. Bailey, A. Hilton, Jean-Yves Guillemaut","doi":"10.1109/ICCVW54120.2021.00280","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00280","url":null,"abstract":"While the accuracy of multi-view stereo (MVS) has continued to advance, its performance reconstructing challenging scenes from images with a limited depth of field is generally poor. Typical implementations assume a pinhole camera model, and therefore treat defocused regions as a source of outlier. In this paper, we address these limitations by instead modelling the camera as a thick lens. Doing so allows us to exploit the complementary nature of stereo and defocus information, and overcome constraints imposed by traditional MVS methods. Using our novel reconstruction framework, we recover complete 3D models of complex macro-scale scenes. Our approach demonstrates robustness to view-dependent materials, and outperforms state-of-the-art MVS and depth from defocus across a range of real and synthetic datasets.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"212 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133425099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-weather city: Adverse weather stacking for autonomous driving
Valentina Mușat, I. Fursa, P. Newman, Fabio Cuzzolin, Andrew Bradley
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00325 (https://doi.org/10.1109/ICCVW54120.2021.00325)

Abstract: Autonomous vehicles make use of sensors to perceive the world around them, with heavy reliance on vision-based sensors such as RGB cameras. Unfortunately, since these sensors are affected by adverse weather, perception pipelines require extensive training on visual data captured under harsh conditions in order to improve the robustness of downstream tasks, and such data is difficult and expensive to acquire. Based on GAN and CycleGAN architectures, we propose an overall (modular) architecture for constructing datasets, which allows one to add, swap out and combine components in order to generate images with diverse weather conditions. Starting from a single dataset with ground truth, we generate 7 versions of the same data in diverse weather, and propose an extension to augment the generated conditions, resulting in a total of 14 adverse weather conditions that require only a single set of ground truth. We test the quality of the generated conditions both in terms of perceptual quality and suitability for training downstream tasks, using real-world, out-of-distribution adverse weather extracted from various datasets. We show improvements in both object detection and instance segmentation across all conditions, in many cases exceeding a 10-percentage-point increase in AP, and provide the materials and instructions needed to reconstruct the multi-weather dataset, based upon the original Cityscapes dataset.
Convolutional Filter Approximation Using Fractional Calculus
J. Zamora-Esquivel, Jesus Adan Cruz Vargas, A. Rhodes, L. Nachman, Narayan Sundararajan
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00047 (https://doi.org/10.1109/ICCVW54120.2021.00047)

Abstract: We introduce a generalized fractional convolutional filter (FF) with the flexibility to behave as any novel, customized, or well-known filter (e.g. Gaussian, Sobel, and Laplacian). Our method can be trained using only five parameters, regardless of the kernel size. Furthermore, these kernels can be used in place of traditional kernels in any CNN topology. We demonstrate a nominal 5X parameter compression per kernel compared to a traditional (5 × 5) convolutional kernel and, in the generalized case, a compression from N × N to 6 trainable parameters per kernel. We furthermore achieve 43X compression for 3D convolutional filters compared with conventional (7 × 7 × 7) 3D filters. Using fractional filters, we set a new MNIST record for the fewest parameters required to achieve above 99% classification accuracy, with only 3,750 trainable parameters. In addition to providing a generalizable method for CNN model compression, FFs present a compelling use case for the compression of CNNs that require large kernel sizes (e.g. medical imaging, semantic segmentation).
Identification and Measurement of Individual Roots in Minirhizotron Images of Dense Root Systems
Alexander Gillert, Bo Peters, U. V. Lukas, J. Kreyling
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00153 (https://doi.org/10.1109/ICCVW54120.2021.00153)

Abstract: Semantic segmentation networks are prone to oversegmentation in areas where objects are tightly clustered. In minirhizotron images with densely packed plant root systems, this can lead to a failure to separate individual roots, thereby skewing root length and width measurements. We propose to deal with this problem by adding additional output heads to the segmentation model: one that is used with a ridge detection algorithm as an intermediate step, and a second that directly estimates root width. With this method we are able to improve detection and width measurements in densely packed root systems without negative effects on sparse root systems.
{"title":"Optical Braille Recognition Using Object Detection Neural Network","authors":"Ilya G. Ovodov","doi":"10.1109/ICCVW54120.2021.00200","DOIUrl":"https://doi.org/10.1109/ICCVW54120.2021.00200","url":null,"abstract":"Optical Braille recognition methods generally rely heavily on a Braille text’s geometric structure. They run into problems if this structure is distorted. Thus, they find it difficult to cope with images of book pages taken with a smartphone.We propose an optical Braille recognition method that uses an object detection convolutional neural network to detect whole Braille characters at once. The proposed algorithm is robust to deformations and perspective distortions of a Braille page displayed on an image. The algorithm is suitable for recognizing braille texts captured with a smartphone camera in domestic conditions. It can handle curved pages and images with perspective distortion. The proposed algorithm shows high performance and accuracy compared to existing methods.Additionally, we produced a new dataset containing 240 photos of Braille texts with annotation for each Braille letter. Both the proposed algorithm and the dataset are available at GitHub.","PeriodicalId":226794,"journal":{"name":"2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116904154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A New Deep Learning Engine for CoralNet
Qimin Chen, Oscar Beijbom, Stephen Chan, J. Bouwmeester, D. Kriegman
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00412 (https://doi.org/10.1109/ICCVW54120.2021.00412)

Abstract: CoralNet is a cloud-based website and platform for manual, semi-automatic and automatic analysis of coral reef images. Users access CoralNet through web-based workflows optimized for common tasks, and other systems can interface through APIs. Today, marine scientists are widely using CoralNet: nearly 3,000 registered users have uploaded 1,741,855 images from 2,040 distinct sources with over 65 million annotations. CoralNet is hosted on AWS, is free for users, and the code is open source. In January 2021, we released CoralNet 1.0, which has a new machine learning engine. This paper provides an overview of that engine, the process of choosing its particular architecture, its training, and a comparison to some of the most promising alternative architectures. In a nutshell, CoralNet 1.0 uses transfer learning with an EfficientNet-B0 backbone trained on 16M labelled patches from benthic images, and a hierarchical multi-layer perceptron classifier trained on source-specific labelled data. When evaluated on a hold-out test set of 26 sources, the error rate of CoralNet 1.0 was 18.4% (relative) lower than that of CoralNet Beta.
Visual interpretability analysis of Deep CNNs using an Adaptive Threshold method on Diabetic Retinopathy images
George Ioannou, Tasos Papagiannis, Thanos Tagaris, Georgios Alexandridis, A. Stafylopatis
2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), October 2021. DOI: 10.1109/ICCVW54120.2021.00058 (https://doi.org/10.1109/ICCVW54120.2021.00058)

Abstract: Deep neural networks have been dominating the field of computer vision, achieving exceptional performance on object detection and pattern recognition. However, despite the highly accurate predictions of these models, the continuous increase in depth and complexity comes at the cost of interpretability, making the task of explaining the reasoning behind these predictions very challenging. In this paper, an analysis of state-of-the-art approaches to interpreting networks' representations is carried out over two Diabetic Retinopathy image datasets, IDRiD and DDR. Furthermore, these techniques are compared on the task of image segmentation on the same datasets, in order to discover which method produces the best attention maps, i.e. maps that can solve the segmentation problem without the network actually being trained for that task. To accomplish this, we propose an adaptive threshold method that transforms the attention masks into a representation more suitable for segmentation. Experiments over multiple architectures were conducted to ensure the robustness of the results.