2021 17th International Conference on Machine Vision and Applications (MVA): Latest Publications

Adversarial Defense Through High Frequency Loss Variational Autoencoder Decoder and Bayesian Update With Collective Voting
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511384
Zhixun He, Mukesh Singhal
{"title":"Adversarial Defense Through High Frequency Loss Variational Autoencoder Decoder and Bayesian Update With Collective Voting","authors":"Zhixun He, Mukesh Singhal","doi":"10.23919/MVA51890.2021.9511384","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511384","url":null,"abstract":"In recent years, Deep Neural Network (DNN) approaches for computer vision tasks have shown tremendous promise and potential. However, they are vulnerable to data that are carefully crafted with adversarial attacks, which can cause mis-prediction and raise security risk to real-world deep learning systems. To make the DNN-based approaches more robust, we propose a defense strategy based on High Frequency Loss Variational Autoencoder Decoder (VAE) and randomization among multiple post-VAE classifiers' predictions. The main contributions of the proposed defense framework are: 1) a new adversarial defense framework that features randomization process to effectively mitigate adversarial attacks; 2) reconstruction of high-quality images from adversarial samples with the VAE enhanced with spatial frequency loss; 3) use of a Bayesian process to jointly combine the collective voting results and the targeted classifier's prediction for final decision. We evaluate our approach and compare it with existing approaches on CIFAR10 and Fashion-MNIST data sets. The experimental study shows that the proposed method outperforms existing methods.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115585502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
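To make the collective-voting step above concrete, here is a minimal numpy sketch of a naive Bayesian update that fuses a target classifier's softmax output with committee votes; the function name and the fixed per-voter `p_correct` parameter are illustrative assumptions, not details from the paper.

```python
import numpy as np

def bayesian_vote(target_probs, committee_votes, num_classes, p_correct=0.8):
    """Fuse a target classifier's softmax output with committee votes via a
    naive Bayesian update. Each vote is modeled as correct with probability
    p_correct (a hypothetical parameter), else uniform over wrong classes."""
    posterior = np.asarray(target_probs, dtype=float)  # prior: target softmax
    for vote in committee_votes:
        lik = np.full(num_classes, (1.0 - p_correct) / (num_classes - 1))
        lik[vote] = p_correct
        posterior *= lik
        posterior /= posterior.sum()  # renormalize after each update
    return posterior

# Ten classes: the target net prefers class 3, but the committee mostly says 5.
prior = np.full(10, 0.05)
prior[3] = 0.55
print(np.argmax(bayesian_vote(prior, [5, 5, 3, 5], 10)))  # -> 5
```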
Learning VAE with Categorical Labels for Generating Conditional Handwritten Characters
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511404
Keita Goto, Nakamasa Inoue
{"title":"Learning VAE with Categorical Labels for Generating Conditional Handwritten Characters","authors":"Keita Goto, Nakamasa Inoue","doi":"10.23919/MVA51890.2021.9511404","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511404","url":null,"abstract":"The variational autoencoder (VAE) has succeeded in learning disentangled latent representations from data without supervision. Well disentangled representations can express interpretable semantic value, which is useful for various tasks, including image generation. However, the conventional VAE model is not suitable for data generation with specific category labels because it is challenging to acquire categorical information as latent variables. Therefore, we propose a framework for learning label representations in a VAE by using supervised categorical labels associated with data. Through experiments, we show that this framework is useful for generating data belonging to a specific category. Furthermore, we found that our framework successfully disentangled latent factors from similar data of different classes.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129863877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
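A common way to condition a VAE on categorical labels is to concatenate a one-hot label to both the encoder input and the latent code, as in the minimal PyTorch sketch below; the architecture and dimensions are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn

class CondVAE(nn.Module):
    """Label-conditioned VAE: a one-hot label is concatenated to the encoder
    input and to the latent code before decoding."""
    def __init__(self, x_dim=784, y_dim=10, z_dim=16, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim + y_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)
        self.logvar = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim + y_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def forward(self, x, y):
        h = self.enc(torch.cat([x, y], dim=1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(torch.cat([z, y], dim=1)), mu, logvar

# Reconstruct four 28x28 character images conditioned on their class labels.
x_hat, mu, logvar = CondVAE()(torch.rand(4, 784), torch.eye(10)[:4])
```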
Multiple Fisheye Camera Calibration and Stereo Measurement Methods for Uniform Distance Errors throughout Imaging Ranges
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511376
Nobuhiko Wakai, Takeo Azuma, K. Nobori
{"title":"Multiple Fisheye Camera Calibration and Stereo Measurement Methods for Uniform Distance Errors throughout Imaging Ranges","authors":"Nobuhiko Wakai, Takeo Azuma, K. Nobori","doi":"10.23919/MVA51890.2021.9511376","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511376","url":null,"abstract":"This paper proposes calibration and stereo measurement methods that enable accurate distance and uniform distribution of the distance error throughout imaging ranges. In stereo measurement using two fisheye cameras, the distance error varies greatly depending on the measurement direction. To reduce the distance error, the proposed method introduces an effectual baseline weight into the stereo measurement using three or more fisheye cameras and their calibration. Accurate distance is obtained because this effectual baseline weight is the optimum weight in the maximum likelihood estimation. Experimental results show that the proposed methods can obtain an accurate distance with a 94% reduction in error and make the distribution of the distance error uniform.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121073359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
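The effect of weighting camera pairs by baseline can be illustrated with an inverse-variance fusion of pairwise distance estimates; the assumption that distance variance scales as 1/baseline² is our own simplification for illustration, not the paper's derivation.

```python
import numpy as np

def fuse_distances(distances, baselines):
    """Inverse-variance fusion of per-pair stereo distance estimates,
    assuming (for illustration) that distance variance scales as
    1 / baseline**2, so longer baselines receive larger weights."""
    w = np.asarray(baselines, dtype=float) ** 2
    d = np.asarray(distances, dtype=float)
    return float(np.sum(w * d) / np.sum(w))

# Three camera pairs measuring the same point; the 0.60 m baseline dominates.
print(fuse_distances([10.2, 9.8, 10.5], [0.30, 0.60, 0.15]))
```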
Understanding the Reason for Misclassification by Generating Counterfactual Images
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511352
Muneaki Suzuki, Yoshitaka Kameya, Takuro Kutsuna, N. Mitsumoto
{"title":"Understanding the Reason for Misclassification by Generating Counterfactual Images","authors":"Muneaki Suzuki, Yoshitaka Kameya, Takuro Kutsuna, N. Mitsumoto","doi":"10.23919/MVA51890.2021.9511352","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511352","url":null,"abstract":"Explainable AI (XAI) methods contribute to understanding the behavior of deep neural networks (DNNs), and have attracted interest recently. For example, in image classification tasks, attribution maps have been used to indicate the pixels of an input image that are important to the output decision. Oftentimes, however, it is difficult to understand the reason for misclassification only from a single attribution map. In this paper, in order to enhance the information related to the reason for misclassification, we propose to generate several counterfactual images using generative adversarial networks (GANs). We empirically show that these counterfactual images and their attribution maps improve the interpretability of misclassified images. Furthermore, we additionally propose to generate transitional images by gradually changing the configurations of a GAN in order to understand clearly which part of the misclassified image cause the misclassification.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122786649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
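One generic way to realize such counterfactual and transitional images is to walk a pretrained GAN's latent code down the classifier's loss toward a target class, keeping the intermediate decodes; the sketch below assumes hypothetical pretrained `generator` and `classifier` modules and is not the authors' exact procedure.

```python
import torch
import torch.nn.functional as F

def counterfactual_sequence(generator, classifier, z0, target_class,
                            steps=10, lr=0.05):
    """Walk a GAN latent code down the classifier loss toward target_class,
    keeping the intermediate images as a transition sequence."""
    z = z0.clone().requires_grad_(True)
    frames = []
    for _ in range(steps):
        img = generator(z)                       # decode current latent
        loss = F.cross_entropy(classifier(img),
                               torch.tensor([target_class]))
        loss.backward()
        with torch.no_grad():
            z -= lr * z.grad                     # step toward the target class
            z.grad.zero_()
        frames.append(img.detach())
    return frames
```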
Seeing Farther Than Supervision: Self-supervised Depth Completion in Challenging Environments
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511354
Seiya Ito, Naoshi Kaneko, K. Sumi
{"title":"Seeing Farther Than Supervision: Self-supervised Depth Completion in Challenging Environments","authors":"Seiya Ito, Naoshi Kaneko, K. Sumi","doi":"10.23919/MVA51890.2021.9511354","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511354","url":null,"abstract":"This paper tackles the problem of learning a depth completion network from a series of RGB images and short-range depth measurements as a new setting for depth completion. Commodity RGB-D sensors used in indoor environments can provide dense depth measurements; however, their acquisition distance is limited. Recent depth completion methods train CNNs to estimate dense depth maps in a supervised/self-supervised manner while utilizing sparse depth measurements. For self-supervised learning, indoor environments are challenging due to many non-textured regions, leading to the problem of inconsistency. To overcome this problem, we propose a self-supervised depth completion method that utilizes optical flow from two RGB-D images. Because optical flow provides accurate and robust correspondences, the ego-motion can be estimated stably, which can reduce the difficulty of depth completion learning in indoor environments. Experimental results show that the proposed method outperforms the previous self-supervised method in the new depth completion setting and produces qualitatively adequate estimates.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125921633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
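One way to turn flow correspondences plus short-range depth into a stable ego-motion estimate is a RANSAC PnP solve, sketched below with OpenCV; this is a generic stand-in under stated assumptions, not the paper's exact pose-estimation module.

```python
import numpy as np
import cv2

def ego_motion_from_flow(pts0, pts1, depth0, K):
    """Estimate relative camera pose from optical-flow correspondences and
    short-range depth via RANSAC PnP. pts0, pts1: Nx2 pixel coordinates in
    frames 0 and 1; depth0: depth values at pts0; K: 3x3 intrinsics."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    # Back-project the frame-0 pixels that have valid depth into 3D.
    pts3d = np.stack([(pts0[:, 0] - cx) / fx * depth0,
                      (pts0[:, 1] - cy) / fy * depth0,
                      depth0], axis=1).astype(np.float32)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d, pts1.astype(np.float32), K.astype(np.float32), None)
    return rvec, tvec  # rotation (Rodrigues vector) and translation
```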
Group Activity Recognition Using Joint Learning of Individual Action Recognition and People Grouping
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511390
Chihiro Nakatani, Kohei Sendo, N. Ukita
{"title":"Group Activity Recognition Using Joint Learning of Individual Action Recognition and People Grouping","authors":"Chihiro Nakatani, Kohei Sendo, N. Ukita","doi":"10.23919/MVA51890.2021.9511390","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511390","url":null,"abstract":"This paper proposes joint learning of individual action recognition and people grouping for improving group activity recognition. By sharing the information between two similar tasks (i.e., individual action recognition and people grouping) through joint learning, errors of these two tasks are mutually corrected. This joint learning also improves the accuracy of group activity recognition. Our proposed method is designed to consist of any individual action recognition methods as a component. The effectiveness is validated with various IAR methods. By employing existing group activity recognition methods for ensembling with the proposed method, we achieved the best performance compared to the similar SOTA group activity recognition methods.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132747185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
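The joint-learning idea can be sketched as a two-term multi-task objective: cross-entropy for individual actions plus a binary "same group" loss over person pairs, backpropagated together. The loss forms and the weight `alpha` below are our illustrative assumptions, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def joint_loss(action_logits, action_labels, pair_logits, pair_labels,
               alpha=0.5):
    """Two-term joint objective: per-person action cross-entropy plus a
    binary same-group loss over person pairs, backpropagated together so
    the two tasks can correct each other."""
    l_action = F.cross_entropy(action_logits, action_labels)
    l_group = F.binary_cross_entropy_with_logits(pair_logits, pair_labels)
    return l_action + alpha * l_group

# Six people, eight action classes, and logits for all 15 unordered pairs.
loss = joint_loss(torch.randn(6, 8), torch.randint(0, 8, (6,)),
                  torch.randn(15), torch.randint(0, 2, (15,)).float())
```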
Position Estimation of Pedestrians in Surveillance Video Using Face Detection and Simple Camera Calibration
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511348
Toshio Sato, Xin Qi, Keping Yu, Zheng Wen, Yutaka Katsuyama, Takuro Sato
{"title":"Position Estimation of Pedestrians in Surveillance Video Using Face Detection and Simple Camera Calibration","authors":"Toshio Sato, Xin Qi, Keping Yu, Zheng Wen, Yutaka Katsuyama, Takuro Sato","doi":"10.23919/MVA51890.2021.9511348","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511348","url":null,"abstract":"Pedestrian position estimation in videos is an important technique for enhancing surveillance system applications. Although many studies estimate pedestrian positions by using human body detection, its usage is limited when the entire body expands outside of the field of view. Camera calibration is also important for realizing accurate position estimation. Most surveillance cameras are not adjusted, and it is necessary to establish a method for easy camera calibration after installation. In this paper, we propose an estimation method for pedestrian positions using face detection and anthropometric properties such as statistical face lengths. We also investigate a simple method for camera calibration that is suitable for actual uses. We evaluate the position estimation accuracy by using indoor surveillance videos.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115187614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
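The anthropometric idea reduces to the pinhole relation Z = f·H/h: given the camera's focal length in pixels and a statistical real-world face length, a detected face's pixel height yields its range. A minimal sketch follows; the 0.24 m face length is an illustrative assumption, not the paper's figure.

```python
def distance_from_face(face_px_height, focal_px, face_len_m=0.24):
    """Pinhole range estimate: Z = f * H / h, where f is the focal length in
    pixels, H a statistical real-world face length in meters, and h the
    detected face height in pixels."""
    return focal_px * face_len_m / face_px_height

# A 60-pixel face seen through a 1000-pixel focal length is about 4 m away.
print(distance_from_face(face_px_height=60, focal_px=1000))  # -> 4.0
```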
HMA-Depth: A New Monocular Depth Estimation Model Using Hierarchical Multi-Scale Attention
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511345
Zhaofeng Niu, Yuichiro Fujimoto, M. Kanbara, H. Kato
{"title":"HMA-Depth: A New Monocular Depth Estimation Model Using Hierarchical Multi-Scale Attention","authors":"Zhaofeng Niu, Yuichiro Fujimoto, M. Kanbara, H. Kato","doi":"10.23919/MVA51890.2021.9511345","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511345","url":null,"abstract":"Monocular depth estimation is an essential technique for tasks like 3D reconstruction. Although many works have emerged in recent years, they can be improved by better utilizing the multi-scale information of the input images, which is proved to be one of the keys in generating high-quality depth estimations. In this paper, we propose a new monocular depth estimation method named HMA-Depth, in which we follow the encoder-decoder scheme and combine several techniques such as skip connections and the atrous spatial pyramid pooling. To obtain more precise local information from the image while keeping a good understanding of the global context, a hierarchical multi-scale attention module is adopted and its outputs are combined to generate the final output that is with both good details and good overall accuracy. Experimental results on two commonly-used datasets prove that HMA-Depth can outperform the existing approaches. Code is available11https://github.com/saranew/HMADepth.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115196631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
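Multi-scale attention fusion can be sketched as learning per-pixel weights that blend depth predictions from several scales; the module below is a generic illustration of this kind of fusion, not the exact HMA-Depth block.

```python
import torch
import torch.nn as nn

class ScaleAttentionFusion(nn.Module):
    """Fuse depth predictions from several scales with learned per-pixel
    attention weights (softmax over scales)."""
    def __init__(self, num_scales, channels):
        super().__init__()
        self.att = nn.Conv2d(channels * num_scales, num_scales, kernel_size=1)

    def forward(self, feats, depths):
        # feats: list of [B,C,H,W] features; depths: list of [B,1,H,W] maps,
        # all upsampled to a common resolution beforehand.
        w = torch.softmax(self.att(torch.cat(feats, dim=1)), dim=1)
        return sum(w[:, i:i + 1] * depths[i] for i in range(len(depths)))

# Fuse depth maps predicted at three scales of a toy network.
fuse = ScaleAttentionFusion(num_scales=3, channels=32)
depth = fuse([torch.rand(2, 32, 60, 80) for _ in range(3)],
             [torch.rand(2, 1, 60, 80) for _ in range(3)])
```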
Temporal Extension for Encoder-Decoder-based Crowd Counting Approaches
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511351
T. Golda, F. Krüger, J. Beyerer
{"title":"Temporal Extension for Encoder-Decoder-based Crowd Counting Approaches","authors":"T. Golda, F. Krüger, J. Beyerer","doi":"10.23919/MVA51890.2021.9511351","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511351","url":null,"abstract":"Crowd counting is an important aspect to safety monitoring at mass events and can be used to initiate safety measures in time. State-of-the-art encoder-decoder architectures are able to estimate the number of people in a scene precisely. However, since most of the proposed methods are based to solely operate on single-image features, we observe that estimated counts for aerial video sequences are inherently noisy, which in turn reduces the significance of the overall estimates. In this paper, we propose a simple temporal extension to said encoder-decoder architectures that incorporates local context from multiple frames into the estimation process. By applying the temporal extension a state-of-the-art architectures and exploring multiple configuration settings, we find that the resulting estimates are more precise and smoother over time.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"147 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115690810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
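A minimal way to add local temporal context to an encoder-decoder counter is a temporal convolution over the bottleneck features of neighbouring frames, as sketched below; the wrapper design and window size are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TemporalBottleneck(nn.Module):
    """Wrap a single-image encoder-decoder counter with local temporal
    context: encode each frame, mix bottleneck features across a short
    window with a temporal convolution, then decode per frame."""
    def __init__(self, encoder, decoder, channels, window=3):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.temporal = nn.Conv3d(channels, channels,
                                  kernel_size=(window, 1, 1),
                                  padding=(window // 2, 0, 0))

    def forward(self, frames):                      # frames: [B, T, 3, H, W]
        B, T = frames.shape[:2]
        f = self.encoder(frames.flatten(0, 1))      # [B*T, C, h, w]
        f = f.view(B, T, *f.shape[1:]).transpose(1, 2)   # [B, C, T, h, w]
        f = self.temporal(f).transpose(1, 2).flatten(0, 1)
        return self.decoder(f)                      # one density map per frame
```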
ROT-Harris: A Dynamic Approach to Asynchronous Interest Point Detection
Pub Date: 2021-07-25 | DOI: 10.23919/MVA51890.2021.9511407
S. Harrigan, S. Coleman, M. Ker, P. Yogarajah, Z. Fang, Chengdong Wu
{"title":"ROT-Harris: A Dynamic Approach to Asynchronous Interest Point Detection","authors":"S. Harrigan, S. Coleman, M. Ker, P. Yogarajah, Z. Fang, Chengdong Wu","doi":"10.23919/MVA51890.2021.9511407","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511407","url":null,"abstract":"Event-based vision sensors are a paradigm shift in the way that visual information is obtained and processed. These devices are capable of low-latency transmission of data which represents the scene dynamics. Additionally, low-power benefits make the sensors popular in finite-power scenarios such as high-speed robotics or machine vision applications where latency in visual information is desired to be minimal. The core datatype of such vision sensors is the ‘event’ which is an asynchronous per-pixel signal indicating a change in light intensity at an instance in time corresponding to the spatial location of that sensor on the array. A popular approach to event-based processing is to map events onto a 2D plane over time which is comparable with traditional imaging techniques. However, this paper presents a disruptive approach to event data processing that uses a tree-based filter framework that directly processes raw event data to extract events corresponding to interest point features, which is then combined with a Harris interest point approach to isolate features. We hypothesise that since the tree structure contains the same spatial information as a 2D surface mapping, Harris may be applied directly to the content of the tree, bypassing the need for transformation to the 2D plane. Results illustrate that the proposed approach performs better than other state-of-the-art approaches with limited compromise on the run-time performance.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127265674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
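For contrast with the tree-based filter, the conventional 2D route the paper departs from can be sketched as Harris applied to an exponentially decayed time surface built from raw events; parameter values and the omission of structure-tensor smoothing are illustrative simplifications.

```python
import numpy as np

def harris_on_time_surface(events, shape, k=0.04, tau=0.05):
    """Build an exponentially decayed time surface from (x, y, t) events and
    score it with the Harris response. No window smoothing of the structure
    tensor is applied, to keep the sketch short."""
    surface = np.zeros(shape)
    t_last = events[-1][2]
    for x, y, t in events:
        surface[y, x] = np.exp(-(t_last - t) / tau)  # recent events weigh more
    gy, gx = np.gradient(surface)
    Ixx, Iyy, Ixy = gx * gx, gy * gy, gx * gy
    return Ixx * Iyy - Ixy ** 2 - k * (Ixx + Iyy) ** 2  # det(M) - k*trace(M)^2

events = [(10, 12, 0.010), (11, 12, 0.015), (10, 13, 0.020), (11, 13, 0.025)]
response = harris_on_time_surface(events, (32, 32))
```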