2021 17th International Conference on Machine Vision and Applications (MVA): Latest Publications

Data Augmentation for Human Motion Prediction
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511368
Takahiro Maeda, N. Ukita
{"title":"Data Augmentation for Human Motion Prediction","authors":"Takahiro Maeda, N. Ukita","doi":"10.23919/MVA51890.2021.9511368","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511368","url":null,"abstract":"Human motion prediction is seldom deployed to real-world tasks due to difficulty in collecting a huge amount of motion data. We propose two motion data augmentation approaches using Variational AutoEn-coder (VAE) and Inverse Kinematics (IK). Our VAE-based generative model with adversarial training and sampling near samples generates various motions even with insufficient original motion data. Our IK-based augmentation scheme allows us to semi-automatically generate a variety of motions. Furthermore, we correct unrealistic artifacts in the augmented motions. As a result, our method outperforms previous noise-based motion augmentation methods.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129428280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Lossless AI: Toward Guaranteeing Consistency between Inferences Before and After Quantization via Knowledge Distillation
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511383
T. Okuno, Yohei Nakata, Yasunori Ishii, Sotaro Tsukizawa
{"title":"Lossless AI: Toward Guaranteeing Consistency between Inferences Before and After Quantization via Knowledge Distillation","authors":"T. Okuno, Yohei Nakata, Yasunori Ishii, Sotaro Tsukizawa","doi":"10.23919/MVA51890.2021.9511383","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511383","url":null,"abstract":"Deep learning model compression is necessary for real-time inference on edge devices, which have limited hardware resources. Conventional methods have only focused on suppressing degradation in terms of accuracy. Even if a compressed model has almost equivalent accuracy to its reference model, the inference results may change when we focus on individual samples or objects. Such a change is a crucial challenge for the quality assurance of embedded products because of unexpected behavior for specific applications on edge devices. Therefore, we propose a concept called “Loss-less AI” to guarantee consistency between the inference results of reference and compressed models. In this paper, we propose a training method to align inference results between reference and quantized models by applying knowledge distillation that batch normalization statistics are frozen at moving average values from the middle of training. We evaluated the proposed method on several classification datasets and network architectures. In all cases, our method suppressed the inferred class mismatch between reference and quantized models whereas conventional quantization-aware training did not.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131072735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Analysis of Evaluation Metrics with the Distance between Positive Pairs and Negative Pairs in Deep Metric Learning
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511393
Hajime Oi, Rei Kawakami, T. Naemura
{"title":"Analysis of Evaluation Metrics with the Distance between Positive Pairs and Negative Pairs in Deep Metric Learning","authors":"Hajime Oi, Rei Kawakami, T. Naemura","doi":"10.23919/MVA51890.2021.9511393","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511393","url":null,"abstract":"Deep metric learning (DML) acquires embeddings via deep learning, where distances among samples of the same class are shorter than those of different classes. The previous DML studies proposed new metrics to overcome the issues of general metrics, but they have the following two problems; one is that they consider only a small portion of the whole distribution of the data, and the other is that their scores cannot be directly compared among methods when the number of classes is different. To analyze these issues, we consider the histograms of the inner products between arbitrary positive pairs and those of negative pairs. We can evaluate the entire distribution by measuring the distance between the two histograms. By normalizing the histograms by their areas, we can also cancel the effect of the number of classes. In experiments, visualizations of the histograms revealed that the embeddings of the existing DML methods do not generalize well to the validation set. We also confirmed that the evaluation of the distance between the positive and negative histograms is less affected by the variation in the number of classes compared with Recall@1 and MAP@R.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123256463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Cut and paste curriculum learning with hard negative mining for point-of-sale systems
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511391
Jaechul Kim, Xiaoyan Dai, Yi-Jwu Hsieh, Hiroki Tanimoto, H. Fujiyoshi
{"title":"Cut and paste curriculum learning with hard negative mining for point-of-sale systems","authors":"Jaechul Kim, Xiaoyan Dai, Yi-Jwu Hsieh, Hiroki Tanimoto, H. Fujiyoshi","doi":"10.23919/MVA51890.2021.9511391","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511391","url":null,"abstract":"Although point-of-sale (POS) systems generally use barcodes, progress in automation in recent years has come to require real-time performance. Since these systems use machine learning models to detect products from images, the models need to be retrained frequently to support the continual release of new products. Thus, methods for efficiently training a model from a limited amount of data are needed. Curriculum learning was developed to achieve this kind of efficient machine learning. However, curriculum learning in general has the problem that early learning progress is slow. Therefore, we developed a new curriculum learning method using hard negative mining to boost the learning progress. This method provides a remarkable learning effect through simple cut and paste. We test our method on various test data, and the proposed method is found to achieve better performance at the same learning epoch compared with conventional cut and paste methods. We expect our method to contribute to the realization of real-time and easy-to-operate POS systems.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127825997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Model-based Crack Width Estimation using Rectangle Transform
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511346
C. Benz, V. Rodehorst
{"title":"Model-based Crack Width Estimation using Rectangle Transform","authors":"C. Benz, V. Rodehorst","doi":"10.23919/MVA51890.2021.9511346","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511346","url":null,"abstract":"The automated image-based robust estimation of crack widths in concrete structures forms a significant component in the automation of structural health monitoring. The proposed method, called rectangle transform, uses the gray-scale profile extracted perpendicularly to the direction of crack propagation. Based on the concept of an idealized profile, it transforms the empirical profile into an equal-area rectangle from which the width is inferred. On the available dataset and compared to two other approaches, it shows at least par performance for widths larger two pixels and distinctly better performance on widths smaller equal two pixels. Moreover, it is more robust towards blurred input.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115434691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Relational Subgraph for Graph-based Path Prediction
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511362
Masaki Miyata, Katsutoshi Shiraki, H. Minoura, Tsubasa Hirakawa, Takayoshi Yamashita, H. Fujiyoshi
{"title":"Relational Subgraph for Graph-based Path Prediction","authors":"Masaki Miyata, Katsutoshi Shiraki, H. Minoura, Tsubasa Hirakawa, Takayoshi Yamashita, H. Fujiyoshi","doi":"10.23919/MVA51890.2021.9511362","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511362","url":null,"abstract":"Path prediction methods using graph convolutional networks (GCNs) that represent pedestrians' relationships by graphs have been proposed. These GCN-based methods consider only the distance information for the relationship between pedestrians, and the visibility state and other relationships are not taken into account. In this paper, we propose a path prediction method that represents the detailed relationship between pedestrians by introducing relational subgraphs. Each subgraph is constructed on different relationships. The proposed method inputs these relational subgraphs and the distance graph into GCNs and we extract features. Then, the features are input to a temporal convolutional network, which outputs multivariate Gaussian parameters to predict the future path. The experimental results with ETH and UCY datasets show that the proposed method outperforms the conventional method using only the distance information.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"210 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121890380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Video Summarization With Frame Index Vision Transformer
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511350
Tzu-Chun Hsu, Yiping Liao, Chun-Rong Huang
{"title":"Video Summarization With Frame Index Vision Transformer","authors":"Tzu-Chun Hsu, Yiping Liao, Chun-Rong Huang","doi":"10.23919/MVA51890.2021.9511350","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511350","url":null,"abstract":"In this paper, we propose a novel frame index vision transformer for video summarization. Given training frames, we linearly project the content of the frames to obtain frame embedding. By incorporating the frame embedding with the index embedding and class embedding, the proposed frame index vision transformer can be efficiently and effectively applied to learn the importance of the input frames. As shown in the experimental results, the proposed method outperforms the state-of-the-art deep learning methods including recurrent neural network (RNN) and convolutional neural network (CNN) based methods in both of the SumMe and TVSum datasets. In addition, our method can achieve real-time computational efficiency during testing.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"68 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127704015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Recurrent RLCN-Guided Attention Network for Single Image Deraining
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511405
Yizhou Li, Yusuke Monno, M. Okutomi
{"title":"Recurrent RLCN-Guided Attention Network for Single Image Deraining","authors":"Yizhou Li, Yusuke Monno, M. Okutomi","doi":"10.23919/MVA51890.2021.9511405","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511405","url":null,"abstract":"Single image deraining is an important yet challenging task due to the ill-posed nature of the problem to derive the rain-free clean image from a rainy image. In this paper, we propose Recurrent RLCN-Guided Attention Network (RRANet) for single image deraining. Our main technical contributions lie in threefold: (i) We propose rectified local contrast normalization (RLCN) to apply to the input rainy image to effectively mark candidates of rain regions. (ii) We propose RLCN-guided attention module (RLCN-GAM) to learn an effective attention map for the deraining without the necessity of ground-truth rain masks. (iii) We incorporate RLCN-GAM into a recurrent neural network to progressively derive the rainy-to-clean image mapping. The quantitative and qualitative evaluations using representative deraining benchmark datasets demonstrate that our proposed RRANet outperforms existing state-of-the-art deraining methods, where it is particularly noteworthy that our method clearly achieves the best performance on a realworld dataset.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123048790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Optical Model for Show-through Cancellation in Ancient Document Imaging with Dark and Bright Mounts
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511377
Yuri Ueno, Kenichiro Tanaka, Takuya Funatomi, Y. Mukaigawa
{"title":"An Optical Model for Show-through Cancellation in Ancient Document Imaging with Dark and Bright Mounts","authors":"Yuri Ueno, Kenichiro Tanaka, Takuya Funatomi, Y. Mukaigawa","doi":"10.23919/MVA51890.2021.9511377","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511377","url":null,"abstract":"There is a need to make ancient paper documents from Asian countries such as Japan and China more readable. Many of them have writing or illustrations on both sides for reuse, which can show through the paper. In such documents, it is often difficult to distinguish just the front image because of the ink show-through from the back side. Our aim is to obtain clearer front images by removing the show-through from the back side. Hence, we propose an imaging method that uses dark and bright mounts as well as a prediction step based on a simple optical model. We successfully remove the show-through from the back and extract the images of the front of the document.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122597472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Expandable Spherical Projection and Feature Fusion Methods for Object Detection from Fisheye Images
2021 17th International Conference on Machine Vision and Applications (MVA) Pub Date: 2021-07-25 DOI: 10.23919/MVA51890.2021.9511379
Songeun Kim, Soon-Yong Park
{"title":"Expandable Spherical Projection and Feature Fusion Methods for Object Detection from Fisheye Images","authors":"Songeun Kim, Soon-Yong Park","doi":"10.23919/MVA51890.2021.9511379","DOIUrl":"https://doi.org/10.23919/MVA51890.2021.9511379","url":null,"abstract":"One of the key requirements for enhanced autonomous driving systems is accurate detection of the objects from a wide range of view. Large-angle images from a fisheye lens camera can be an effective solution for automotive applications. However, it comes with the cost of strong radial distortions. In particular, the fisheye camera has a photographic effect of exaggerating the size of objects in central regions of the image, while making objects near the marginal area appear smaller. Therefore, we propose the Expandable Spherical Projection that expands center or margin regions to produce straight edges of de-warped objects with less unwanted background in the bounding boxes. In addition to this, we analyze the influence of multi-scale feature fusion in a real-time object detector, which learns to extract more meaningful information for small objects. We present three different types of concatenated YOLOv3-SPP architectures. Moreover, we demonstrate the effectiveness of our proposed projection and feature-fusion using multiple fisheye lens datasets, which shows up to 4.7% AP improvement compared to fisheye images and baseline model.","PeriodicalId":312481,"journal":{"name":"2021 17th International Conference on Machine Vision and Applications (MVA)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121681499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1