2023 18th International Conference on Machine Vision and Applications (MVA): Latest Publications

Domain Adaptation from Visible-Light to FIR with Reliable Pseudo Labels
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10216102
Juki Tanimoto, Haruya Kyutoku, Keisuke Doman, Y. Mekada
{"title":"Domain Adaptation from Visible-Light to FIR with Reliable Pseudo Labels","authors":"Juki Tanimoto, Haruya Kyutoku, Keisuke Doman, Y. Mekada","doi":"10.23919/MVA57639.2023.10216102","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10216102","url":null,"abstract":"Deep learning object detection models using visible-light cameras are easily affected by weather and lighting conditions, whereas those using far-infrared cameras are less affected by such conditions. This paper proposes a domain adaptation method using pseudo labels from a visible-light camera toward an accurate object detection from far-infrared images. Our method projects visible light-domain detection results onto far-infrared images, and uses them as pseudo labels for training a far-infrared detection model. We confirmed the effectiveness of our method through experiments.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115276711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Intra-frame Skeleton Constraints Modeling and Grouping Strategy Based Multi-Scale Graph Convolution Network for 3D Human Motion Prediction
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10216076
Zhihan Zhuang, Yuan Li, Songlin Du, T. Ikenaga
{"title":"Intra-frame Skeleton Constraints Modeling and Grouping Strategy Based Multi-Scale Graph Convolution Network for 3D Human Motion Prediction","authors":"Zhihan Zhuang, Yuan Li, Songlin Du, T. Ikenaga","doi":"10.23919/MVA57639.2023.10216076","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10216076","url":null,"abstract":"Attention-based feed-forward networks and graph convolution networks have recently shown great promise in 3D skeleton-based human motion prediction for their good performance in learning temporal and spatial relations. However, previous methods have two critical issues: first, spatial dependencies for distal joints in each independent frame are hard to learn; second, the basic architecture of graph convolution network ignores hierarchical structure and diverse motion patterns of different body parts. To address these issues, this paper proposes an intra-frame skeleton constraints modeling method and a Grouping based Multi-Scale Graph Convolution Network (GMS-GCN) model. The intra-frame skeleton constraints modeling method leverages self-attention mechanism and a designed adjacency matrix to model the skeleton constraints of distal joints in each independent frame. The GMS-GCN utilizes a grouping strategy to learn the dynamics of various body parts separately. Instead of mapping features in the same feature space, GMS-GCN extracts human body features in different dimensions by up-sample and down-sample GCN layers. Experiment results demonstrate that our method achieves an average MPJPE of 34.7mm for short-term prediction and 93.2mm for long-term prediction and both outperform the state-of-the-art approaches.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"353 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122791411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Outline Generation Transformer for Bilingual Scene Text Recognition
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10216107
Jui-Teng Ho, G. Hsu, S. Yanushkevich, M. Gavrilova
{"title":"Outline Generation Transformer for Bilingual Scene Text Recognition","authors":"Jui-Teng Ho, G. Hsu, S. Yanushkevich, M. Gavrilova","doi":"10.23919/MVA57639.2023.10216107","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10216107","url":null,"abstract":"We propose the Outline Generation Transformer (OGT) for bilingual Scene Text Recognition (STR). As most STR approaches focus on English, we consider both English and Chinese as Chinese is also a major language, and it is a common scene in many areas/countries where both languages can be seen. The OGT consists of an Outline Generator (OG) and a transformer with a language model embedded. The OG detects the character outline of the text and embeds the outline features into a transformer with the outline-query cross-attention layer to better locate each character and enhance the text recognition performance. The training of OGT has two phases, one is training on synthetic data where the text outline masks are made available, followed by the other training on real data where the text outline masks can only be estimated. The proposed OGT is evaluated on several benchmark datasets and compared with state-of-the-art methods.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126492214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-class Semantic Segmentation of Tooth Pathologies and Anatomical Structures on Bitewing and Periapical Radiographs
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10215653
James-Andrew R. Sarmiento, Liushifeng Chen, P. Naval
{"title":"Multi-class Semantic Segmentation of Tooth Pathologies and Anatomical Structures on Bitewing and Periapical Radiographs","authors":"James-Andrew R. Sarmiento, Liushifeng Chen, P. Naval","doi":"10.23919/MVA57639.2023.10215653","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10215653","url":null,"abstract":"Detecting dental problems early can prevent invasive procedures and reduce healthcare costs, but traditional exams may not identify all issues, making radiography essential. However, interpreting X-rays can be time-consuming, subjective, prone to error, and requires specialized knowledge. Automated segmentation methods using AI can improve interpretation and aid in diagnosis and patient education. Our U-Net model, trained on 344 bitewing and periapical X-rays, can identify two pathologies and eight anatomical features. It achieves an overall diagnostic performance of 0.794 and 0.787 in terms of Dice score and sensitivity, respectively, 0.493 and 0.405 for dental caries, and 0.471 and 0.44 for root infections. This successful application of deep learning to dental imaging demonstrates the potential of automated segmentation methods for improving accuracy and efficiency in diagnosing and treating dental disorders.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127414661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-Plane Projection for Extending Perspective Image Object Detection Models to 360° Images
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10215689
Yasuto Nagase, Y. Babazaki, Katsuhiko Takahashi
{"title":"Multi-Plane Projection for Extending Perspective Image Object Detection Models to 360° Images","authors":"Yasuto Nagase, Y. Babazaki, Katsuhiko Takahashi","doi":"10.23919/MVA57639.2023.10215689","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10215689","url":null,"abstract":"Since 360° cameras are still in their diffusion phase, there are no large annotated datasets or models trained on them as there are for perspective cameras. Creating new 360°-specific datasets and training recognition models for each domain and tasks have a significant barrier for many users aiming at practical applications. Therefore, we propose a novel technique to effectively adapt the existing models to 360° images. The 360° images are projected to multiple planes and adapted to the existing model, and the detected results are unified in a spherical coordinate system. In experiments, we evaluated our method on an object detection task and compared it to baselines, which showed an improvement in recognition accuracy of up to 6.7%.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116857496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Safe Landing Zone Detection for UAVs using Image Segmentation and Super Resolution
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10215759
Anagh Benjwal, Prajwal Uday, Aditya Vadduri, Abhishek Pai
{"title":"Safe Landing Zone Detection for UAVs using Image Segmentation and Super Resolution","authors":"Anagh Benjwal, Prajwal Uday, Aditya Vadduri, Abhishek Pai","doi":"10.23919/MVA57639.2023.10215759","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10215759","url":null,"abstract":"Increased usage of UAVs in urban environments has led to the necessity of safe and robust emergency landing zone detection techniques. This paper presents a novel approach for detecting safe landing zones for UAVs using deep learning-based image segmentation. Our approach involves using a custom dataset to train a CNN model. To account for low-resolution input images, our approach incorporates a Super-Resolution model to upscale low-resolution images before feeding them into the segmentation model. The proposed approach achieves robust and accurate detection of safe landing zones, even on low-resolution images. Experimental results demonstrate the effectiveness of our method and show a marked improvement of upto 6.3% in accuracy over state-of-the-art safe landing zone detection methods.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128689719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Joint learning of images and videos with a single Vision Transformer
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10215661
Shuki Shimizu, Toru Tamaki
{"title":"Joint learning of images and videos with a single Vision Transformer","authors":"Shuki Shimizu, Toru Tamaki","doi":"10.23919/MVA57639.2023.10215661","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10215661","url":null,"abstract":"In this study, we propose a method for jointly learning of images and videos using a single model. In general, images and videos are often trained by separate models. We propose in this paper a method that takes a batch of images as input to Vision Transformer (IV-ViT), and also a set of video frames with temporal aggregation by late fusion. Experimental results on two image datasets and two action recognition datasets are presented.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127324476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Contrastive Knowledge Distillation for Anomaly Detection in Multi-Illumination/Focus Display Images
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10215808
Jihyun Lee, Hangi Park, Yongmin Seo, Taewon Min, Joodong Yun, Jaewon Kim, Tae-Kyun Kim
{"title":"Contrastive Knowledge Distillation for Anomaly Detection in Multi-Illumination/Focus Display Images","authors":"Jihyun Lee, Hangi Park, Yongmin Seo, Taewon Min, Joodong Yun, Jaewon Kim, Tae-Kyun Kim","doi":"10.23919/MVA57639.2023.10215808","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10215808","url":null,"abstract":"In this paper, we tackle automatic anomaly detection in multi-illumination and multi-focus display images. The minute defects on the display surface are hard to spot out in RGB images and by a model trained with only normal data. To address this, we propose a novel contrastive learning scheme for knowledge distillation-based anomaly detection. In our framework, Multiresolution Knowledge Distillation (MKD) is adopted as a baseline, which operates by measuring feature similarities between the teacher and student networks. Based on MKD, we propose a novel contrastive learning method, namely Multiresolution Contrastive Distillation (MCD), which does not require positive/negative pairs with an anchor but operates by pulling/pushing the distance between the teacher and student features. Furthermore, we propose the blending module that transforms and aggregate multi-channel information to the three-channel input layer of MCD. Our proposed method significantly outperforms competitive state-of-the-art methods in both AUROC and accuracy metrics on the collected Multi-illumination and Multi-focus display image dataset for Anomaly Detection (MMdAD).","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129295108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Automated Identification of Surgical Instruments without Tagging: Implementation in Real Hospital Work Environment
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/MVA57639.2023.10216222
Rui Ishiyama, Per Helge Litzheim Frøiland, Stein-Asle Øvrebotn
{"title":"Automated Identification of Surgical Instruments without Tagging: Implementation in Real Hospital Work Environment","authors":"Rui Ishiyama, Per Helge Litzheim Frøiland, Stein-Asle Øvrebotn","doi":"10.23919/MVA57639.2023.10216222","DOIUrl":"https://doi.org/10.23919/MVA57639.2023.10216222","url":null,"abstract":"This paper presents a new practical system to track and trace individual surgical instruments without marking or tagging. Individual identification is fundamental to traceability, documentation, and optimization for patient safety, compliance, economy, and the environment. However, existing identification systems have yet to be adopted by most hospitals due to the costs and risks of tagging or marking. The \"Fingerprint of Things\" recognition technology enables tag-less identification; however, scanning automation to save labor costs, which should be devoted to patient care, is also essential for practical use. We developed a new system concept that automates the detection, type recognition, fingerprint scanning, and identification of every instrument in the workspace. A prototype solution has also been implemented and tested in real hospital work. The feasibility of our solution as a commercial product is verified by its order for adoption.","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131303166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Most Influential Paper over the Decade Award
2023 18th International Conference on Machine Vision and Applications (MVA) Pub Date: 2023-07-23 DOI: 10.23919/mva57639.2023.10215707
{"title":"Most Influential Paper over the Decade Award","authors":"","doi":"10.23919/mva57639.2023.10215707","DOIUrl":"https://doi.org/10.23919/mva57639.2023.10215707","url":null,"abstract":"","PeriodicalId":338734,"journal":{"name":"2023 18th International Conference on Machine Vision and Applications (MVA)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125620476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0