Proceedings of the 2023 5th International Conference on Image Processing and Machine Vision: Latest Publications

Performance Evaluation of Recent Object Detection Models for Traffic Safety Applications on Edge
Anilcan Bulut, Fatmanur Ozdemir, Y. S. Bostanci, M. Soyturk
DOI: https://doi.org/10.1145/3582177.3582178 | Published: 2023-01-13
Abstract: Real-time object detection is becoming more important and critical in all application areas, including Smart Transport and Smart City. From safety and security to resource efficiency, real-time image processing approaches are used more than ever. At the same time, low-latency requirements and limited available resources present challenges. Edge computing integrated with cloud computing minimizes communication delays but demands efficient use of the edge device's limited resources. Deep learning-based object detection methods, for example, give very accurate and reliable results but require high computational power. This overhead creates a need for deep learning models with less complex architectures for edge deployment. In this paper, the performance of recent lightweight deep learning models, including YOLOv5-Nano, YOLOX-Nano, YOLOX-Tiny, YOLOv6-Nano, YOLOv6-Tiny, and YOLOv7-Tiny, is evaluated on a commercially available edge device. The results show that the TensorRT versions of YOLOv5-Nano and YOLOv6-Nano provide real-time applicability with approximately 35 milliseconds of inference time. YOLOv6-Tiny gives the highest average precision, while YOLOv5-Nano has the lowest energy consumption among the compared models.
Citations: 0
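The roughly 35 ms figure comes from measuring per-frame inference latency on the device. As a minimal sketch of how such a measurement is typically set up (not the authors' benchmark code), the snippet below times a YOLOv5-Nano forward pass with warm-up and GPU synchronization; the torch.hub model source and the 640x640 input size are assumptions.

```python
# Minimal latency benchmark for a lightweight detector (sketch).
# Assumes the ultralytics/yolov5 torch.hub entry point; any detector
# (e.g. a TensorRT engine wrapper) behind the same call works.
import time
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5n")  # YOLOv5-Nano
model.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

dummy = torch.rand(1, 3, 640, 640, device=device)  # one 640x640 frame

with torch.no_grad():
    for _ in range(10):               # warm-up: stabilize clocks and caches
        model(dummy)
    if device == "cuda":
        torch.cuda.synchronize()      # wait for queued kernels before timing
    start = time.perf_counter()
    runs = 100
    for _ in range(runs):
        model(dummy)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed_ms = (time.perf_counter() - start) / runs * 1e3

print(f"mean inference latency: {elapsed_ms:.1f} ms/frame")
```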
Matching Images from Different Viewpoints with Deep Learning Based on LoFTR and MAGSAC++
Liang Tian
DOI: https://doi.org/10.1145/3582177.3582181 | Published: 2023-01-13
Abstract: Matching 2D images from different viewpoints plays a crucial role in Structure-from-Motion and 3D reconstruction. However, matching assorted, unstructured images taken from a wide variety of viewpoints is difficult for traditional methods. In this paper, we propose a Transformer-based feature matching approach that captures the same physical points of a scene from two images with different viewpoints. Local image features are extracted by LoFTR, a detector-free deep learning matching model built on the Transformer. The subsequent matching is carried out by the MAGSAC++ estimator, and the matching results are summarized in the fundamental matrix as the model output. By removing image feature points with low confidence scores and applying test-time augmentation, our approach reaches a mean Average Accuracy of 0.81340 in the Kaggle competition Image Matching Challenge 2022, ranking 45th of 642 on the competition leaderboard and earning a silver medal. Our work could help accelerate research on generalized methods for Structure-from-Motion and 3D reconstruction, and would potentially deepen the understanding of image feature matching and related fields.
Citations: 1
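The LoFTR-then-MAGSAC++ pipeline described in the abstract can be sketched with off-the-shelf components: kornia ships a pretrained LoFTR matcher, and OpenCV exposes MAGSAC++ as cv2.USAC_MAGSAC. This is a minimal illustration of the pipeline, not the author's code; the file names and the 0.5 confidence cutoff are assumptions.

```python
# Sketch: LoFTR matches -> confidence filtering -> MAGSAC++ fundamental matrix.
import cv2
import torch
import kornia as K
import kornia.feature as KF

def load_gray(path):
    # Load as single-channel float image, add a batch dim: (1, 1, H, W).
    return K.io.load_image(path, K.io.ImageLoadType.GRAY32)[None]

matcher = KF.LoFTR(pretrained="outdoor").eval()

img0, img1 = load_gray("view0.jpg"), load_gray("view1.jpg")  # illustrative paths
with torch.no_grad():
    out = matcher({"image0": img0, "image1": img1})

conf = out["confidence"].cpu().numpy()
keep = conf > 0.5                              # drop low-confidence matches
pts0 = out["keypoints0"].cpu().numpy()[keep]
pts1 = out["keypoints1"].cpu().numpy()[keep]

# MAGSAC++ robustly estimates F and flags inlier correspondences.
F, inliers = cv2.findFundamentalMat(pts0, pts1, cv2.USAC_MAGSAC,
                                    0.25, 0.9999, 10000)
print(f"kept {int(inliers.sum())} / {len(pts0)} matches")
```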
Penetration Point Detection for Autonomous Trench Excavation Based on Binocular Vision
Jiangying Zhao, Y. Hu, Mingrui Tian, Xiaohua Xia, Peng Tan
DOI: https://doi.org/10.1145/3582177.3582191 | Published: 2023-01-13
Abstract: To autonomously detect the penetration point in the working area of trench excavation, a binocular-camera-based feature detection method for the penetration point is proposed. First, a homogeneous coordinate transformation is established to convert the 3D point cloud of the excavation area from the camera coordinate system to the excavator's global base coordinate system. Then, a global gradient consistency function is designed to describe the geometric feature of a trench's penetration point, and the position coordinates of the penetration point are detected. Finally, a penetration point detection test is conducted in the excavation area. Within the range of the excavation operation, the maximum position error of the detection is less than 80 mm and the average error is 46.2 mm, which shows that the method can effectively detect the penetration point.
Citations: 0
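The method's first step, converting the stereo point cloud from the camera frame to the excavator's global base frame via a homogeneous coordinate transformation, reduces to one 4x4 matrix multiply. Below is a minimal numpy sketch, assuming the rotation R and translation t of the camera with respect to the base are known from extrinsic calibration; the values in the usage example are illustrative only.

```python
# Map a 3D point cloud from the camera frame into the base frame
# using a 4x4 homogeneous transform built from (R, t).
import numpy as np

def camera_to_base(points_cam: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """points_cam: (N, 3) points in the camera frame.
    R: (3, 3) rotation, t: (3,) translation of the camera w.r.t. the base."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    homog = np.hstack([points_cam, np.ones((len(points_cam), 1))])  # (N, 4)
    return (T @ homog.T).T[:, :3]   # back to (N, 3), now in base coordinates

# Illustrative usage with made-up extrinsics:
R = np.eye(3)                        # camera axes aligned with base axes
t = np.array([1.2, 0.0, 2.5])        # camera 1.2 m forward, 2.5 m up (assumed)
cloud_base = camera_to_base(np.random.rand(1000, 3), R, t)
```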
Semi-supervised Defect Segmentation with Uncertainty-aware Pseudo-labels from Multi-branch Network
Dejene M. Sime, Guotai Wang, Zhi Zeng, Bei Peng
DOI: https://doi.org/10.1145/3582177.3582190 | Published: 2023-01-13
Abstract: Semi-supervised learning methods have recently gained considerable attention for training deep learning networks with limited labeled samples and additional large label-free samples. Consistency regularization and pseudo-labeling are among the most widely used semi-supervised methods; however, unreliable pseudo labels largely limit a model's performance when learning from unlabeled images. To alleviate this problem, we propose uncertainty-rectified pseudo labels, generated by dynamically mixing the predictions of multiple decoders that share one encoder, for semi-supervised defect segmentation. The uncertainty is estimated as the discrepancy between the average prediction and the output of each decoder head, and it guides both the consistency training and the pseudo-label supervision. The proposed method achieves significant improvement over the fully supervised baseline and other state-of-the-art semi-supervised segmentation methods at comparable labeled-data proportions. An extensive ablation study further demonstrates that the method performs well under various setups.
Citations: 0
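The core mechanism, estimating uncertainty as the discrepancy between each decoder head's prediction and the heads' average and using it to gate the pseudo-label loss, can be sketched in a few lines of PyTorch. This is a simplified reading of the abstract, not the paper's implementation; the mean-squared discrepancy measure and the fixed threshold are assumptions.

```python
# Sketch: ensemble pseudo-labels from multiple decoder heads, with
# per-pixel uncertainty used to mask out unreliable supervision.
import torch
import torch.nn.functional as F

def uncertainty_rectified_pseudo_labels(head_logits, threshold=0.1):
    """head_logits: list of (B, C, H, W) logits from K decoder heads."""
    probs = torch.stack([torch.softmax(l, dim=1) for l in head_logits])  # (K,B,C,H,W)
    mean_prob = probs.mean(dim=0)                                        # (B,C,H,W)
    # Discrepancy of each head from the ensemble mean, averaged over heads and classes.
    uncertainty = ((probs - mean_prob) ** 2).mean(dim=(0, 2))            # (B,H,W)
    pseudo = mean_prob.argmax(dim=1)                                     # hard labels
    weight = (uncertainty < threshold).float()                           # keep confident pixels
    return pseudo, weight

def masked_pseudo_label_loss(student_logits, pseudo, weight):
    loss = F.cross_entropy(student_logits, pseudo, reduction="none")     # (B,H,W)
    return (loss * weight).sum() / weight.sum().clamp(min=1.0)
```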
Case study of 3D Interactive Design of Nanjing Tongji Gate Wall
Yang Cao, Yuan Gao, Ang Geng
DOI: https://doi.org/10.1145/3582177.3582189 | Published: 2023-01-13
Abstract: Through a case study of the 3D interactive design of the Nanjing Tongji Gate Wall, this paper explores the application of 3D interactive design to the visual reconstruction of relics. The paper collects the literature on the Nanjing City Wall and conducts field research and interviews about Tongji Gate. On the basis of the relevant data and literature, 3D interactive technology is used to design an interactive animation system for Tongji Gate. 3D interactive animation contributes to the visual reconstruction of material cultural heritage, improves the protection and study of traditional cultural relics, adapts to communication and promotion in the new media era, and supports the protection and continued inheritance of cultural heritage.
Citations: 0
Action Recognition with Non-Uniform Key Frame Selector
Haohe Li, Chong Wang, Shenghao Yu, Chenchen Tao
DOI: https://doi.org/10.1145/3582177.3582182 | Published: 2023-01-13
Abstract: Current approaches to spatiotemporal action recognition have achieved impressive progress, especially in temporal information processing, while the power of spatial information may be underestimated. Thus, a non-uniform key frame selector is proposed to pick the most representative frames according to the relationships between frames along the temporal dimension. Specifically, reweighted high-level frame features are used to generate an importance score sequence, and the key frame in each temporal section is selected based on these scores. The selected frames carry richer semantic information, which has a positive impact on network training. The proposed model is evaluated on two action recognition datasets, HMDB51 and UCF101, and achieves promising accuracy improvements.
Citations: 0
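The selection rule in the abstract, one key frame per temporal section chosen by importance score, can be sketched as follows. How the scores are produced (from reweighted high-level features) is the paper's model; here they are simply an input tensor, and the uniform section grid is an assumption, so this illustrates the selection step only.

```python
# Sketch: split the per-frame importance scores into temporal sections
# and keep the top-scoring frame in each, giving non-uniform key frames.
import torch

def select_key_frames(scores: torch.Tensor, num_sections: int) -> torch.Tensor:
    """scores: (T,) per-frame importance scores. Returns one index per section."""
    T = scores.shape[0]
    assert T >= num_sections, "need at least one frame per section"
    bounds = torch.linspace(0, T, num_sections + 1).long()  # section boundaries
    picks = []
    for i in range(num_sections):
        lo, hi = bounds[i].item(), bounds[i + 1].item()
        picks.append(lo + scores[lo:hi].argmax().item())    # best frame in section
    return torch.tensor(picks)

scores = torch.rand(64)                    # e.g. a 64-frame clip
print(select_key_frames(scores, 8))        # 8 key frames, one per section
```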
Proceedings of the 2023 5th International Conference on Image Processing and Machine Vision
DOI: https://doi.org/10.1145/3582177
Citations: 0