BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference: Latest Articles (Page 2)

Learning Object-level Point Augmentor for Semi-supervised 3D Object Detection

BMVC : proceedings of the British Machine Vision Conference. British Machine Vision Conference Pub Date : 2022-12-19 DOI: 10.48550/arXiv.2212.09273

Cheng-Ju Ho, Chen Tai, Yi-Hsuan Tsai, Yen-Yu Lin, Ming-Hsuan Yang

Citations: 2

Free-form 3D Scene Inpainting with Dual-stream GAN

Pub Date : 2022-12-16 DOI: 10.48550/arXiv.2212.08464

Ru-Fen Jheng, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu

Abstract: Nowadays, the need for user editing in a 3D scene has rapidly increased due to the development of AR and VR technology. However, the existing 3D scene completion task (and datasets) cannot suit the need because the missing regions in scenes are generated by the sensor limitation or object occlusion. Thus, we present a novel task named free-form 3D scene inpainting. Unlike scenes in previous 3D completion datasets preserving most of the main structures and hints of detailed shapes around missing regions, the proposed inpainting dataset, FF-Matterport, contains large and diverse missing regions formed by our free-form 3D mask generation algorithm that can mimic human drawing trajectories in 3D space. Moreover, prior 3D completion methods cannot perform well on this challenging yet practical task, simply interpolating nearby geometry and color context. Thus, a tailored dual-stream GAN method is proposed. First, our dual-stream generator, fusing both geometry and color information, produces distinct semantic boundaries and solves the interpolation issue. To further enhance the details, our lightweight dual-stream discriminator regularizes the geometry and color edges of the predicted scenes to be realistic and sharp. We conducted experiments with the proposed FF-Matterport dataset. Qualitative and quantitative results validate the superiority of our approach over existing scene completion methods and the efficacy of all proposed components.

Citations: 5

Dual Moving Average Pseudo-Labeling for Source-Free Inductive Domain Adaptation

Pub Date : 2022-12-15 DOI: 10.48550/arXiv.2212.08187

Hao Yan, Yuhong Guo

Abstract: Unsupervised domain adaptation reduces the reliance on data annotation in deep learning by adapting knowledge from a source to a target domain. For privacy and efficiency concerns, source-free domain adaptation extends unsupervised domain adaptation by adapting a pre-trained source model to an unlabeled target domain without accessing the source data. However, most existing source-free domain adaptation methods to date focus on the transductive setting, where the target training set is also the testing set. In this paper, we address source-free domain adaptation in the more realistic inductive setting, where the target training and testing sets are mutually exclusive. We propose a new semi-supervised fine-tuning method named Dual Moving Average Pseudo-Labeling (DMAPL) for source-free inductive domain adaptation. We first split the unlabeled training set in the target domain into a pseudo-labeled confident subset and an unlabeled less-confident subset according to the prediction confidence scores from the pre-trained source model. Then we propose a soft-label moving-average updating strategy for the unlabeled subset based on a moving-average prototypical classifier, which gradually adapts the source model towards the target domain. Experiments show that our proposed method achieves state-of-the-art performance and outperforms previous methods by large margins.
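The two steps named in the abstract above, a confidence-based split and a moving-average soft-label update, can be illustrated with a minimal sketch. This is not the authors' implementation: the threshold `0.9`, the momentum values, and the function names `split_by_confidence` and `ema_update` are assumptions made for illustration, and the moving-average prototypical classifier that would supply the new predictions is omitted.

```python
def split_by_confidence(probs, threshold=0.9):
    """Split sample indices by the source model's top predicted probability.

    Returns (confident, uncertain): `confident` maps index -> pseudo-label,
    `uncertain` lists the indices that stay unlabeled.
    The threshold value is a hypothetical choice, not taken from the paper.
    """
    confident, uncertain = {}, []
    for i, p in enumerate(probs):
        top = max(range(len(p)), key=p.__getitem__)  # index of the most likely class
        if p[top] >= threshold:
            confident[i] = top
        else:
            uncertain.append(i)
    return confident, uncertain


def ema_update(old, new, momentum=0.9):
    """Exponential moving average of two soft-label (probability) vectors."""
    return [momentum * o + (1.0 - momentum) * n for o, n in zip(old, new)]


# Toy source-model predictions for three target samples, two classes.
probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
confident, uncertain = split_by_confidence(probs, threshold=0.9)

# Blend the old soft label of the uncertain sample with a new prediction
# (here a made-up vector standing in for the prototypical classifier's output).
soft = ema_update(probs[1], [0.2, 0.8], momentum=0.5)
```

Because both inputs to `ema_update` are probability vectors and the weights sum to one, the blended soft label remains a valid probability vector.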

Citations: 2

Resolving Semantic Confusions for Improved Zero-Shot Detection

Pub Date : 2022-12-12 DOI: 10.48550/arXiv.2212.06097

Sandipan Sarma, Sushil Kumar, A. Sur

Abstract: Zero-shot detection (ZSD) is a challenging task where we aim to recognize and localize objects simultaneously, even when our model has not been trained with visual samples of a few target ("unseen") classes. Recently, methods employing generative models like GANs have shown some of the best results, where unseen-class samples are generated based on their semantics by a GAN trained on seen-class data, enabling vanilla object detectors to recognize unseen objects. However, the problem of semantic confusion still remains, where the model is sometimes unable to distinguish between semantically-similar classes. In this work, we propose to train a generative model incorporating a triplet loss that acknowledges the degree of dissimilarity between classes and reflects them in the generated samples. Moreover, a cyclic-consistency loss is also enforced to ensure that generated visual samples of a class highly correspond to their own semantics. Extensive experiments on two benchmark ZSD datasets - MSCOCO and PASCAL-VOC - demonstrate significant gains over the current ZSD methods, reducing semantic confusion and improving detection for the unseen classes.
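As a rough illustration of a triplet loss that takes class dissimilarity into account, the sketch below uses the standard hinge-style triplet loss and, as an assumption of this example only, sets the margin in proportion to the distance between two class semantic vectors. The abstract does not say how dissimilarity enters the loss, so `dissimilarity_margin` and its `scale` parameter are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin):
    """Standard hinge triplet loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def dissimilarity_margin(sem_a, sem_b, scale=1.0):
    """Hypothetical margin that grows with the distance between the
    semantic vectors of two classes (an assumption, not the paper's rule)."""
    return scale * euclidean(sem_a, sem_b)


# Toy features: the far negative already satisfies the margin, the near one does not.
anchor, positive = [0.0, 0.0], [0.0, 1.0]
loss_far = triplet_loss(anchor, positive, [3.0, 4.0], margin=2.0)
loss_near = triplet_loss(anchor, positive, [1.0, 0.0], margin=2.0)
```

With a dissimilarity-dependent margin, semantically distant classes are pushed farther apart than semantically close ones, which is one plausible way for generated samples to reflect class relationships.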

Citations: 2

Non-uniform Sampling Strategies for NeRF on 360° images

Pub Date : 2022-12-07 DOI: 10.48550/arXiv.2212.03635

Takashi Otonari, Satoshi Ikehata, K. Aizawa

Abstract: In recent years, the performance of novel view synthesis using perspective images has dramatically improved with the advent of neural radiance fields (NeRF). This study proposes two novel techniques that effectively build NeRF for 360° omnidirectional images. Due to the characteristics of a 360° image of ERP format that has spatial distortion in its high-latitude regions and a 360° wide viewing angle, NeRF's general ray sampling strategy is ineffective. Hence, the view synthesis accuracy of NeRF is limited and learning is not efficient. We propose two non-uniform ray sampling schemes for NeRF to suit 360° images - distortion-aware ray sampling and content-aware ray sampling. We created an evaluation dataset Synth360 using Replica and SceneCity models of indoor and outdoor scenes, respectively. In experiments, we show that our proposal successfully builds 360° image NeRF in terms of both accuracy and efficiency. The proposal is widely applicable to advanced variants of NeRF. DietNeRF, AugNeRF, and NeRF++ combined with the proposed techniques further improve the performance. Moreover, we show that our proposed method enhances the quality of real-world scenes in 360° images. Synth360: https://drive.google.com/drive/folders/1suL9B7DO2no21ggiIHkH3JF3OecasQLb.
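One way to picture distortion-aware sampling on an equirectangular (ERP) image is to draw pixel rows with probability proportional to the cosine of their latitude, since rows near the poles cover far less of the sphere than rows at the equator. The sketch below assumes exactly that weighting; the abstract does not give the paper's actual sampling distribution, and `row_latitude`, `row_weights`, and `sample_pixels` are names invented for this example.

```python
import math
import random


def row_latitude(row, height):
    """Latitude in radians of the centre of an ERP image row (row 0 is the top)."""
    return (0.5 - (row + 0.5) / height) * math.pi


def row_weights(height):
    """Per-row sampling weights proportional to cos(latitude), normalised to sum to 1.

    The cos(latitude) weighting is an assumption of this sketch.
    """
    raw = [math.cos(row_latitude(r, height)) for r in range(height)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_pixels(height, width, n, seed=0):
    """Draw n (row, col) pixel coordinates: rows by latitude weight, columns uniformly."""
    rng = random.Random(seed)
    rows = rng.choices(range(height), weights=row_weights(height), k=n)
    return [(r, rng.randrange(width)) for r in rows]


weights = row_weights(8)
pixels = sample_pixels(8, 16, 5)
```

Rays would then be cast through the sampled pixels, so fewer training rays are spent on the stretched polar rows than under uniform pixel sampling.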

Citations: 0

Multi-Task Edge Prediction in Temporally-Dynamic Video Graphs

Pub Date : 2022-12-06 DOI: 10.48550/arXiv.2212.02875

Osman Ulger, Julian Wiederer, Mohsen Ghafoorian, Vasileios Belagiannis, P. Mettes

Abstract: Graph neural networks have been shown to learn effective node representations, enabling node-, link-, and graph-level inference. Conventional graph networks assume static relations between nodes, while relations between entities in a video often evolve over time, with nodes entering and exiting dynamically. In such temporally-dynamic graphs, a core problem is inferring the future state of spatio-temporal edges, which can constitute multiple types of relations. To address this problem, we propose MTD-GNN, a graph network for predicting temporally-dynamic edges for multiple types of relations. We propose a factorized spatio-temporal graph attention layer to learn dynamic node representations and present a multi-task edge prediction loss that models multiple relations simultaneously. The proposed architecture operates on top of scene graphs that we obtain from videos through object detection and spatio-temporal linking. Experimental evaluations on ActionGenome and CLEVRER show that modeling multiple relations in our temporally-dynamic graph network can be mutually beneficial, outperforming existing static and spatio-temporal graph neural networks, as well as state-of-the-art predicate classification methods.

Citations: 0

DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detection

Pub Date : 2022-12-05 DOI: 10.48550/arXiv.2212.02057

Ziyuan Zhao, Ming Xu, Peisheng Qian, R. Pahwa, Richard Chang

Abstract: Deep learning has achieved notable success in 3D object detection with the advent of large-scale point cloud datasets. However, severe performance degradation in the past trained classes, i.e., catastrophic forgetting, still remains a critical issue for real-world deployment when the number of classes is unknown or may vary. Moreover, existing 3D class-incremental detection methods are developed for the single-domain scenario, which fail when encountering domain shift caused by different datasets, varying environments, etc. In this paper, we identify the unexplored yet valuable scenario, i.e., class-incremental learning under domain shift, and propose a novel 3D domain adaptive class-incremental object detection framework, DA-CIL, in which we design a novel dual-domain copy-paste augmentation method to construct multiple augmented domains for diversifying training distributions, thereby facilitating gradual domain adaptation. Then, multi-level consistency is explored to facilitate dual-teacher knowledge distillation from different domains for domain adaptive class-incremental learning. Extensive experiments on various datasets demonstrate the effectiveness of the proposed method over baselines in the domain adaptive class-incremental learning scenario.

Citations: 3

ViewNeRF: Unsupervised Viewpoint Estimation Using Category-Level Neural Radiance Fields

Pub Date : 2022-12-01 DOI: 10.48550/arXiv.2212.00436

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

Citations: 1

Part-based Face Recognition with Vision Transformers

Pub Date : 2022-11-30 DOI: 10.48550/arXiv.2212.00057

Zhonglin Sun, Georgios Tzimiropoulos

Citations: 2

G-CMP: Graph-enhanced Contextual Matrix Profile for unsupervised anomaly detection in sensor-based remote health monitoring

Pub Date : 2022-11-29 DOI: 10.48550/arXiv.2211.16122

Nivedita Bijlani, Oscar Alejandro Mendez Maldonado, S. Kouchaki

Abstract: Sensor-based remote health monitoring is used in industrial, urban and healthcare settings to monitor ongoing operation of equipment and human health. An important aim is to intervene early if anomalous events or adverse health is detected. In the wild, these anomaly detection approaches are challenged by noise, label scarcity, high dimensionality, explainability and wide variability in operating environments. The Contextual Matrix Profile (CMP) is a configurable 2-dimensional version of the Matrix Profile (MP) that uses the distance matrix of all subsequences of a time series to discover patterns and anomalies. The CMP is shown to enhance the effectiveness of the MP and other SOTA methods at detecting, visualising and interpreting true anomalies in noisy real world data from different domains. It excels at zooming out and identifying temporal patterns at configurable time scales. However, the CMP does not address cross-sensor information, and cannot scale to high dimensional data. We propose a novel, self-supervised graph-based approach for temporal anomaly detection that works on context graphs generated from the CMP distance matrix. The learned graph embeddings encode the anomalous nature of a time context. In addition, we evaluate other graph outlier algorithms for the same task. Given our pipeline is modular, graph construction, generation of graph embeddings, and pattern recognition logic can all be chosen based on the specific pattern detection application. We verified the effectiveness of graph-based anomaly detection and compared it with the CMP and 3 state-of-the-art methods on two real-world healthcare datasets with different anomalies. Our proposed method demonstrated better recall, alert rate and generalisability.
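To make the distance matrix of all subsequences concrete, the following naive sketch computes the z-normalised Euclidean distance between every pair of length-`m` windows and then reduces each row to its nearest non-trivial match, which is the classic Matrix Profile. It does not implement the CMP's configurable aggregation or the graph-based method from the abstract; the quadratic-time loop, the exclusion zone of `m // 2`, and the toy series are assumptions chosen for readability.

```python
import math


def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def distance_matrix(series, m):
    """Pairwise Euclidean distances between all z-normalised length-m subsequences."""
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    return [[math.dist(a, b) for b in subs] for a in subs]


def matrix_profile(series, m):
    """For each subsequence, the distance to its nearest non-trivial neighbour.

    Neighbours within m // 2 positions are skipped so a window does not
    match itself or a near-copy of itself (the zone width is an assumption).
    """
    dist = distance_matrix(series, m)
    excl = m // 2
    profile = []
    for i, row in enumerate(dist):
        candidates = [d for j, d in enumerate(row) if abs(i - j) > excl]
        profile.append(min(candidates))
    return profile


# Periodic toy signal with a single spike at index 7.
series = [0, 1, 0, 1, 0, 1, 0, 5, 0, 1, 0, 1, 0, 1]
profile = matrix_profile(series, m=4)
```

Windows that repeat elsewhere in the series get a profile value of zero, while the windows overlapping the spike have no close match and therefore stand out with positive values.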

Citations: 0