Cross-Modal Knowledge Transfer Without Task-Relevant Source Data

Computer vision - ECCV ... : ... European Conference on Computer Vision : proceedings. European Conference on Computer Vision
Pub Date: 2022-09-08
DOI: 10.48550/arXiv.2209.04027

Sk. Miraj Ahmed, Suhas Lohit, Kuan-Chuan Peng, Michael Jones, A. Roy-Chowdhury

Citations: 8

Abstract

Cost-effective depth and infrared sensors are now a practical alternative to conventional RGB sensors, and they offer advantages over RGB in domains such as autonomous navigation and remote sensing. Building computer vision and deep learning systems for depth and infrared data is therefore crucial. However, large labeled datasets for these modalities are still lacking. In such cases, it is valuable to transfer knowledge from a neural network trained on a large, well-labeled dataset in the source modality (RGB) to a neural network that operates on a target modality (depth, infrared, etc.). For reasons such as memory and privacy, the source data may not be accessible, so knowledge transfer must work with the source models alone. We describe an effective solution, SOCKET (SOurce-free Cross-modal KnowledgE Transfer), for this challenging task of transferring knowledge from one source modality to a different target modality without access to task-relevant source data. The framework reduces the modality gap in two ways: by using paired task-irrelevant data, and by matching the mean and variance of the target features with the batch-norm statistics stored in the source models. We show through extensive experiments on classification tasks that our method significantly outperforms existing source-free methods, which do not account for the modality gap.
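The statistics-matching idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact form of the loss, so the squared-difference penalty, the function names `batch_stats` and `bn_matching_loss`, and the toy numbers below are all assumptions made for illustration. The sketch computes the per-channel mean and variance of a batch of target features and penalizes their distance from the running mean and variance that a batch-norm layer of a source model would store.

```python
from statistics import fmean


def batch_stats(features):
    """Per-channel mean and biased variance of a batch of feature vectors.

    `features` is a list of equal-length lists: one feature vector per sample.
    """
    channels = list(zip(*features))  # transpose: one tuple per channel
    means = [fmean(c) for c in channels]
    variances = [
        fmean([(x - m) ** 2 for x in c]) for c, m in zip(channels, means)
    ]
    return means, variances


def bn_matching_loss(features, running_mean, running_var):
    """Hypothetical loss: squared gap between batch statistics of target
    features and the statistics stored in a source batch-norm layer."""
    means, variances = batch_stats(features)
    mean_term = sum((m - rm) ** 2 for m, rm in zip(means, running_mean))
    var_term = sum((v - rv) ** 2 for v, rv in zip(variances, running_var))
    return mean_term + var_term


# Toy target features: 2 samples, 2 channels (illustrative values only).
target_features = [[1.0, 2.0], [3.0, 6.0]]

# Stored statistics that already match the batch give zero loss.
matched = bn_matching_loss(target_features, [2.0, 4.0], [1.0, 4.0])

# Mismatched stored statistics give a positive loss to minimize.
mismatched = bn_matching_loss(target_features, [0.0, 4.0], [1.0, 1.0])
```

In a real setting the features would come from the target-modality network, the stored statistics would be read from the frozen source model's batch-norm layers, and a term of this kind would be minimized alongside the other training objectives.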
