Unsupervised domain adaptation without source domain training samples: a maximum margin clustering based approach

Proceedings. Indian Conference on Computer Vision, Graphics & Image Processing. Pub Date: 2016-12-18. DOI: 10.1145/3009977.3010033

Sudipan Saha, Biplab Banerjee, S. Merchant


Citations: 7

Abstract

Unsupervised domain adaptation (DA) techniques inherently assume the presence of an ample amount of source domain training samples in addition to the target domain test data. The domains are characterized by domain-specific probability distributions governing the data, which are substantially different from each other. The goal is to build a task-oriented classifier model that performs comparably in both domains. In contrast to the standard unsupervised DA setup, we propose a maximum-margin clustering (MMC) based framework for the same task that does not use labeled source domain samples. Instead, we formulate it as a joint clustering problem over all the samples from both domains in a common feature subspace. The Geodesic Flow Kernel (GFK) based subspace projection technique on the Grassmannian manifold is adopted to cast the samples into a domain-invariant space. The MMC stage then simultaneously groups the data by maximizing margins and learns a classifier for each group. The data overlap problem is handled by specifically learning an SVM-KNN classifier for the potentially unreliable samples in each group. We validate the framework on a pair of remote sensing images of different modalities for land-cover classification and on a generic object dataset for recognition. We observe that the proposed method performs on par with the fully supervised case for both tasks, but without requiring costly annotations.
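The three stages named in the abstract (common subspace, margin-based joint clustering, separate handling of low-margin samples) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and every stage uses a simpler stand-in: pooled PCA with per-domain centering replaces GFK, an alternating linear-SVM relabeling heuristic with a median balance split replaces MMC, and a plain k-NN vote replaces the SVM-KNN stage. All function names, parameters, and the synthetic two-class data are assumptions made for illustration.

```python
import numpy as np

def common_subspace(X_a, X_b, dim):
    """Center each domain separately, then project onto pooled PCA axes.
    Assumption: a simple stand-in for GFK, not the geodesic flow kernel."""
    pooled = np.vstack([X_a - X_a.mean(axis=0), X_b - X_b.mean(axis=0)])
    _, _, vt = np.linalg.svd(pooled, full_matrices=False)
    return pooled @ vt[:dim].T

def fit_linear_svm(Z, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularized hinge loss.
    Labels y must be in {-1, +1}."""
    n = len(y)
    w, b = np.zeros(Z.shape[1]), 0.0
    for _ in range(epochs):
        active = y * (Z @ w + b) < 1           # samples violating the margin
        w -= lr * (lam * w - Z[active].T @ y[active] / n)
        b -= lr * (-y[active].sum() / n)
    return w, b

def mmc_two_clusters(Z, n_iter=10):
    """Alternating heuristic for two-cluster max-margin clustering:
    fit an SVM to the current labels, then relabel by the SVM score.
    Assumption: a median split enforces balanced clusters, which rules
    out the trivial solution of putting every sample in one cluster."""
    y = np.where(Z[:, 0] > np.median(Z[:, 0]), 1.0, -1.0)
    for _ in range(n_iter):
        w, b = fit_linear_svm(Z, y)
        scores = Z @ w + b
        scores = scores - np.median(scores)
        new_y = np.where(scores > 0, 1.0, -1.0)
        converged = np.array_equal(new_y, y)
        y = new_y
        if converged:
            break
    return y, scores

def knn_refine(Z, y, scores, margin=1.0, k=5):
    """Relabel low-margin (potentially unreliable) samples by a k-NN vote
    among the high-margin samples. Use an odd k to avoid ties."""
    reliable = np.abs(scores) >= margin
    y_out = y.copy()
    if reliable.sum() < k:
        return y_out
    Z_rel, y_rel = Z[reliable], y[reliable]
    for i in np.flatnonzero(~reliable):
        dist = np.linalg.norm(Z_rel - Z[i], axis=1)
        nearest = np.argsort(dist)[:k]
        y_out[i] = 1.0 if y_rel[nearest].sum() > 0 else -1.0
    return y_out

# Synthetic data: two classes per domain, domain B shifted by a constant.
rng = np.random.default_rng(0)
n_per_class = 100
centers = np.zeros((2, 5))
centers[0, 0], centers[1, 0] = -3.0, 3.0

def sample_domain(offset):
    X = np.vstack([rng.normal(c, 1.0, size=(n_per_class, 5)) for c in centers])
    return X + offset, np.repeat([-1.0, 1.0], n_per_class)

X_a, y_a = sample_domain(0.0)
X_b, y_b = sample_domain(5.0)
y_true = np.concatenate([y_a, y_b])   # used only for evaluation

Z = common_subspace(X_a, X_b, dim=2)
y_init, scores = mmc_two_clusters(Z)
labels = knn_refine(Z, y_init, scores)

# Cluster labels are arbitrary, so score both label assignments.
accuracy = max(np.mean(labels == y_true), np.mean(labels == -y_true))
print(f"clustering accuracy: {accuracy:.3f}")
```

No labels from either domain enter the clustering; `y_true` is used only to score the result. The median split is a strong form of the class-balance constraint and would need relaxing for imbalanced data.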

