Semi-supervised sparse metric learning using alternating linearization optimization

Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Pub Date: 2010-07-25. DOI: 10.1145/1835804.1835947

Wei Liu, Shiqian Ma, D. Tao, Jianzhuang Liu, Peng Liu


Citations: 61

Abstract

In many scenarios, data can be represented as vectors and then mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining applications need proximity measures over data, a simple and universal distance metric is desirable, and metric learning methods have been explored to produce sensible distance measures consistent with the relationships among the data. However, most existing methods suffer from limited labeled data and expensive training. In this paper, we address these two issues by employing abundant unlabeled data and pursuing sparsity of metrics, resulting in a novel metric learning approach called semi-supervised sparse metric learning. Two important contributions of our approach are: 1) it propagates scarce prior affinities between data to the global scope and incorporates the full affinities into the metric learning; and 2) it uses an efficient alternating linearization method to directly optimize the sparse metric. Compared with conventional methods, ours can effectively take advantage of semi-supervision and automatically discover the sparse metric structure underlying input data patterns. We demonstrate the efficacy of the proposed approach with extensive experiments carried out on six datasets, obtaining clear performance gains over the state of the art.
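To make the notion of a "sparse metric" concrete, here is a minimal sketch. The abstract does not state how the metric is parametrized, so this assumes the Mahalanobis form common in metric learning, d(x, y) = sqrt((x - y)^T M (x - y)) with M symmetric positive semidefinite. The function name `mahalanobis_distance` and the matrix `M_sparse` are hypothetical illustrations, not taken from the paper, and the sketch only evaluates a given metric. It does not implement the paper's affinity propagation or its alternating linearization solver.

```python
import math


def mahalanobis_distance(x, y, M):
    """Return sqrt((x - y)^T M (x - y)) for a symmetric PSD matrix M (list of rows)."""
    d = [a - b for a, b in zip(x, y)]
    # Quadratic form d^T M d. Zero entries of M contribute nothing, so a
    # sparse M effectively selects which features (and feature pairs) matter.
    q = sum(
        d[i] * M[i][j] * d[j]
        for i in range(len(d))
        for j in range(len(d))
        if M[i][j] != 0.0
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(q, 0.0))


# Hypothetical sparse metric over three features: the third feature is ignored.
M_sparse = [
    [2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]

# The identity matrix recovers the ordinary Euclidean distance.
M_identity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

a = [1.0, 0.0, 5.0]
b = [0.0, 0.0, -5.0]

dist_sparse = mahalanobis_distance(a, b, M_sparse)
dist_euclid = mahalanobis_distance(a, b, M_identity)
```

Under these assumptions, the two points differ greatly in the third coordinate, yet the sparse metric places them close together because that coordinate carries zero weight. The Euclidean distance, by contrast, is dominated by it.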

