Semi-supervised sparse metric learning using alternating linearization optimization

Wei Liu, Shiqian Ma, D. Tao, Jianzhuang Liu, Peng Liu
{"title":"Semi-supervised sparse metric learning using alternating linearization optimization","authors":"Wei Liu, Shiqian Ma, D. Tao, Jianzhuang Liu, Peng Liu","doi":"10.1145/1835804.1835947","DOIUrl":null,"url":null,"abstract":"In plenty of scenarios, data can be represented as vectors and then mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining applications need proximity measures over data, a simple and universal distance metric is desirable, and metric learning methods have been explored to produce sensible distance measures consistent with data relationship. However, most existing methods suffer from limited labeled data and expensive training. In this paper, we address these two issues through employing abundant unlabeled data and pursuing sparsity of metrics, resulting in a novel metric learning approach called semi-supervised sparse metric learning. Two important contributions of our approach are: 1) it propagates scarce prior affinities between data to the global scope and incorporates the full affinities into the metric learning; and 2) it uses an efficient alternating linearization method to directly optimize the sparse metric. Compared with conventional methods, ours can effectively take advantage of semi-supervision and automatically discover the sparse metric structure underlying input data patterns. We demonstrate the efficacy of the proposed approach with extensive experiments carried out on six datasets, obtaining clear performance gains over the state-of-the-arts.","PeriodicalId":20529,"journal":{"name":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2010-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"61","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1835804.1835947","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 61

Abstract

In plenty of scenarios, data can be represented as vectors and then mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining applications need proximity measures over data, a simple and universal distance metric is desirable, and metric learning methods have been explored to produce sensible distance measures consistent with data relationship. However, most existing methods suffer from limited labeled data and expensive training. In this paper, we address these two issues through employing abundant unlabeled data and pursuing sparsity of metrics, resulting in a novel metric learning approach called semi-supervised sparse metric learning. Two important contributions of our approach are: 1) it propagates scarce prior affinities between data to the global scope and incorporates the full affinities into the metric learning; and 2) it uses an efficient alternating linearization method to directly optimize the sparse metric. Compared with conventional methods, ours can effectively take advantage of semi-supervision and automatically discover the sparse metric structure underlying input data patterns. We demonstrate the efficacy of the proposed approach with extensive experiments carried out on six datasets, obtaining clear performance gains over the state-of-the-arts.
使用交替线性化优化的半监督稀疏度量学习
在许多情况下,数据可以表示为向量,然后在数学上抽象为欧几里得空间中的点。由于大量机器学习和数据挖掘应用需要对数据进行接近度量,因此需要一种简单而通用的距离度量,度量学习方法已被探索用于产生与数据关系一致的合理距离度量。然而,大多数现有方法都存在标记数据有限和训练费用昂贵的问题。在本文中,我们通过使用大量的未标记数据和追求度量的稀疏性来解决这两个问题,从而产生了一种新的度量学习方法,称为半监督稀疏度量学习。该方法的两个重要贡献是:1)它将数据之间的稀缺先验亲和力传播到全局范围,并将完整的亲和力纳入度量学习;2)采用一种高效的交替线性化方法直接优化稀疏度量。与传统方法相比,我们的方法可以有效地利用半监督的优势,自动发现输入数据模式下的稀疏度量结构。我们在六个数据集上进行了广泛的实验,证明了所提出方法的有效性,在最先进的情况下获得了明显的性能提升。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信