Unsupervised User Identity Linkage via Factoid Embedding

Wei Xie, Xin Mu, R. Lee, Feida Zhu, Ee-Peng Lim
Published in: 2018 IEEE International Conference on Data Mining (ICDM)
Publication date: 2018-11-01
DOI: 10.1109/ICDM.2018.00182 (https://doi.org/10.1109/ICDM.2018.00182)
Citations: 29

Abstract

User identity linkage (UIL), the problem of matching user accounts across multiple online social networks (OSNs), is widely studied and important to many real-world applications. Most existing UIL solutions adopt a supervised or semi-supervised approach, which generally suffers from a scarcity of labeled data. In this paper, we propose Factoid Embedding, a novel framework that adopts an unsupervised approach. It is designed to cope with the different profile attributes, content types and network links of different OSNs. The key idea is that each piece of information about a user identity describes the real identity owner, and thus distinguishes the owner from other users. We represent such a piece of information by a factoid and model it as a triplet consisting of a user identity, a predicate, and either an object or another user identity. By embedding these factoids, we learn latent representations of the user identities and link two user identities from different OSNs if they are close to each other in the user embedding space. Our Factoid Embedding algorithm is designed such that, as we learn the embedding space, each embedded factoid is "translated" into a motion in the user embedding space that brings similar user identities closer together and pushes different user identities further apart. Extensive experiments are conducted to evaluate Factoid Embedding on two real-world OSN data sets. The experimental results show that Factoid Embedding outperforms the state-of-the-art methods even without training data.
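To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch and not the authors' implementation. It shows a factoid modeled as a (user identity, predicate, object) triplet, and the final linking step, in which each user identity from one OSN is matched to the closest user identity from the other OSN in the embedding space. Everything named here is an assumption made for illustration: the `Factoid` class, the `link_identities` function, the account names, the use of cosine similarity as the closeness measure, and the hand-written two-dimensional vectors, which stand in for embeddings that the real method would learn from the factoids.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List


@dataclass(frozen=True)
class Factoid:
    """One piece of information about a user identity, as a triplet (hypothetical type)."""
    subject: str    # a user identity, e.g. "osn_a:alice01"
    predicate: str  # e.g. "has_name" or "follows"
    obj: str        # an attribute value or another user identity


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity; an assumed closeness measure, not taken from the paper."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def link_identities(source: Dict[str, List[float]],
                    target: Dict[str, List[float]]) -> Dict[str, str]:
    """Map each source identity to the most similar target identity."""
    return {
        s_id: max(target, key=lambda t_id: cosine(s_vec, target[t_id]))
        for s_id, s_vec in source.items()
    }


# Toy factoids: attribute factoids and link factoids for two made-up OSNs.
factoids = [
    Factoid("osn_a:alice01", "has_name", "Alice Tan"),
    Factoid("osn_a:alice01", "follows", "osn_a:bob_k"),
    Factoid("osn_b:a.tan", "has_name", "Alice Tan"),
    Factoid("osn_b:a.tan", "follows", "osn_b:bobk"),
]

# Hand-written stand-ins for learned user embeddings (NOT learned here).
osn_a = {"osn_a:alice01": [0.9, 0.1], "osn_a:bob_k": [0.1, 0.8]}
osn_b = {"osn_b:a.tan": [0.8, 0.2], "osn_b:bobk": [0.2, 0.9]}

links = link_identities(osn_a, osn_b)
for source_id, target_id in links.items():
    print(source_id, "->", target_id)
```

The sketch deliberately omits the learning step that turns factoids into embeddings, since the abstract does not specify the objective. It only illustrates why no labeled pairs are needed at linking time: once the embeddings exist, matching reduces to a nearest-neighbor search across the two networks.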