Link Prediction with Cardinality Constraint

Jiawei Zhang, Jianhui Chen, Junxing Zhu, Yi Chang, Philip S. Yu
Published in: Proceedings of the Tenth ACM International Conference on Web Search and Data Mining, 2017-02-02
DOI: 10.1145/3018661.3018734 (https://doi.org/10.1145/3018661.3018734)
Citations: 19

Abstract

Inferring the links among entities in networks is an important research problem across many disciplines. Depending on the application setting, the links to be inferred are usually subject to cardinality constraints such as one-to-one, one-to-many, and many-to-many. However, most existing work on link prediction does not take such constraints into account. In this paper, we study the link prediction problem with general cardinality constraints, which we formally define as the CLP (Cardinality Constrained Link Prediction) problem. By minimizing the projection loss of links from feature vectors to labels, the CLP problem is formulated as an optimization problem over multiple variables, in which the cardinality constraints are modeled as mathematical constraints on node degrees. The objective function is shown to be not jointly convex, and finding the optimal solution subject to the cardinality constraints can be very time-consuming. To solve the optimization problem, we introduce ITERCLIPS (Iterative Constrained Link Prediction & Selection), an iterative variable-updating link prediction framework that alternates between link updating and link selection steps. To reduce the time cost, the selection step is greedy: it picks links greedily while preserving the cardinality constraints. To keep ITERCLIPS effective on large-scale networks, a distributed implementation is also presented as a scalable solution to the CLP problem. Extensive experiments on three real-world network datasets with different types of cardinality constraints demonstrate the effectiveness and advantages of ITERCLIPS in solving the CLP problem.
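
The abstract does not spell out how the greedy selection step works, so the following is only a minimal sketch of one way such a step could look, not the authors' ITERCLIPS implementation. It assumes each candidate link already has a predicted score, and that the cardinality constraint is expressed as caps on node degrees. The names `greedy_select`, `max_out`, `max_in`, and `threshold` are hypothetical and introduced purely for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Link = Tuple[Hashable, Hashable]


def greedy_select(
    scores: Dict[Link, float],
    max_out: int,
    max_in: int,
    threshold: float = 0.5,
) -> List[Link]:
    """Pick links in descending score order while respecting degree caps.

    max_out caps how many selected links may leave a source node and
    max_in caps how many may enter a target node (both are assumptions
    standing in for the paper's cardinality constraints).
    """
    out_deg = defaultdict(int)  # selected links per source node
    in_deg = defaultdict(int)   # selected links per target node
    selected: List[Link] = []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (u, v), score in ranked:
        if score < threshold:
            break  # every remaining candidate scores lower still
        if out_deg[u] < max_out and in_deg[v] < max_in:
            selected.append((u, v))
            out_deg[u] += 1
            in_deg[v] += 1
    return selected


# Toy candidate scores (made up for the example).
scores = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.8,
    ("a2", "b1"): 0.7,
    ("a2", "b2"): 0.6,
    ("a3", "b2"): 0.2,
}
one_to_one = greedy_select(scores, max_out=1, max_in=1)
one_to_many = greedy_select(scores, max_out=2, max_in=1)
print(one_to_one)   # [('a1', 'b1'), ('a2', 'b2')]
print(one_to_many)  # [('a1', 'b1'), ('a1', 'b2')]
```

Setting both caps to 1 gives a one-to-one selection, while raising only `max_out` gives a one-to-many selection. In an iterative framework of the kind the abstract describes, a step like this would alternate with a step that re-estimates the link scores.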