NGLinker: Link prediction for node featureless networks

IF 4.2, Zone 3 (Computer Science), Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Big Data Research, Pub Date: 2025-08-18, DOI: 10.1016/j.bdr.2025.100558

Yong Li, Jingpeng Wu, Zhongying Zhang

Volume 41, Article 100558. Full text: https://www.sciencedirect.com/science/article/pii/S221457962500053X

Citations: 0

Abstract



Link prediction is a paradigmatic problem in network science with many real-world applications; it aims to infer missing or future links from the currently observed subset of nodes and links. However, conventional link prediction models are based on network structure; they have relatively low prediction accuracy and lack universality and scalability. The performance of link prediction based on machine learning with hand-crafted features is strongly influenced by subjective judgment. Although graph embedding learning (GEL) models can avoid these shortcomings, they still pose some challenges. Because GEL models are generally based on random walks and graph neural networks (GNNs), their prediction accuracy is relatively low, making them unsuitable for revealing hidden information in node featureless networks. To address these challenges, we present NGLinker, a new link prediction model based on Node2vec and GraphSage that balances performance and accuracy in a node featureless network. Rather than learning node features with label information, NGLinker depends only on the local network structure. Quantitatively, we observe superior prediction accuracy of NGLinker and lab test imputations compared to state-of-the-art models, which strongly supports the feasibility and effectiveness of using NGLinker for prediction on three public networks and one private network. NGLinker not only achieves this prediction accuracy in terms of precision and area under the receiver operating characteristic curve (AUC) but also offers strong universality and scalability. The NGLinker model extends the application of GNNs to node featureless networks.
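The abstract states that NGLinker builds on Node2vec and depends only on local network structure. As a rough illustration of how structure alone can supply a training signal, the sketch below generates Node2vec-style biased random walks on a small toy graph. It is not the authors' implementation: the toy edge list, the function name `node2vec_walk`, and the values chosen for `p`, `q`, and the walk length are assumptions made for illustration, and both the embedding training and the GraphSage stage are omitted.

```python
import random


def node2vec_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased walk in the style of Node2vec.

    adj maps each node to the set of its neighbours (unweighted, undirected).
    p controls the tendency to return to the previous node,
    q controls the tendency to move further away from it.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducibility with a seeded rng
        if not nbrs:
            break  # isolated node: the walk cannot continue
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)      # stay within distance 1 of prev
            else:
                weights.append(1.0 / q)  # move outward to distance 2
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


# Hypothetical toy network: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

rng = random.Random(0)  # fixed seed so repeated runs give the same walks
walks = [
    node2vec_walk(adj, node, length=8, p=1.0, q=0.5, rng=rng)
    for node in sorted(adj)
    for _ in range(3)  # three walks per start node
]

for w in walks[:3]:
    print(w)
```

Each walk is a sequence of node IDs that a skip-gram model could consume in place of node features. Lowering `q` biases walks outward, while lowering `p` biases them toward backtracking.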


Source journal

Big Data Research (Computer Science: Computer Science Applications)

CiteScore

8.40

Self-citation rate

3.00%

Articles published

Journal description: The journal aims to promote and communicate advances in big data research by providing a fast, high-quality forum for researchers, practitioners and policy makers from the many different communities working on, and with, this topic. The journal will accept papers on foundational aspects of dealing with big data, as well as papers on specific platforms and technologies used to deal with big data. To promote data science and interdisciplinary collaboration between fields, and to showcase the benefits of data-driven research, papers demonstrating applications of big data in domains as diverse as geoscience, social web, finance, e-commerce, health care, environment and climate, physics and astronomy, chemistry, life sciences and drug discovery, digital libraries and scientific publications, and security and government will also be considered. Occasionally the journal may publish whitepapers on policies, standards and best practices.