IEEE Transactions on Knowledge and Data Engineering: Latest Articles

Efficient Algorithms for Minimizing the Kirchhoff Index via Adding Edges
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3342-3355. Pub Date: 2025-03-18. DOI: 10.1109/TKDE.2025.3552644
Xiaotian Zhou; Ahad N. Zehmakan; Zhongzhi Zhang
Abstract: The Kirchhoff index, which is the sum of the resistance distances between every pair of nodes in a network, is a key metric for gauging network performance, where lower values signify better performance. In this paper, we study the problem of minimizing the Kirchhoff index by adding edges. We first provide a greedy algorithm for solving this problem and analyze its quality based on bounds on the submodularity ratio and the curvature. Then, we introduce a gradient-based greedy algorithm as a new paradigm for solving this problem. To reduce the computation cost, we leverage geometric properties, convex hull approximation, and approximation of the projected coordinate of each point. To further improve this algorithm, we use pre-pruning and fast update techniques, making it particularly suitable for large networks. Our proposed algorithms have nearly-linear time complexity. We provide extensive experiments on ten real networks to evaluate the quality of our algorithms. The results demonstrate that our proposed algorithms outperform the state-of-the-art methods in terms of efficiency and effectiveness. Moreover, our algorithms scale to large graphs with over 5 million nodes and 12 million edges.
Citations: 0
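The definition in the abstract above (sum of resistance distances over all node pairs) can be made concrete with a small sketch. It is not the authors' nearly-linear algorithm: it is a naive greedy baseline that recomputes the index exactly for every candidate edge, which is only feasible on tiny graphs. It relies on the standard identity Kf = n · trace(L⁺) for a connected graph with Laplacian L, and on (L + J/n)⁻¹ = L⁺ + J/n, where J is the all-ones matrix. The function names `trace_of_inverse`, `kirchhoff_index`, and `greedy_add_edges` are illustrative, not taken from the paper.

```python
from fractions import Fraction
from itertools import combinations


def trace_of_inverse(matrix):
    """Trace of the inverse of a square matrix, by exact Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular (disconnected graph).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return sum(aug[i][n + i] for i in range(n))


def kirchhoff_index(n, edges):
    """Kirchhoff index of a connected, unweighted graph on nodes 0..n-1."""
    # Build M = L + J/n; then trace(L^+) = trace(M^-1) - 1.
    m = [[Fraction(1, n) for _ in range(n)] for _ in range(n)]
    for u, v in edges:
        m[u][u] += 1
        m[v][v] += 1
        m[u][v] -= 1
        m[v][u] -= 1
    return n * (trace_of_inverse(m) - 1)


def greedy_add_edges(n, edges, k):
    """Add up to k edges, each time picking the one that lowers the index most."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for _ in range(k):
        candidates = [e for e in combinations(range(n), 2)
                      if frozenset(e) not in present]
        if not candidates:
            break
        best = min(candidates, key=lambda e: kirchhoff_index(n, edges + [e]))
        edges.append(best)
        present.add(frozenset(best))
    return edges


path = [(0, 1), (1, 2), (2, 3)]
print(kirchhoff_index(4, path))      # 10
print(greedy_add_edges(4, path, 1))  # closes the path into a 4-cycle
```

On the 4-node path, the greedy step adds the edge (0, 3), turning the path into a cycle and halving the index from 10 to 5. Each greedy step here costs O(n²) candidate edges times an O(n³) elimination, which is precisely the cost the paper's approximation techniques aim to avoid.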
LOFTune: A Low-Overhead and Flexible Approach for Spark SQL Configuration Tuning
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3528-3542. Pub Date: 2025-03-18. DOI: 10.1109/TKDE.2025.3549232
Jiahui Li; Junhao Ye; Yuren Mao; Yunjun Gao; Lu Chen
Abstract: The query efficiency of Spark SQL is significantly impacted by its configurations. Therefore, configuration tuning has drawn great attention, and various automatic configuration tuning methods have been proposed. However, existing methods suffer from two issues: (1) high tuning overhead: they need to execute the workloads repeatedly to obtain training samples, which is time-consuming; and (2) low throughput: they occupy resources such as CPU cores and memory for a long time, causing other Spark SQL workloads to wait and thereby reducing the overall system throughput. These issues impede the use of automatic configuration tuning methods in practical systems, which have a limited tuning budget and many concurrent workloads. To address these issues, this paper proposes a Low-Overhead and Flexible approach for Spark SQL configuration Tuning, dubbed LOFTune. LOFTune reduces the tuning overhead via a sample-efficient optimization framework based on multi-task SQL representation learning and a multi-armed bandit. Furthermore, LOFTune solves the low throughput issue with a recommendation-sampling-decoupled tuning framework. Extensive experiments validate the effectiveness of LOFTune. In the sampling-allowed case, LOFTune can save up to 90% of the workload runs compared with the state-of-the-art methods. In the zero-sampling case, LOFTune can reduce latency by up to 41.26%.
Citations: 0
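The abstract above names a multi-armed bandit as one ingredient of sample-efficient tuning but gives no details. As a rough illustration of the general idea only, the sketch below uses the textbook UCB1 rule to spend a fixed budget of workload runs across a handful of candidate configurations. Everything specific here is an assumption: the candidate list, the `run_workload` callback, and the reward scale (higher is better, for example a normalized negative latency) are hypothetical and not taken from LOFTune.

```python
import math


def ucb1_select(counts, means, t):
    """Pick the arm to try at round t (1-based) using the UCB1 rule."""
    for arm, count in enumerate(counts):
        if count == 0:
            return arm  # try every arm once before trusting the estimates
    return max(
        range(len(counts)),
        key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
    )


def tune(configs, run_workload, budget):
    """Spend `budget` workload runs and return (best config, runs per config)."""
    counts = [0] * len(configs)
    means = [0.0] * len(configs)
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, means, t)
        reward = run_workload(configs[arm])  # hypothetical callback, reward in [0, 1]
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        means[arm] += (reward - means[arm]) / counts[arm]
    best = max(range(len(configs)), key=lambda a: means[a])
    return configs[best], counts


# Toy stand-in for real workload runs: fixed, noise-free rewards per config.
toy_rewards = {"small-shuffle": 0.2, "tuned": 0.9, "default": 0.5}
best, counts = tune(list(toy_rewards), toy_rewards.get, budget=50)
print(best, counts)
```

The point of the bandit is that most of the budget goes to the promising configuration while weaker ones are tried only often enough to rule them out, which is one way to cut the number of workload executions.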
Zkfhed: A Verifiable and Scalable Blockchain-Enhanced Federated Learning System
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3841-3854. Pub Date: 2025-03-17. DOI: 10.1109/TKDE.2025.3550546
Bingxue Zhang; Guangguang Lu; Yuncheng Wu; Kunpeng Ren; Feida Zhu
Abstract: Federated learning (FL) is an emerging paradigm that enables multiple clients to collaboratively train a machine learning (ML) model without exchanging their raw data. However, it relies on a centralized authority to coordinate participants' activities. This not only interrupts the entire training task in case of a single point of failure, but also lacks an effective regulatory mechanism to prevent malicious behavior. Although blockchain, with its decentralized architecture and data immutability, has significantly advanced the development of FL, it still struggles to withstand poisoning attacks and faces limitations in computational scalability. We propose Zkfhed, a verifiable and scalable FL system that overcomes the limitations of blockchain-based FL with respect to poisoning attacks and computational scalability. First, we propose a two-stage audit scheme based on zero-knowledge proofs (ZKPs), which verifies that the training data are extracted from trusted organizations and that computations on the data exactly follow the specified training protocols. Second, we propose a homomorphic encryption delegation learning (HEDL) scheme based on fully homomorphic encryption (FHE). It can outsource complex computation to external computing resources without sacrificing the client's data privacy. Finally, extensive experiments on real-world datasets demonstrate that Zkfhed can effectively identify malicious clients and is highly efficient and scalable in terms of online time and communication efficiency.
Citations: 0
Multiscale Weisfeiler-Leman Directed Graph Neural Networks for Prerequisite-Link Prediction
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3556-3569. Pub Date: 2025-03-17. DOI: 10.1109/TKDE.2025.3552045
Yupei Zhang; Xiran Qu; Shuhui Liu; Yan Pang; Xuequn Shang
Abstract: Prerequisite-link prediction (PLP) aims to discover the condition relations of a specific event or a variable of interest, which is a fundamental problem in many fields, such as educational data mining. Current studies on PLP usually develop graph neural networks (GNNs) to learn the representations of pairs of nodes. However, these models fail to distinguish non-isomorphic graphs and to integrate multiscale structures, which limits the expressive capability of GNNs. To this end, we propose k-dimensional Weisfeiler-Leman directed GNNs, dubbed k-WediGNNs, to recognize non-isomorphic graphs via the Weisfeiler-Leman algorithm. Furthermore, we integrate the multiscale structures of a directed graph into k-WediGNNs, dubbed multiscale k-WediGNNs, from the bidirected views of in-degree and out-degree. With a Siamese network, the proposed models are extended to address the problem of PLP. The expressive power is then interpreted via theoretical proofs. The experiments were conducted on four publicly available datasets for concept prerequisite relation prediction (CPRP). The results show that the proposed models achieve better performance than the state-of-the-art approaches, and our multiscale k-WediGNN sets a new benchmark in the task of CPRP.
Citations: 0
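The Weisfeiler-Leman algorithm mentioned in the abstract above can be sketched in its simplest, 1-dimensional form (colour refinement). The version below is an assumption-laden illustration, not the paper's k-WediGNN: it adapts colour refinement to directed graphs by keeping in-neighbour and out-neighbour colours separate, loosely mirroring the in-degree and out-degree views the abstract mentions. The function name `directed_wl_histogram` is hypothetical. Colours are kept as nested tuples rather than compressed to integers so that histograms from different graphs can be compared directly.

```python
from collections import Counter


def directed_wl_histogram(n, edges, rounds=3):
    """Colour histogram after `rounds` of directed 1-WL refinement on nodes 0..n-1."""
    in_nb = [[] for _ in range(n)]
    out_nb = [[] for _ in range(n)]
    for u, v in edges:  # each edge points from u to v
        out_nb[u].append(v)
        in_nb[v].append(u)
    colors = [0] * n  # every node starts with the same colour
    for _ in range(rounds):
        # New colour = own colour plus the sorted multisets of
        # in-neighbour colours and out-neighbour colours.
        colors = [
            (
                colors[v],
                tuple(sorted(colors[u] for u in in_nb[v])),
                tuple(sorted(colors[w] for w in out_nb[v])),
            )
            for v in range(n)
        ]
    return Counter(colors)


chain = [(0, 1), (1, 2)]          # 0 -> 1 -> 2
collider = [(0, 1), (2, 1)]       # 0 -> 1 <- 2
print(directed_wl_histogram(3, chain) == directed_wl_histogram(3, collider))  # False

cycle6 = [(i, (i + 1) % 6) for i in range(6)]
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]
print(directed_wl_histogram(6, cycle6) == directed_wl_histogram(6, two_triangles))  # True
```

Different histograms prove two graphs are non-isomorphic; equal histograms prove nothing. The second comparison shows the known blind spot of 1-WL: a directed 6-cycle and two directed triangles get identical histograms because every node has exactly one in-neighbour and one out-neighbour. Limits of this kind are the usual motivation for higher-dimensional (k-WL) variants.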
Final: Combining First-Order Logic With Natural Logic for Question Answering
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3103-3117. Pub Date: 2025-03-14. DOI: 10.1109/TKDE.2025.3551231
Jihao Shi; Xiao Ding; Siu Cheung Hui; Yuxiong Yan; Hengwei Zhao; Ting Liu; Bing Qin
Abstract: Many question-answering problems can be approached as textual entailment tasks, where the hypotheses are formed by the question and candidate answers, and the premises are derived from an external knowledge base. However, current neural methods often lack transparency in their decision-making processes. Moreover, first-order logic methods, while systematic, struggle to integrate unstructured external knowledge. To address these limitations, we propose a neuro-symbolic reasoning framework called Final, which combines FIrst-order logic with NAtural Logic for question answering. Our framework uses first-order logic to systematically decompose hypotheses and natural logic to construct reasoning paths from premises to hypotheses, employing bidirectional reasoning to establish links along the reasoning path. This approach not only enhances interpretability but also effectively integrates unstructured knowledge. Our experiments on three benchmark datasets, namely QASC, WorldTree, and WikiHop, demonstrate that Final outperforms existing methods in commonsense reasoning and reading comprehension tasks, achieving state-of-the-art results. Additionally, our framework provides transparent reasoning paths that elucidate the rationale behind the correct decisions.
Citations: 0
A Survey on Point-of-Interest Recommendation: Models, Architectures, and Security
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3153-3172. Pub Date: 2025-03-14. DOI: 10.1109/TKDE.2025.3551292
Qianru Zhang; Peng Yang; Junliang Yu; Haixin Wang; Xingwei He; Siu-Ming Yiu; Hongzhi Yin
Abstract: The widespread adoption of smartphones and Location-Based Social Networks has led to a massive influx of spatio-temporal data, creating unparalleled opportunities for enhancing Point-of-Interest (POI) recommendation systems. These advanced POI systems are crucial for enriching user experiences, enabling personalized interactions, and optimizing decision-making processes in the digital landscape. However, existing surveys tend to focus on traditional approaches, and few of them delve into cutting-edge developments, emerging architectures, or security considerations in POI recommendation. To address this gap, our survey offers a comprehensive, up-to-date review of POI recommendation systems, covering advancements in models, architectures, and security aspects. We systematically examine the transition from traditional models to advanced techniques such as large language models. Additionally, we explore the architectural evolution from centralized to decentralized and federated learning systems, highlighting the improvements in scalability and privacy. Furthermore, we address the increasing importance of security, examining potential vulnerabilities and privacy-preserving approaches. Our taxonomy provides a structured overview of the current state of POI recommendation, and we also identify promising directions for future research in this rapidly advancing field.
Citations: 0
Dual Test-Time Training for Out-of-Distribution Recommender System
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3312-3326. Pub Date: 2025-03-14. DOI: 10.1109/TKDE.2025.3548160
Xihong Yang; Yiqi Wang; Jin Chen; Wenqi Fan; Xiangyu Zhao; En Zhu; Xinwang Liu; Defu Lian
Abstract: Deep learning has been widely applied in recommender systems and has recently achieved revolutionary progress. However, most existing learning-based methods assume that the user and item distributions remain unchanged between the training phase and the test phase. In real-world scenarios, the distribution of user and item features can naturally shift, potentially resulting in a substantial decrease in recommendation performance. This phenomenon can be formulated as an Out-Of-Distribution (OOD) recommendation problem. To address this challenge, we propose a novel Dual Test-Time-Training framework for OOD Recommendation, termed DT3OR. In DT3OR, we incorporate a model adaptation mechanism during the test-time phase to carefully update the recommendation model, allowing the model to adapt specifically to the shifting user and item features. To be specific, we propose a self-distillation task and a contrastive task to help the model learn both the user's invariant interest preferences and the variant user/item characteristics during the test-time phase, thus facilitating a smooth adaptation to the shifting features. Furthermore, we provide theoretical analysis to support the rationale behind our dual test-time training framework. To the best of our knowledge, this paper is the first work to address OOD recommendation via a test-time-training strategy. We conduct experiments on five datasets with various backbones. Comprehensive experimental results demonstrate the effectiveness of DT3OR compared to other state-of-the-art baselines.
Citations: 0
Pricing for Data Assets Based on Data Quality, Quantity and Utility on the Perspective of Consumer Heterogeneity
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3641-3652. Pub Date: 2025-03-14. DOI: 10.1109/TKDE.2025.3551401
Juanjuan Lin; Zhigang Huang; Yong Tang
Abstract: Transforming data into data assets and enabling their trading and circulation is an inevitable trend in the development of the global digital economy. To release the value of data and support its trading process, we propose the concept of an integrated score of data, which combines an integrated quality index covering four dimensions with data quantity. On this basis, data assets are priced according to the principle of profit maximization by constructing a nonlinear programming model. Three types of pricing models are distinguished according to the heterogeneity of consumers' utility sensitivity, and consumers' willingness to pay is adjusted based on business parameters using the FAHP system. The proposed model is verified using data on China's carbon emissions as the original data, combined with the KNN machine learning algorithm and a series of simulation analyses. In addition, multiple sets of heterogeneous data are tested. The results show that the quality, quantity, and utility of data have an important impact on the pricing of data assets, and that it is necessary to segment consumers by utility sensitivity and to take business parameters into consideration. The proposed model can also provide a decision-making reference for data platforms.
Citations: 0
DRLPG: Reinforced Opponent-Aware Order Pricing for Hub Mobility Services
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3298-3311. Pub Date: 2025-03-13. DOI: 10.1109/TKDE.2025.3551147
Zuohan Wu; Chen Jason Zhang; Han Yin; Rui Meng; Libin Zheng; Huaijie Zhu; Wei Liu
Abstract: A modern service model known as the "hub-oriented" model has emerged with the development of mobility services. This model allows users to request vehicles from multiple companies (agents) simultaneously through a unified entry (a "hub"). In contrast to conventional services, the "hub-oriented" model emphasizes pricing competition. In this scenario, an agent should consider its competitors when developing its pricing strategy. In this paper, we introduce DRLPG, a mixed opponent-aware pricing method, which consists of two main components, a two-stage guarantor and an end-to-end deep reinforcement learning (DRL) module, together with interaction mechanisms. In the guarantor, we design a prediction-decision framework. Specifically, we propose a new objective function for the spatiotemporal neural network in the prediction stage and use a traditional reinforcement learning method in the decision stage. In the end-to-end DRL framework, we explore the adoption of conventional DRL in the "hub-oriented" scenario. Finally, a meta-decider and an experience-sharing mechanism are proposed to combine both methods and leverage their advantages. We conduct extensive experiments on real data, and DRLPG achieves an average improvement of 99.9% and 61.1% in the peak and low-peak periods, respectively. Our results demonstrate the effectiveness of our approach compared to the baseline.
Citations: 0
Hard or False: Keep the Balance for Negative Sampling in Knowledge Graphs
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering, Vol. 37, No. 6, pp. 3445-3456. Pub Date: 2025-03-12. DOI: 10.1109/TKDE.2025.3550545
Feihu Che; Jianhua Tao; Qionghai Dai
Abstract: Negative sampling is an essential part of knowledge graph embedding, which offers significant advantages to numerous downstream tasks. There are two important kinds of negatives: hard negatives and false negatives. Hard negatives are negatives that are difficult to distinguish from positive samples, while false negatives are positive samples that are mistakenly identified as negatives. Harnessing hard negatives effectively can make the model more discriminative, and reducing false negatives can avoid misleading the model during training. Therefore, both kinds of negatives matter for high-quality negative sampling. However, current negative sampling methods have two shortcomings: (1) judging whether a negative is hard or false relies mainly on score functions; and (2) it is difficult to balance the impact of hard and false negatives. In this paper, we draw on the bigram language model to propose a novel criterion that helps verify whether negatives are hard or false, and we discuss how to keep the balance between hard and false negatives. Experiments on four representative score functions and two public datasets demonstrate the effectiveness of the proposed negative sampling method.
Citations: 0
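To make the hard/false distinction in the abstract above concrete, here is a minimal sketch of the score-based baseline that the abstract describes as a shortcoming. It is not the paper's bigram-based criterion. It corrupts the tail of a positive triple, filters out known facts, and ranks the rest by score. The cut-off rule (treat any candidate scoring at least as high as the positive as a suspected false negative) is an assumption made for illustration, as are the function name `hard_negatives` and the toy scores.

```python
def hard_negatives(positive, entities, known_triples, score, k=1):
    """Return up to k tail-corrupted negatives, hardest (highest score) first."""
    head, relation, _ = positive
    threshold = score(positive)
    candidates = []
    for tail in entities:
        triple = (head, relation, tail)
        if triple in known_triples:
            continue  # a known fact used as a negative would be a false negative
        s = score(triple)
        if s >= threshold:
            # Assumption for this sketch: a corrupted triple that scores at
            # least as high as the positive is treated as a likely false negative.
            continue
        candidates.append((s, triple))
    # Hard negatives are the remaining candidates the model finds most plausible.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [triple for _, triple in candidates[:k]]


# Toy plausibility scores standing in for a trained embedding model.
toy_scores = {"paris": 0.9, "lyon": 0.95, "berlin": 0.6, "tokyo": 0.1}
positive = ("france", "capital", "paris")
negatives = hard_negatives(
    positive,
    entities=list(toy_scores),
    known_triples={positive},
    score=lambda t: toy_scores[t[2]],
    k=2,
)
print(negatives)
```

The sketch shows why relying on scores alone is fragile: the same high score that marks a candidate as usefully hard also marks it as possibly true, so any fixed threshold trades one error for the other.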