Knowledge-Based Systems: Latest Articles

Synthesizing global and local perspectives in contrastive learning for graph anomaly detection
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113289. Pub Date: 2025-03-10. DOI: 10.1016/j.knosys.2025.113289
Qiqi Yang, Hang Yu, Zhengyang Liu, Pengbo Li, Xue Chen, Xiangfeng Luo
Abstract: Graph data has shown explosive growth, with application scenarios covering social networks, e-commerce networks, financial transaction networks, etc. In this context, graph anomaly detection, which aims to prevent various malicious activities, is particularly important. Existing approaches, however, are still limited in that they either ignore global information and focus only on aggregating neighbor information of the target node, or they utilize global context as a supervisory signal while ignoring local information. In certain scenarios, anomalies can only be detected in a single view (global or local). Furthermore, the issue of class imbalance in graph-based anomaly detection is exacerbated by the significant disparity between the number of benign user samples and anomalous samples in real-world scenarios. As a solution to the above challenges, we present a framework for synthesizing Global and Local perspectives in Contrastive Learning (GALCL). GALCL leverages multi-view contrast to integrate both global and local information. By using node-graph and node-subgraph cross-scale contrasts, the framework enhances the prominence of local and global information, thereby capturing anomaly information that might be missed by focusing solely on the global or local level. In addition, a class-wise loss function is adopted to alleviate class imbalances on the graph. Comprehensive experiments conducted on eight real-world datasets demonstrate that our method outperforms the current state-of-the-art methods.
Citations: 0
Bidirectional alignment text-embeddings with decoupled contrastive for sequential recommendation
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113290. Pub Date: 2025-03-10. DOI: 10.1016/j.knosys.2025.113290
Piao Tong, Qiao Liu, Zhipeng Zhang, Yuke Wang, Tian Lan
Abstract: The key challenge in sequential recommendation is to accurately predict the next item based on historical interaction sequences by learning effective sequence representations. Existing models typically optimize sequence representations using the next ground-truth item as the supervised signal. However, this approach often results in biased interest representations and neglects the benefits of bidirectional supervision, leading to incomplete sequence representations and semantic mismatches. To address these limitations, we propose ADRec for bidirectional sequence–item Alignment text-embeddings with Decoupled contrastive learning for sequential Recommendation based only on text data. ADRec combines self-supervised and supervised signals derived from intrinsic correlations in recommendation data to enhance semantic consistency between sequence and ground-truth item representations, improving recommendation performance. Specifically, we introduce a hybrid learning mechanism that integrates an unsupervised contrastive learning paradigm to decouple sequence and item representations and supervised contrastive learning to achieve bidirectional semantic alignment. Additionally, a dual-momentum queue mechanism is devised to expand the diversity of negative samples with limited resources, optimizing the quality of user interest representations in the text modality. Extensive experiments on six public datasets show that ADRec consistently outperforms state-of-the-art methods by learning superior sequence representations. The code is publicly available at https://github.com/pppiao/ADRec.
Citations: 0
Twin Q-learning-driven forest ecosystem optimization for feature selection
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113323. Pub Date: 2025-03-09. DOI: 10.1016/j.knosys.2025.113323
Hongbo Zhang, Jinlong Li, Xiaofeng Yue, Xueliang Gao, Haohuan Nan
Abstract: Feature selection (FS) enhances the performance of the classification model by selecting relevant features and discarding unnecessary ones. Due to the efficiency of metaheuristic algorithms in solving FS problems, they have drawn much attention. However, the previous metaheuristic-based FS methods have drawbacks, such as easily falling into local optima and limited utilization of FS characteristics. To address these problems, we propose a novel twin Q-learning-driven forest ecosystem optimization named TQFEO for FS problems. First, an ordinal number initialization strategy is developed to guarantee the quality of the initial individuals. Second, a twin Q-learning-driven forest ecosystem is constructed to ensure the algorithm's adaptive capability. Furthermore, a fitness-variance-evaluation-based status detection strategy is proposed to perceive optimization status. If an abnormality is detected, low-quality individuals are processed. Finally, a Manhattan-distance-guided position update and elite random walk strategy is designed to maintain population diversity and accelerate the convergence rate. Experimental results on 20 benchmark datasets across various domains demonstrate that TQFEO outperforms conventional and recent metaheuristic algorithms.
Citations: 0
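The TQFEO abstract mentions a fitness-variance-based status check that triggers processing of low-quality individuals. The sketch below is only one plausible reading of that idea, not the authors' method: the function names, the variance threshold, the fraction of individuals flagged, and the assumption that fitness is minimized are all hypothetical.

```python
import statistics
from typing import List, Sequence


def detect_stagnation(fitness: Sequence[float], threshold: float = 1e-6) -> bool:
    """Flag the population as stagnant when its fitness variance is tiny.

    `threshold` is a hypothetical tuning constant, not a value from the paper.
    """
    return statistics.pvariance(fitness) < threshold


def worst_indices(fitness: Sequence[float], fraction: float = 0.2) -> List[int]:
    """Return indices of the worst individuals, assuming lower fitness is better.

    `fraction` (share of the population to re-process) is also an assumption.
    """
    count = max(1, int(len(fitness) * fraction))
    # Sort indices from worst (largest fitness) to best.
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    return order[:count]


if __name__ == "__main__":
    population_fitness = [0.31, 0.31, 0.31, 0.31]
    if detect_stagnation(population_fitness):
        print("stagnation detected; re-process:", worst_indices(population_fitness))
```

In an optimizer loop, the flagged indices would typically be re-initialized or perturbed; the abstract does not say which, so that step is left out here.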
TCrossE: Cross-space interaction of bicomplex and quaternion embeddings for temporal knowledge graph completion
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113321. Pub Date: 2025-03-09. DOI: 10.1016/j.knosys.2025.113321
Thanh Vu, Thanh Le
Abstract: Completing temporal knowledge graphs is essential to ensuring their readiness for real-world applications. Temporal knowledge graph completion addresses this challenge by predicting missing temporal facts and enriching the knowledge base over time. However, existing models face key challenges: translation-based models offer interpretability but underperform, while neural network-based models achieve high accuracy but lack transparency in how they capture structural and temporal dependencies. To address these challenges, we propose TCrossE (Temporal Cross-Space Embedding), a novel model that fuses bicomplex and quaternion spaces to enhance the representation of temporal structures. By leveraging rotations in hypercomplex spaces, TCrossE creates hybrid embeddings that effectively model both structural relationships and temporal dependencies. The fusion of bicomplex and quaternion spaces is mathematically motivated and validated through empirical studies. Unlike prior models, TCrossE balances expressiveness and interpretability, ensuring strong performance without sacrificing model transparency. Additionally, our approach optimizes training efficiency, making it more practical for large-scale TKG applications. We evaluate TCrossE on five benchmark datasets: ICEWS14, ICEWS05–15, GDELT, WIKIDATA12k, and YAGO11k, covering a diverse range of temporal knowledge graph structures. Experimental results show that TCrossE outperforms state-of-the-art models, achieving up to 18% improvement on GDELT and YAGO11k while maintaining competitive performance on other datasets. Furthermore, TCrossE exhibits lower training times, making it suitable for real-world deployment.
Citations: 0
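The TCrossE abstract refers to rotations in hypercomplex spaces. As background only, the sketch below shows the generic quaternion building block (the Hamilton product and rotation by a unit quaternion). It is not TCrossE's scoring function, which the abstract does not specify; the function names and the tuple representation are assumptions made for illustration.

```python
import math
from typing import Tuple

Quaternion = Tuple[float, float, float, float]  # (a, b, c, d) = a + bi + cj + dk


def hamilton(p: Quaternion, q: Quaternion) -> Quaternion:
    """Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def norm(q: Quaternion) -> float:
    return math.sqrt(sum(x * x for x in q))


def rotate(entity: Quaternion, relation: Quaternion) -> Quaternion:
    """Rotate an entity embedding by a relation normalized to unit length.

    Multiplying by a unit quaternion preserves the norm of `entity`.
    """
    n = norm(relation)
    if n == 0.0:
        raise ValueError("relation quaternion must be non-zero")
    unit = tuple(x / n for x in relation)
    return hamilton(entity, unit)


if __name__ == "__main__":
    head = (0.5, -1.0, 2.0, 0.25)   # hypothetical entity embedding
    rel = (1.0, 2.0, -1.0, 0.5)     # hypothetical relation embedding
    print(rotate(head, rel), norm(head))
```

Because the relation is normalized, the operation changes the direction of the embedding but not its length, which is the property rotation-based embedding models generally rely on.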
Graph knowledge tracing in cognitive situation: Validation of classic assertions in cognitive psychology
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113281. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113281
Qianxi Wu, Weidong Ji, Guohui Zhou, Yingchun Yang
Abstract: Knowledge Tracing (KT) is a fundamental and challenging task in intelligent education, aiming to trace learners’ knowledge states and learning processes, providing better support and guidance for teaching and addressing mental factors. Previous KT tasks have focused on considering learners’ exposure to extrinsic environmental factors while ignoring the influence of intrinsic psychological factors. Moreover, previous methods have adopted a single perspective in modeling learners’ knowledge states, ignoring the diversity of states in the learning process. To address these issues, we define the concept of cognitive situation through the guidance of cognitive psychology theory to help explain the extrinsic influence and intrinsic cognition of learners within complex learning environments. We design a Cognitive Situation-based Graph KT (CSGKT) model to quantify learners’ influences in the cognitive process by modeling schemas capturing intrinsic characteristics and extrinsic factors through Hyper-Graph Neural Networks (HGNN). We then utilize a Directed Graph Convolutional Neural Network (DGCNN) to capture the correlation information between knowledge concepts and structure the learner’s cognitive activities and knowledge states, adding a detailed representation of multiple states of the learning process. In addition, we use the Erase-add Gate to filter out the knowledge states that do not match the learner’s current cognitive activities to stabilize the learner’s due cognition. In our experiments, we selected nine baseline models from three mainstream approaches, including sequence-based approaches, Transformer-based approaches, and complex structure-based approaches. The experimental results show that our models outperform these baseline models. At the same time, we also verify two classic assertions in cognitive psychology, namely, that “short-term memory forgetting of knowledge concepts is mainly caused by interference rather than memory trace fading” and that “cognitive imagery and perceptual function play an equivalent role in the cognitive process”, which further support the feasibility of the model.
Citations: 0
Quadruple strategy-driven hiking optimization algorithm for low and high-dimensional feature selection and real-world skin cancer classification
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113286. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113286
Mahmoud Abdel-salam, Saleh Ali Alomari, Mohammad H. Almomani, Gang Hu, Sangkeum Lee, Kashif Saleem, Aseel Smerat, Laith Abualigah
Abstract: Feature selection (FS) is critical in classification, aiming to identify the smallest subset of features that maximizes accuracy. Given the NP-hard nature of FS, metaheuristic algorithms (MAs) are commonly applied as effective wrapper-based FS methods. However, high-dimensional datasets with many features and limited samples pose challenges, often resulting in reduced effectiveness and increased computational costs. This study presents the Adaptive Enhanced Diversified Hiking Optimization Algorithm (AEDHOA), an improved variant of the Hiking Optimization Algorithm (HOA), crafted to address these issues efficiently. AEDHOA incorporates four key strategies: the Stratified Random Initialization Strategy (SRIS) for enhanced population diversity, the Enhanced Leader Coordination Strategy (ELCS) for multiple leader guidance to prevent premature convergence, the Adaptive Perturbation Strategy (APS) to introduce controlled randomness for escaping local optima, and the Dynamic Exploration Strategy (DES) to balance global exploration and local exploitation dynamically. AEDHOA's performance is validated using benchmark functions from CEC2017 and CEC2022, where it is compared with sets of classical, recent, and advanced algorithms to ensure comprehensive benchmarking. Additionally, AEDHOA is evaluated as a feature selection method on datasets from the UCI repository and a real-world skin cancer dataset, showcasing its capacity to overcome local minima and accelerate convergence. Comparative results reveal that AEDHOA achieves substantial improvements, with classification accuracy ranging from 0.76 to 1.00 across diverse datasets, demonstrating its robustness and effectiveness in high-dimensional FS tasks.
Citations: 0
Occluded human pose estimation based on part-aware discrete diffusion priors
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113272. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113272
Hongyu Xiao, Hui He, Yifan Xie, Yi Zheng
Abstract: In this work, we focus on reconstructing human poses from RGB images, with particular attention given to the ambiguity issues caused by complex scenes such as occlusions. The main challenges we face are twofold: how to reconstruct a complete pose based on limited visible cues and how to handle the uncertainty of occluded parts. To address these issues, our primary approach is to leverage human prior knowledge to ensure the physical plausibility of the reconstructed pose and simulate occluded scenarios through the forward process of the diffusion model, followed by recovering the occluded parts through the reverse process. Specifically, we first train hierarchical encoders, codebooks, and decoders to learn rich pose prior knowledge and then incorporate these priors into a discrete diffusion model with multimodal guidance. We train the network to gradually predict clean discrete pose tokens that are consistent with prior knowledge and ultimately decode them into complete body poses. Extensive experimental results on the COCO and 3DMPB datasets demonstrate that our method achieves state-of-the-art performance compared with previous approaches. The code will be publicly available.
Citations: 0
The evolution of cooperation in continuous dilemmas via multi-agent reinforcement learning
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113153. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113153
Congcong Zhu, Dayong Ye, Tianqing Zhu, Wanlei Zhou
Abstract: The evolution of cooperation aims to investigate how to increase the proportion of cooperating participants in a system. It has been studied in a broad range of domains from biology and social science to multi-agent systems and control systems. However, the current research shares a common limitation in that each participant can only opt for cooperation or defection. In the real world, however, whether to cooperate or defect may not be a strict option; rather, it might be measured in multiple levels. To address this issue, we first propose a novel continuous dilemma in the federated learning setting called the malicious client’s dilemma, where malicious clients can quantify the poisonous updates that will be sent to the server. A multi-agent reinforcement learning-based method that involves a deep prediction network and a deep generation network is then developed to deal with the continuous dilemma. Taking each participant in turn, the deep prediction network predicts the behavior of the other participants in the current round based on their previous behavior. Then, based on the prediction, the deep generation network generates an action for the participant. We theoretically prove that, by combining the two networks, both the learning stationarity and convergence can be guaranteed. A comprehensive set of experiments comparing our method with two other state-of-the-art methods also based on reinforcement learning demonstrates the superior performance of our method in both the proposed dilemma and two other prevalent dilemmas. Our method achieves better results in promoting cooperation and obtaining higher rewards through its unique ability to predict other agents’ behavior and generate optimal strategies based on these predictions, while existing methods rely solely on historical behaviors or reputation mechanisms.
Citations: 0
Research hypothesis generation over scientific knowledge graphs
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113280. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113280
Agustín Borrego, Danilo Dessì, Daniel Ayala, Inma Hernández, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, David Ruiz, Enrico Motta
Abstract: Generating research hypotheses is a crucial step in scientific investigation that involves the creation of precise, verifiable, and logically valid statements that can be empirically examined. Therefore, many efforts have been made to automate or assist this process through the use of various Artificial Intelligence solutions. However, most existing methods are tailored to very specific domains, particularly within the biomedical field. There have been recent attempts to formalize hypothesis generation as a link prediction task over knowledge graphs. This solution is potentially domain-independent and applicable across diverse disciplines. Nevertheless, current approaches for link prediction, which typically rely on embedding models or path-based methods, have shown limited success in accurately predicting new hypotheses. To address these limitations, this paper introduces ResearchLink, an innovative and domain-independent methodology for hypothesis generation over knowledge graphs. ResearchLink combines path-based features and knowledge graph embeddings with text embeddings, capturing the semantic context of entities within a given corpus, and integrates additional information from bibliometric databases to improve research collaboration predictions. To conduct a rigorous evaluation of ResearchLink, we constructed CSKG-600, a new dataset for hypothesis generation, consisting of 600 statements that were manually labeled by domain experts. ResearchLink achieved outstanding performance (78.7% P@20), significantly outperforming alternative approaches such as TransH (71.8%), TransD (71.7%), and RotatE (70.7%).
Citations: 0
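The ResearchLink results are reported as P@20, i.e. precision over the top 20 ranked predictions. The sketch below shows the standard definition of precision at k; the abstract does not describe the paper's exact evaluation protocol, and the function name and the example statements are hypothetical.

```python
from typing import Hashable, Sequence, Set


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k ranked items that are labeled relevant.

    Uses the common convention of dividing by k even if fewer than k
    items were returned.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


if __name__ == "__main__":
    # Hypothetical candidate hypotheses, ordered by model score.
    ranked = ["h1", "h2", "h3", "h4", "h5"]
    # Hypothetical expert labels: which candidates were judged valid.
    valid = {"h1", "h3", "h4"}
    print(precision_at_k(ranked, valid, k=4))
```

Under this definition, a P@20 of 78.7% would correspond to roughly 15.7 of the top 20 predictions being judged valid on average.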
Q-value-based experience replay in reinforcement learning
IF 7.2, Zone 1, Computer Science
Knowledge-Based Systems, Vol. 315, Article 113296. Pub Date: 2025-03-08. DOI: 10.1016/j.knosys.2025.113296
Zihong Zhang, Ruijia Li
Abstract: Experience replay has long been used in reinforcement learning to store and reuse past experiences. However, most existing experience replay methods sample experiences with non-uniform probabilities that are proportional to their values, such as temporal-difference errors, which often lead to biased learning. To address this issue, we propose a new experience replay method that hierarchically samples experiences based on Q-values. Specifically, the proposed method divides a set of uniformly sampled experiences into three groups of the same size according to the Q-values and then uniformly samples the same number of experiences from the three groups as a mini-batch. In this manner, the sampled experiences can effectively maintain diversity. Moreover, to estimate the Q-value accurately, we develop a new critic network based on the self-attention mechanism. We integrate the new experience replay method and critic network into the twin delayed deep deterministic policy gradient algorithm to form a new reinforcement learning algorithm. An extensive set of experiments using several standard benchmarks demonstrates the effectiveness of the proposed algorithm.
Citations: 0
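The sampling procedure described in the Q-value-based experience replay abstract (draw a uniform pool, split it into three equal groups by Q-value, then take the same number from each group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the `pool_multiplier` parameter, and the use of a caller-supplied `q_value` function in place of a critic network are all hypothetical.

```python
import random
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def stratified_q_sample(
    buffer: Sequence[T],
    q_value: Callable[[T], float],
    batch_size: int,
    pool_multiplier: int = 4,  # hypothetical: pool size relative to the batch
    rng: Optional[random.Random] = None,
) -> List[T]:
    """Draw a mini-batch with equal shares of low-, mid-, and high-Q experiences."""
    if batch_size <= 0 or batch_size % 3 != 0:
        raise ValueError("batch_size must be a positive multiple of 3")
    if pool_multiplier < 1:
        raise ValueError("pool_multiplier must be at least 1")
    pool_size = batch_size * pool_multiplier
    if pool_size > len(buffer):
        raise ValueError("buffer is smaller than the requested pool")
    rng = rng or random.Random()

    # Step 1: uniformly sample a pool (without replacement) from the buffer.
    pool = rng.sample(list(buffer), pool_size)

    # Step 2: order the pool by Q-value and cut it into three equal groups.
    pool.sort(key=q_value)
    group_size = pool_size // 3
    groups = [pool[i * group_size:(i + 1) * group_size] for i in range(3)]

    # Step 3: uniformly sample the same number of experiences from each group.
    per_group = batch_size // 3
    batch: List[T] = []
    for group in groups:
        batch.extend(rng.sample(group, per_group))
    return batch


if __name__ == "__main__":
    # Toy buffer where each "experience" is just its own Q-value.
    toy_buffer = [float(i) for i in range(100)]
    print(stratified_q_sample(toy_buffer, lambda e: e, batch_size=6))
```

Unlike prioritized replay, every experience inside a group is equally likely to be drawn, so the batch spans the Q-value range without weighting individual samples by their values.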