{"title":"Radial Adaptive Node Embedding Hashing for cross-modal retrieval","authors":"Yunfei Chen , Renwei Xia , Zhan Yang , Jun Long","doi":"10.1016/j.knosys.2025.113522","DOIUrl":"10.1016/j.knosys.2025.113522","url":null,"abstract":"<div><div>With the rapid growth of multimedia data on social networks, efficient and accurate cross-modal retrieval has become essential. Cross-modal hashing methods offer advantages such as fast retrieval speed and low storage cost. However, unsupervised deep cross-modal hashing methods often struggle with semantic misalignment and noise, limiting their effectiveness in capturing fine-grained relationships across modalities. To address these challenges, we propose Radial Adaptive Node Embedding Hashing (RANEH), designed to enhance semantic consistency and retrieval efficiency. Specifically, the semantic meta-similarity construction module reconstructs identity semantics using a similarity matrix, ensuring that hash codes retain modality-specific features. The radial adaptive hybrid coding method employs FastKAN as an encoder to map features into a shared hash space, maintaining semantic consistency across modalities. Lastly, the broadcasting node embedding unit leverages the Fast Kolmogorov–Arnold network to capture deep modality relationships, improving semantic alignment and node embedding accuracy. Experiments on the NUS-WIDE, MIRFlickr, and MSCOCO datasets show that RANEH consistently outperforms state-of-the-art unsupervised cross-modal hashing methods in accuracy and efficiency.
The code is available at https://github.com/YunfeiChenMY/RANEH.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113522"},"PeriodicalIF":7.2,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning hierarchical scene graph and contrastive learning for object goal navigation","authors":"Jian Luo , Jian Zhang , Bo Cai , Yaoxiang Yu , Aihua Ke","doi":"10.1016/j.knosys.2025.113532","DOIUrl":"10.1016/j.knosys.2025.113532","url":null,"abstract":"<div><div>The task of object goal navigation (ObjNav) requires the agent to locate the given target object within a complex dynamic scene. To accomplish the task, the agent needs to understand the scene well, make executable decisions in fewer steps, avoid collisions, and successfully navigate to the target. As a result, efficient environmental perception and scene graph-inspired path planning are important for the ObjNav task. In this paper, we present a hierarchical scene graph (HSG) contrastive learning framework, which consists of (1) a multimodal graph mixer that aligns visual and textual information using an open-vocabulary detector with GLIP; it can be regarded as an “eagle eye” that perceives target-related frontiers and suppresses irrelevant information; (2) a graph constructor that takes observed RGBD images to incrementally build a hierarchical scene graph; it acts as the “brain” that memorizes the common scene layout; and (3) an action control contrastive learning module that takes the graph contextual relationships as input to predict optimal actions toward the target; it is treated as the “limbs” of the agent, coordinating and correcting incorrect movements. On the ObjNav task, experiments on Gibson, HM3D, MP3D, and ProcTHOR demonstrate that navigation plans from the HSG framework achieve significantly higher success rates than existing map-based methods, indicating the feasibility of executing navigation utilizing commonsense knowledge from language models, leading to efficient semantic exploration.
Code is available at https://github.com/luosword/HSG4VN.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113532"},"PeriodicalIF":7.2,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895417","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized e-learning resource recommendation using multimodal-enhanced collaborative filtering","authors":"Xinwei Zhai , Yuanyuan Wang , Luwen Liang , Kangzhong Wang , Fengchun Pei , Eugene Yujun Fu","doi":"10.1016/j.knosys.2025.113605","DOIUrl":"10.1016/j.knosys.2025.113605","url":null,"abstract":"<div><div>Personalized learning resource recommendation is a prominent research area in the field of e-learning, allowing learners to find appropriate resources that align with their specific learning needs. The continuous development and optimization of online learning platforms have resulted in an increasing amount of e-learning resources and learner data. This poses challenges to the existing e-learning resource recommendation approaches, most of which rely on conventional collaborative filtering (CF) exclusively. Their efficiency is constrained owing to the utilization of a sole modality or a limited subset of modalities for the recommendation. To address these challenges, this study proposes a multimodal-enhanced CF approach in e-learning. Our approach uses various modalities for modeling, including learners’ learning records, human–computer interaction patterns, and information related to the resources. It integrates techniques such as matrix factorization for the joint learner–resource pattern modeling, clustering for grouping similar learners, and the long short-term memory network for capturing the temporal dynamics of learning activities. 
Comprehensive experiments are conducted to evaluate the efficiency of the proposed approach, and to determine its optimal setup for a deep understanding of the contributions of each component.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113605"},"PeriodicalIF":7.2,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-domain fake news detection method based on generative adversarial network and graph network","authors":"Xuefeng Li , Jian Wei , Chensu Zhao , Xiaqiong Fan , Yuhang Wang","doi":"10.1016/j.knosys.2025.113665","DOIUrl":"10.1016/j.knosys.2025.113665","url":null,"abstract":"<div><div>The proliferation of misinformation in today's digital era poses significant challenges, with fake news detection becoming critical to mitigate economic losses and social instability. Despite extensive research efforts, most existing approaches are tailored for single-domain fake news detection, struggling with data distribution discrepancies and domain shifts when applied to multi-domain scenarios. This limitation underscores the urgent need for solutions that address the complexities of cross-domain detection. Here, we propose a novel framework MFGAG that synergistically integrates adversarial networks and graph neural networks with emotional, stylistic, and semantic features to enable precise domain localization. By leveraging these features, the framework effectively models intricate relationships among news articles within the same temporal context, addressing the challenges posed by multi-domain datasets. Experimental evaluations demonstrate that our approach outperforms state-of-the-art methods, achieving an average accuracy improvement of 3.3 percentage points for single-domain news and nearly 1 percentage point for mixed-domain data, culminating in an overall accuracy of 93.1 %. 
The code for this study is publicly available at https://github.com/SWLee777/MFGAG.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113665"},"PeriodicalIF":7.2,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143895419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From local verification to global reasoning: Exploiting slot-accompanying update for improved slot selection","authors":"Bing Qian , Jinyu Guo , Qiwei Wang , Kai Shuang","doi":"10.1016/j.knosys.2025.113521","DOIUrl":"10.1016/j.knosys.2025.113521","url":null,"abstract":"<div><div>The goal of dialogue-state tracking (DST) is to determine the current state of a dialogue by analysing the entire preceding dialogue context. Nonetheless, current approaches frequently fail to account for the significance of concurrent updates, where related slots must be updated simultaneously based on their historical relationships, even in the absence of explicit signals in the current dialogue turn. To address this limitation, we introduce From Local Verification to Global Reasoning (FLV2GR), an innovative method that improves slot-update selection by combining local verification of present dialogue details with global reasoning over historical dialogue data. Our approach utilizes a graph neural network (GNN) to model and infer interdependencies between slots, enabling the identification of accompanying update relationships that are frequently overlooked by other approaches. This comprehensive selection mechanism improves the precision of slot updates, thereby enhancing overall DST performance. 
The FLV2GR model establishes a new performance benchmark on the MultiWOZ 2.1, 2.2, and 2.4 datasets, showcasing its effectiveness in capturing both local and global dialogue dynamics for more precise and reliable DST.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113521"},"PeriodicalIF":7.2,"publicationDate":"2025-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143886860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid deep learning model for automated colorectal cancer detection using local and global feature extraction","authors":"Ishak Pacal , Omneya Attallah","doi":"10.1016/j.knosys.2025.113625","DOIUrl":"10.1016/j.knosys.2025.113625","url":null,"abstract":"<div><div>Colorectal cancer (CRC) ranks among the most lethal malignancies globally, underscoring the importance of timely and precise diagnosis. Although histopathological examination remains the clinical gold standard, the intricate morphology of tissue samples and inter-observer variability drive the need for robust automated methods. To address these challenges, this paper presents a hybrid deep learning model that integrates InceptionNeXt blocks, enhanced Swin Transformer blocks, and a Residual Multi-Layer Perceptron (ResMLP). In the initial stages, InceptionNeXt blocks employ multi-branch convolutions to capture nuclear morphology, glandular structures, and stromal textures, particularly benefiting limited training data scenarios. Subsequent layers utilize enhanced Swin Transformer blocks with window-based self-attention and shifted windows, effectively modeling long-range dependencies. The ResMLP component further refines feature representation via residual learning. Comprehensive evaluations on two benchmark CRC datasets—NCT-CRC-HE-100K and Kather-5K—demonstrated accuracies of 99.96 % and 99.06 %, respectively, outperforming 10 state-of-the-art CNN and 10 ViT-based models. Additionally, Grad-CAM visualizations highlight the critical regions influencing classification decisions, enhancing model interpretability.
These results establish the proposed method as a reliable, generalizable, and clinically viable solution for automated CRC detection.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113625"},"PeriodicalIF":7.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143882261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating edge features and complementary attention mechanism for drug response prediction","authors":"Chuang Li , Minhui Wang , Chang Tang , Yanfeng Zhu","doi":"10.1016/j.knosys.2025.113508","DOIUrl":"10.1016/j.knosys.2025.113508","url":null,"abstract":"<div><div>Predicting drug response in cancer cell lines is a vital field in precision medicine, supporting personalized treatment planning, optimizing drug selection, and enhancing the accuracy and effectiveness of cancer therapies. Although graph neural network-based models for drug response prediction have made significant progress in performance, they often focus solely on learning node embeddings while overlooking adjacency relationships between cell lines, limiting the model’s ability to capture inter-cell line adjacency information. To address this limitation, we propose a novel model that constructs edge features by measuring similarity between cell lines and their k-nearest neighbors, integrating these edge features with a node-edge complementary attention mechanism. This approach enables the model to dynamically incorporate node and edge information, achieving complementary and collaborative feature learning. Such a design substantially improves the accuracy and biological interpretability of drug response prediction. Furthermore, to enhance the independence and complementarity of node and edge features, we introduce a complementary loss mechanism in the model and design a topology updating module that performs dynamic feature updates via neighborhood aggregation, effectively capturing and utilizing multi-omics data. 
We conduct comprehensive experiments on the Genomics of Drug Sensitivity in Cancer and the Cancer Cell Line Encyclopedia datasets, which cover various diseases such as esophageal carcinoma, stomach adenocarcinoma, colon adenocarcinoma, and rectal adenocarcinoma. The results demonstrate that our model outperforms current state-of-the-art methods in cancer drug response prediction.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113508"},"PeriodicalIF":7.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143882262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic-Aware and Static Context Network for large-scale 3D place recognition","authors":"Ming Liao, Xiaoguang Di, Maozhen Liu, Teng Lv, Xiaofei Zhang, Runwen Zhu","doi":"10.1016/j.knosys.2025.113577","DOIUrl":"10.1016/j.knosys.2025.113577","url":null,"abstract":"<div><div>3D point cloud-based place recognition enables robots to obtain precise global positions without GPS, correct trajectory drift in SLAM, and recover from the kidnapped robot problem. However, in outdoor environments, the presence of moving objects can cause occlusions in point clouds and introduce noise into the data, leading to localization failures. To address this issue, we propose a Dynamic-Aware and Static Context Network (DASC-Net) for large-scale 3D place recognition. Our approach leverages the spatio-temporal consistency of point cloud sequences to accurately segment dynamic objects while incorporating static point cloud context to compensate for feature loss caused by noise interference or occlusions from dynamic objects, thereby enhancing robustness and generalization. Specifically, DASC-Net adopts a two-stage strategy: first, it introduces a coarse-to-fine moving object segmentation method to effectively eliminate dynamic noise; second, it utilizes spatial context association and multi-scale feature aggregation to improve static feature representation and matching. 
Extensive experimental results demonstrate that DASC-Net outperforms existing place recognition approaches, particularly in dynamic scenes.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113577"},"PeriodicalIF":7.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143886764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Dynamic Deep Q-Network for Federated Learning scheduling policies on IoT devices using explanation-driven trust","authors":"Gaith Rjoub , Hanae Elmekki , Jamal Bentahar , Witold Pedrycz , Sofian Kassaymeh , Shahed Bassam Almobydeen , Rachida Dssouli","doi":"10.1016/j.knosys.2025.113574","DOIUrl":"10.1016/j.knosys.2025.113574","url":null,"abstract":"<div><div>Recent advancements in Internet of Things (IoT) and edge computing have led to rapid growth in the number of IoT devices generating extensive volumes of data at the network edge. Efficiently scheduling tasks on these devices, particularly under strict latency constraints in federated learning (FL) environments, poses substantial challenges. In this paper, we propose a novel trust-energy-aware scheduling framework specifically designed for latency-constrained federated edge computing scenarios. Our innovative strategy integrates Dynamic Deep Q-Network (Dynamic-DQN) reinforcement learning with Local Interpretable Model-agnostic Explanations (LIME), enabling dynamic, real-time assessment of device trustworthiness with interpretability and transparency. This combined approach allows the framework to intelligently allocate tasks to IoT devices, explicitly optimizing for reduced latency, improved energy efficiency, and enhanced system reliability. Extensive experimental evaluations confirm that our proposed method substantially outperforms conventional reinforcement learning and heuristic scheduling algorithms, demonstrating significant reductions in latency, superior energy management, and improved scalability. 
These results underscore the robustness and practical effectiveness of our framework in addressing critical FL challenges.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"318 ","pages":"Article 113574"},"PeriodicalIF":7.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143881284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TransfficFormer: A novel Transformer-based framework to generate evasive malicious traffic","authors":"Wenbiao Du , Jingfeng Xue , Xiuqi Yang , Wenjie Guo , Dujuan Gu , Weijie Han","doi":"10.1016/j.knosys.2025.113546","DOIUrl":"10.1016/j.knosys.2025.113546","url":null,"abstract":"<div><div>Machine learning (ML) and deep learning (DL) have significantly improved the detection accuracy of contemporary Network Intrusion Detection Systems (NIDS), yet they remain susceptible to adversarial attacks. Current attacks against ML/DL-based NIDS primarily focus on altering feature vectors, thereby overlooking the discrete and irreversible nature of network traffic packets, which significantly limits their practical applicability. To address these challenges, we propose TransfficFormer, which combines a heuristic algorithm with a Transformer to generate adversarial attack traffic. We train a Transformer-based generator by transforming source-space features into discrete sequence autoregressive models. A three-layer particle swarm optimization algorithm with random and perception factors is utilized to optimize the generation of adversarial mutation malicious traffic with reversible metadata feature vectors. Furthermore, the discriminator feedback probability is fine-tuned using reinforcement learning strategies, ensuring the preservation of both malicious intent and normal communication functionality within the generated traffic. Comprehensive experiments demonstrate that TransfficFormer can autonomously generate mutant malicious traffic, effectively evading various ML/DL-based NIDS with minimal overhead.
The practicality of the generated mutant traffic is validated in the NSFOCUS cyber range.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"319 ","pages":"Article 113546"},"PeriodicalIF":7.2,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}