IEEE Transactions on Knowledge and Data Engineering: Latest Articles

Scalable Transactional Stream Processing on Multicore Processors
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-04 DOI: 10.1109/TKDE.2025.3556741
Jianjun Zhao;Yancan Mao;Zhonghao Yang;Haikun Liu;Shuhao Zhang
Abstract: Transactional stream processing engines (TSPEs) are central to modern stream applications handling shared mutable states. However, their full potential, particularly in adaptive scheduling, remains largely unexplored. We present MorphStream, a TSPE designed to optimize parallelism and performance for transactional stream processing on multicores. Through a unique three-stage execution paradigm (i.e., planning, scheduling, and execution), MorphStream enables adaptive scheduling under varying workload characteristics. Building on this foundation, MorphStream is further enhanced with support for non-deterministic state access, employing a stateful task precedence graph to handle undefined read/write sets at runtime while guaranteeing transaction semantics. Additionally, MorphStream incorporates a generalized framework for managing window-based operations, enabling efficient tracking and maintenance of overlapping windows using multi-versioned state management. These extensions enhance the system’s ability to process dynamic and irregular workloads. Experimental results demonstrate up to 3.4 times higher throughput and 69.1% lower latency compared to state-of-the-art TSPEs, validating its scalability and adaptability in real-world streaming scenarios.
Vol. 37, No. 7, pp. 4254-4269. Journal Article. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10949743
Citations: 0
Disentangling Dynamics: Advanced, Scalable and Explainable Imputation for Multivariate Time Series
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-04 DOI: 10.1109/TKDE.2025.3558405
Shuai Liu;Xiucheng Li;Yile Chen;Yue Jiang;Gao Cong
Abstract: Missing values pose a formidable obstacle in multivariate time series analysis. Existing imputation methods rely on entangled representations that struggle to simultaneously capture multiple orthogonal time-series patterns, leading to suboptimal performance and limited interpretability. Meanwhile, requiring the entire data span as input renders these models impractical for long time series. To address these issues, we propose $\mathsf{TIDER}$ and its enhanced version, $\mathsf{AdaTIDER}$. $\mathsf{TIDER}$ employs low-rank matrix factorization and disentangled temporal representations to model intricate dynamics like trend, seasonality, and local bias. However, $\mathsf{TIDER}$ is limited to single-period modeling and does not explicitly capture dependencies between channels. To overcome these limitations, $\mathsf{AdaTIDER}$ incorporates adaptive cross-channel dependency modeling and multi-period seasonality representations. These advancements enable it to dynamically capture variable relationships and complex multi-period patterns, significantly enhancing imputation accuracy and interpretability, while maintaining $\mathsf{TIDER}$’s scalability. Extensive experiments on real-world datasets validate the superiority of our models in imputation accuracy, scalability, interpretability, and robustness.
Vol. 37, No. 7, pp. 4010-4022. Journal Article.
Citations: 0
Large Language Models are in-Context Molecule Learners
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-03 DOI: 10.1109/TKDE.2025.3557697
Jiatong Li;Wei Liu;Zhihao Ding;Wenqi Fan;Yuqiang Li;Qing Li
Abstract: Large Language Models (LLMs) have demonstrated exceptional performance in biochemical tasks, especially the molecule caption translation task, which aims to bridge the gap between molecules and natural language texts. However, previous methods in adapting LLMs to the molecule-caption translation task required extra domain-specific pre-training stages, suffered weak alignment between molecular and textual spaces, or imposed stringent demands on the scale of LLMs. To resolve the challenges, we propose In-Context Molecule Adaptation (ICMA), as a new paradigm allowing LLMs to learn the molecule-text alignment from context examples via In-Context Molecule Tuning. Specifically, ICMA incorporates the following three stages: Hybrid Context Retrieval, Post-retrieval Re-ranking, and In-context Molecule Tuning. Initially, Hybrid Context Retrieval utilizes BM25 Caption Retrieval and Molecule Graph Retrieval to retrieve similar informative context examples. Additionally, Post-retrieval Re-ranking is composed of Sequence Reversal and Random Walk selection to further improve the quality of retrieval results. Finally, In-Context Molecule Tuning unlocks the in-context learning and reasoning capability of LLMs with the retrieved examples and adapts the parameters of LLMs for better alignment between molecules and texts. Experimental results demonstrate that ICMA can empower LLMs to achieve state-of-the-art or comparable performance without extra training corpora and intricate structures, showing that LLMs are inherently in-context molecule learners.
Vol. 37, No. 7, pp. 4131-4143. Journal Article.
Citations: 0
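The abstract above names BM25 as the scoring function behind its caption retrieval step. As a rough illustration only, the following sketch ranks a few toy captions with the standard Okapi BM25 formula. It is not the authors' implementation: the captions, the whitespace tokenization, the function names `bm25_scores` and `top_k`, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25).

    Assumes `docs` is a non-empty list of non-empty token lists.
    k1 and b are common default values, not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Length normalization: longer documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_k(query, docs, k=2):
    """Return the indices of the k highest-scoring documents."""
    scores = bm25_scores(query, docs)
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]


# Hypothetical toy captions, invented for this example.
captions = [
    "the molecule is an aromatic ketone".split(),
    "the molecule is a fatty acid".split(),
    "an aromatic amine with a ketone group".split(),
]
query = "aromatic ketone".split()
ranked = top_k(query, captions, k=2)
```

Here the first caption ranks above the third because both match the two query terms once, but the third is longer and is penalized by the length normalization. The second caption shares no query term and scores zero.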
FairCoRe: Fairness-Aware Recommendation Through Counterfactual Representation Learning
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-03 DOI: 10.1109/TKDE.2025.3557501
Chenzhong Bin;Wenqiang Liu;Feng Zhang;Liang Chang;Tianlong Gu
Abstract: Eliminating bias from data representations is crucial to ensure fairness in recommendation. Existing studies primarily focus on weakening the correlation between data representations and sensitive attributes, yet may inadvertently steer the user representations toward another potential bias direction of the target attribute. Furthermore, they often overlook the impact of user preferences on capturing sensitive information, incurring inadequate bias elimination. In this paper, we propose a Fair Counterfactual Representations (FairCoRe) learning framework, which aims to ensure the neutrality of representations among all bias directions. First, we intervene on sensitive attributes to construct a counterfactual scenario. Then, two opposing attribute prediction tasks are respectively performed in ground-truth and counterfactual scenarios to encode sensitive information along different bias directions. Second, we design a bias-aware enhancement learning method that quantifies the respective correlation of user preferences and sensitive attributes to enhance sensitive information encoding. Finally, we introduce two mutual information optimization methods that optimize the representations to capture users’ interests and disentangle sensitive factors. Moreover, we propose an attribute neutralization strategy that refines the learned representations, ensuring sensitive attribute neutrality. Extensive experiments demonstrate that our method achieves the optimal fairness and competitive accuracy compared to state-of-the-art methods.
Vol. 37, No. 7, pp. 4049-4062. Journal Article.
Citations: 0
S-MGHSTN: Towards An Effective Streaming Traffic Accident Risk Prediction Framework
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-03 DOI: 10.1109/TKDE.2025.3557864
Minxiao Chen;Haitao Yuan;Nan Jiang;Zhihan Zheng;Zhifeng Bao;Ao Zhou;Jiaxin Jiang;Shangguang Wang
Abstract: Traffic accidents pose a significant risk to human health and property safety. To address this issue, predicting their risks has garnered growing interest. We argue that a desired prediction solution should demonstrate resilience to the complexity of traffic accidents. In particular, it should adequately consider the streaming nature of data and key related aspects, such as regional background, accurately capture both proximity and similarity while bridging the disparities, and effectively address the sparsity. However, these factors are often overlooked or difficult to incorporate. In this paper, we propose a novel streaming multi-granularity hierarchical spatio-temporal network. Initially, we innovate by incorporating remote sensing data, facilitating the creation of a hierarchical multi-granularity structure and the comprehension of regional background. We construct multiple high-level risk prediction tasks to enhance the model’s ability to cope with sparsity. Subsequently, to capture and bridge spatial proximity and semantic similarity, region features and multi-view graphs undergo encoding processes to distill effective representations, followed by a graph-enhanced representation alignment module that reconciles their disparities. Finally, an alternating experience replay with a dual-memory buffer is employed to accommodate streaming data scenarios. Extensive experiments on two real datasets verify the superiority of our model against the state-of-the-art methods.
Vol. 37, No. 7, pp. 4285-4298. Journal Article.
Citations: 0
Knowledge-Centered Dual-Process Reasoning for Math Word Problems With Large Language Models
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-01 DOI: 10.1109/TKDE.2025.3556367
Jiayu Liu;Zhenya Huang;Qi Liu;Zhiyuan Ma;Chengxiang Zhai;Enhong Chen
Abstract: Math word problem (MWP) serves as a critical milestone for assessing the text mining ability and knowledge mastery level of models. Recent advancements have witnessed large language models (LLMs) showcasing remarkable performance on MWP. However, current LLMs still frequently exhibit logical errors, which highlights their inability to fully grasp the knowledge required for genuine step-by-step mathematical reasoning. To this end, in this paper, we propose a novel Knowledge-guided Solver (KNOS) framework that empowers LLMs to simulate human mathematical reasoning, whose core idea is to Invoke-Verify-Inject necessary knowledge to solve MWP. We draw inspiration from the dual-process theory to construct two cooperative systems: a Knowledge System and an Inference System. Specifically, the Knowledge System employs LLMs as the knowledge base and develops a novel knowledge invoker that can elicit their relevant knowledge to support the strict step-level mathematical reasoning. In the Inference System, we propose a knowledge verifier and a knowledge injector to evaluate the knowledge rationality and further guide the step-wise symbolic deduction in an interpretable manner based on human cognitive mechanism, respectively. Moreover, to tackle the potential scarcity issue of mathematics-specific knowledge in LLMs, we consider an open-book exam scenario and propose an improved version of KNOS called EKNOS. In EKNOS, we meticulously design knowledge selectors to extract the most relevant commonsense and math formulas from external knowledge sources for each reasoning step. This knowledge is utilized to assist the knowledge invoker in better stimulating LLMs’ reasoning abilities. Both KNOS and EKNOS are flexible to empower different LLMs. Our experiments with GPT3, ChatGPT, and GPT4 not only demonstrate their reasoning accuracy improvement but also show how they bring the strict step-wise interpretability of mathematical thinking.
Vol. 37, No. 6, pp. 3457-3471. Journal Article.
Citations: 0
Inconsistent Multivariate Time Series Forecasting
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-01 DOI: 10.1109/TKDE.2025.3556940
Li Shen;Yangzhu Wang;Xuyi Fan;Xu Yang;Huaxin Qiu
Abstract: Traditional statistical time series forecasting models rely on model identification methods to identify the worthiest model variants to investigate; therefore, the model parameters change with the statistical features of rolling windows to reach optimality. Currently, although deep-learning-based methods achieve promising multivariate forecasting performance, their representations of variable correlations are consistent regardless of the observed local time series properties and dynamic cross-variable relations, rendering them prone to overfitting. To bridge this gap, we propose FPPformer-MD, a novel inconsistent time series forecasting transformer. FPPformer-MD leverages multiresolution analysis to transform each univariate series into multiple frequency scales and evaluate the local variable correlations via their variances. Thus, FPPformer-MD receives richer input features, and its inner inconsistent cross-variable attention mechanism enables the adaptive extraction of cross-variable features. To further alleviate the overfitting problem, we apply dynamic mode decomposition to perform cross-variable data augmentation, which reconstructs the sequence outliers with other correlated sequences during the model training process. Extensive experiments conducted on thirteen real-world benchmarks demonstrate the state-of-the-art performance of FPPformer-MD.
Vol. 37, No. 7, pp. 4117-4130. Journal Article.
Citations: 0
Temporal and Spatial Analysis in Early Sepsis Prediction via Causal Disentanglements
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-03-30 DOI: 10.1109/TKDE.2025.3569584
Qiang Li;Dongchen Li;Weizhi Nie;He Jiao;Zhenhua Wu;Anan Liu
Abstract: Sepsis is one of the main causes of death in ICU patients, and accurate and stable early prediction is essential for clinical intervention. Existing methods mostly rely on traditional time series models (e.g., LSTM, Transformer) or clinical scoring criteria (e.g., SOFA, qSOFA), but face two major challenges: 1) spurious correlations in the data affect the robustness of the model; 2) lack of modeling the underlying causal relationships in the data space. We propose a Serialized Causal Disentanglement Model (SCDM) that decouples latent variables into sepsis-related factors ($u$), other disease-related factors ($v$), and irrelevant confounders ($s$). Based on the MIMIC-IV v2.2 dataset (3,511 positive samples and 17,538 negative samples), SCDM took patient clinical indicators, personal information, and clinical notes as input, and achieved an AUC of 0.765-0.928 in the prediction task 48 to 0 hours before the onset of sepsis. The performance is significantly better than the baseline models (e.g., Transformer's 0.662-0.910, MGP-AttTCN's 0.692-0.913). Experiments show that optimizing the time window (5 hours of continuous observation) and variable selection (45 key indicators) can improve the performance of the model. The effectiveness of causal unwinding is verified by the visualization of Grad CAM and t-SNE; key clinical indicators such as platelet count, lactic acid, and respiratory rate are further identified to provide interpretable decision support for doctors. Our study provides a high-precision and interpretable causal disentanglement framework for early prediction of sepsis, which is expected to promote the development of intelligent diagnosis and treatment in the ICU.
Vol. 37, No. 8, pp. 4860-4872. Journal Article.
Citations: 0
SE-GNN: Seed Expanded-Aware Graph Neural Network With Iterative Optimization for Semi-Supervised Entity Alignment
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-03-28 DOI: 10.1109/TKDE.2025.3555586
Tao Meng;Shuo Shan;Hongen Shao;Yuntao Shou;Wei Ai;Keqin Li
Abstract: Entity alignment aims to use pre-aligned seed pairs to find other equivalent entities from different knowledge graphs and is widely used in graph fusion-related fields. However, as the scale of knowledge graphs increases, manually annotating pre-aligned seed pairs becomes difficult. Existing research utilizes entity embeddings obtained by aggregating single structural information to identify potential seed pairs, thus reducing the reliance on pre-aligned seed pairs. However, due to the structural heterogeneity of KG, the quality of potential seed pairs obtained using only a single structural information is not ideal. In addition, although existing research improves the quality of potential seed pairs through semi-supervised iteration, they underestimate the impact of embedding distortion produced by noisy seed pairs on the alignment effect. In order to solve the above problems, we propose a seed expanded-aware graph neural network with iterative optimization for semi-supervised entity alignment, named SE-GNN. First, we utilize the semantic attributes and structural features of entities, combined with a conditional filtering mechanism, to obtain high-quality initial potential seed pairs. Next, we designed a local and global awareness mechanism. It introduces initial potential seed pairs and combines local and global information to obtain a more comprehensive entity embedding representation, which alleviates the impact of KG structural heterogeneity and lays the foundation for the optimization of initial potential seed pairs. Then, we designed the threshold nearest neighbor embedding correction strategy. It combines the similarity threshold and the bidirectional nearest neighbor method as a filtering mechanism to select iterative potential seed pairs and also uses an embedding correction strategy to eliminate the embedding distortion. Finally, we feed the optimized potential seeds obtained after the iterative rounds into the local and global awareness mechanism to obtain the final entity embedding and perform entity alignment. Experimental results on public datasets demonstrate the excellent performance of our SE-GNN, showcasing the effectiveness of the model. Our code is publicly available at https://github.com/ShuoShan1/SE-GNN.
Vol. 37, No. 6, pp. 3700-3713. Journal Article.
Citations: 0
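The SE-GNN abstract describes selecting potential seed pairs by combining a similarity threshold with a bidirectional nearest neighbor check. The sketch below shows one plausible reading of that filter: a pair is kept only if each entity is the other's nearest neighbor and their similarity clears the threshold. It is not taken from the authors' repository; the use of cosine similarity, the function names, the toy embeddings, and the threshold value of 0.8 are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mutual_nn_pairs(src, tgt, threshold=0.8):
    """Return (i, j) pairs where src[i] and tgt[j] are mutual nearest
    neighbors and their similarity is at least `threshold`.

    Assumes `src` and `tgt` are non-empty lists of embedding vectors.
    """
    sim = [[cosine(s, t) for t in tgt] for s in src]
    # Nearest target for every source entity, and nearest source for every target.
    best_tgt = [max(range(len(tgt)), key=lambda j: sim[i][j]) for i in range(len(src))]
    best_src = [max(range(len(src)), key=lambda i: sim[i][j]) for j in range(len(tgt))]
    return [
        (i, j)
        for i, j in enumerate(best_tgt)
        if best_src[j] == i and sim[i][j] >= threshold
    ]


# Hypothetical 2-D embeddings for entities from two knowledge graphs.
kg1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kg2 = [[0.9, 0.1], [0.1, 0.9], [-1.0, 0.2]]
pairs = mutual_nn_pairs(kg1, kg2, threshold=0.8)
```

In this toy data the third entity of `kg1` has a nearest neighbor in `kg2`, but that neighbor prefers a different entity, so the bidirectional check rejects the pair. Raising the threshold trades recall for fewer noisy seeds.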
Ten Challenging Problems in Federated Foundation Models
IF 8.9 | Zone 2 | Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-03-28 DOI: 10.1109/TKDE.2025.3555328
Tao Fan;Hanlin Gu;Xuemei Cao;Chee Seng Chan;Qian Chen;Yiqiang Chen;Yihui Feng;Yang Gu;Jiaxiang Geng;Bing Luo;Shuoling Liu;Win Kent Ong;Chao Ren;Jiaqi Shao;Chuan Sun;Xiaoli Tang;Hong Xi Tae;Yongxin Tong;Shuyue Wei;Fan Wu;Wei Xi;Mingcong Xu;He Yang;Xin Yang;Jiangpeng Yan;Hao Yu;Han Yu;Teng Zhang;Yifei Zhang;Xiaojin Zhang;Zhenzhe Zheng;Lixin Fan;Qiang Yang
Abstract: Federated Foundation Models (FedFMs) represent a distributed learning paradigm that fuses general competences of foundation models as well as privacy-preserving capabilities of federated learning. This combination allows the large foundation models and the small local domain models at the remote clients to learn from each other in a teacher-student learning setting. This paper provides a comprehensive summary of the ten challenging problems inherent in FedFMs, encompassing foundational theory, utilization of private data, continual learning, unlearning, Non-IID and graph data, bidirectional knowledge transfer, incentive mechanism design, game mechanism design, model watermarking, and efficiency. The ten challenging problems manifest in five pivotal aspects: “Foundational Theory,” which aims to establish a coherent and unifying theoretical framework for FedFMs; “Data,” addressing the difficulties in leveraging domain-specific knowledge from private data while maintaining privacy; “Heterogeneity,” examining variations in data, model, and computational resources across clients; “Security and Privacy,” focusing on defenses against malicious attacks and model theft; and “Efficiency,” highlighting the need for improvements in training, communication, and parameter efficiency. For each problem, we offer a clear mathematical definition on the objective function, analyze existing methods, and discuss the key challenges and potential solutions. This in-depth exploration aims to advance the theoretical foundations of FedFMs, guide practical implementations, and inspire future research to overcome these obstacles, thereby enabling the robust, efficient, and privacy-preserving FedFMs in various real-world applications.
Vol. 37, No. 7, pp. 4314-4337. Journal Article.
Citations: 0