IEEE Transactions on Knowledge and Data Engineering: Latest Articles

FeBT: A Feature Balancing Transformer for Corporate ESG Forecasting
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-17 DOI: 10.1109/TKDE.2025.3560137
Yawen Li;Mengyu Zhuang;Guanhua Ye;Yan Li;Junheng Wang;Jinyi Zhou;Pengfei Zhang
Abstract: Environmental, social, and governance (ESG) serves as a crucial indicator for evaluating firms in terms of sustainable development. However, the existing ESG evaluation systems suffer from limitations, such as narrow coverage, subjective bias, and lack of timeliness. Therefore, there is a pressing need to leverage machine learning methods to predict the ESG performance of firms using their publicly available data. Traditional machine learning models encounter the feature imbalance problem due to the heterogeneity in ESG-related features. Common approaches typically involve unfolding all features, thereby granting high-dimensional folding features greater exposure and accessibility to downstream models, which results in the neglect of low-dimensional features. To fill the research gap regarding fully using the heterogeneous features of enterprises to enhance AI-based ESG prediction performance, we propose the Feature Balancing Transformer (FeBT), a model based on autoencoders and Transformer blocks. FeBT incorporates a novel feature balancing technique that compresses and enhances high-dimensional features from imbalanced data into low-dimensional representations, thereby ensuring a more balanced impact of high-dimensional and low-dimensional features on the model’s performance in the downstream ESG forecasting module. Extensive experiments verified the superior performance of FeBT compared with state-of-the-art methods in real-world ESG-related datasets and evidenced that our feature balancing module provides significant insights from high-dimensional folding features.
Volume 37, Issue 7, pp. 4063-4074
Citations: 0
EPM: Evolutionary Perception Method for Anomaly Detection in Noisy Dynamic Graphs
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-15 DOI: 10.1109/TKDE.2025.3561191
Huan Wang;Junyang Chen;Yirui Wu;Victor C. M. Leung;Di Wang
Abstract: With the rapid expansion of interactions across various domains such as social networks, transaction networks, and IP-IP networks, anomaly detection in dynamic graphs has become increasingly critical for mitigating potential risks. However, existing anomaly detection methods often assume noise-free dynamic graphs, overlooking the prevalence of noisy dynamic graphs in real-world applications. Specifically, noisy dynamic graphs affected by structural noises (such as spurious and missing nodes and edges) struggle to consistently provide reliable structural evidence for anomaly detection. To tackle this challenge, we propose an Evolutionary Perception Method (EPM) for identifying anomalous nodes in noisy dynamic graphs by resisting the interference of structural noises. EPM primarily consists of two components: a dynamic fitter and a filtering reviser. The dynamic fitter characterizes the interaction dynamics of nodes that removes and generates links at each period as a multiple superposition state, utilizing various link prediction algorithms to fit evolutionary mechanisms. Additionally, the filtering reviser designs evolutional entropies to quantify the evolutional uncertainty in multiple superposition states, further reconstructing the Kalman filter to optimize these entropies. Extensive experiments have proved that our proposed EPM outperforms state-of-the-art methods in discovering anomalous nodes in noisy dynamic graphs.
Volume 37, Issue 7, pp. 4035-4048
Citations: 0
Ontology Embedding: A Survey of Methods, Applications and Resources
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-11 DOI: 10.1109/TKDE.2025.3559023
Jiaoyan Chen;Olga Mashkova;Fernando Zhapa-Camacho;Robert Hoehndorf;Yuan He;Ian Horrocks
Abstract: Ontologies are widely used for representing domain knowledge and meta data, playing an increasingly important role in Information Systems, the Semantic Web, Bioinformatics and many other domains. However, logical reasoning that ontologies can directly support are quite limited in learning, approximation and prediction. One straightforward solution is to integrate statistical analysis and machine learning. To this end, automatically learning vector representation for knowledge of an ontology i.e., ontology embedding has been widely investigated. Numerous papers have been published on ontology embedding, but a lack of systematic reviews hinders researchers from gaining a comprehensive understanding of this field. To bridge this gap, we write this survey paper, which first introduces different kinds of semantics of ontologies and formally defines ontology embedding as well as its property of faithfulness. Based on this, it systematically categorizes and analyses a relatively complete set of over 80 papers, according to the ontologies they aim at and their technical solutions including geometric modeling, sequence modeling and graph propagation. This survey also introduces the applications of ontology embedding in ontology engineering, machine learning augmentation and life sciences, presents a new library mOWL and discusses the challenges and future directions.
Volume 37, Issue 7, pp. 4193-4212
Citations: 0
OpDiag: Unveiling Database Performance Anomalies Through Query Operator Attribution
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-10 DOI: 10.1109/TKDE.2025.3557049
Shiyue Huang;Ziwei Wang;Yinjun Wu;Yaofeng Tu;Jiankai Wang;Bin Cui
Abstract: How to effectively diagnose and mitigate database performance anomalies remains a significant concern for modern database systems. Manually identifying the root causes of the anomalies is a labor-intensive process and significantly relies on professional experience. Meanwhile, existing work on automatic database diagnosis mainly focuses on detecting anomalous performance metrics or system log. These solutions lack the power to pinpoint detailed issues such as bad queries or problematic operators, which are indispensable for most database troubleshooting processes. In this paper, we propose OpDiag, a diagnosis framework that attributes database performance anomalies to query operators. In this framework, we first construct models offline to represent the relationship between query operators, performance metrics, and anomalies. These models can capture query plan features and support ad-hoc queries and schemas. Then, through feature attribution on these models during online diagnosis, OpDiag can effectively identify critical anomalous metrics and further trace back to suspicious queries and operators. This can provide concrete guidance for subsequent steps in anomaly mitigation. We applied OpDiag to both synthetic benchmark and real industry cases from ZTE Corporation. Empirical studies prove that OpDiag can accurately localize anomalous queries and operators, thus reducing human efforts in diagnosing and mitigating database performance anomalies.
Volume 37, Issue 6, pp. 3613-3626
Citations: 0
Discovering Cliques in Attribute Graphs Based on Proportional Fairness
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-10 DOI: 10.1109/TKDE.2025.3559994
Yongye Li;Renjie Sun;Chen Chen;Xiaoyang Wang;Ying Zhang;Wenjie Zhang
Abstract: Community detection is a fundamental problem and has been extensively studied. With the abundance of information in real-world networks, the discovery of communities in attribute graphs is increasingly valuable. However, numerous previous models in attribute graphs neglect the fairness concept, which plays an important role in ensuring that graph analysis is not biased toward specific groups. In this paper, we propose a novel model, named proportional fair clique (PFC). Specifically, given an attribute graph $G=(V,E,A)$, an integer $k$ and a threshold $\lambda \in [0,1/|A|]$, a subgraph $S$ of $G$ is a PFC if (i) $S$ is a clique with size at least $k$ and (ii) $|S_{a_{i}}|/|S| \geq \lambda$ for each attribute $a_{i}$ in $G$, where $S_{a_{i}}$ is the node set in $S$ associated with attribute $a_{i}$. We show that the problem of enumerating all the maximal proportional fair cliques (MPFC) is NP-hard. A reasonable baseline algorithm is first presented by extending the Bron-Kerbosch framework. To scale for large networks, we propose several optimization strategies to accelerate the computation. Finally, comprehensive experiments are conducted over 6 graphs to demonstrate the efficiency and effectiveness of the proposed techniques and model.
Volume 37, Issue 7, pp. 4003-4009
Citations: 0
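The two PFC conditions stated in the abstract above can be checked directly for a given node set. The following is a minimal sketch of that check only, not the authors' enumeration algorithm or their Bron-Kerbosch extension. The function name, the adjacency-dictionary graph representation, and the one-attribute-per-node mapping are all assumptions made for illustration.

```python
from itertools import combinations


def is_proportional_fair_clique(adj, attr, nodes, attributes, k, lam):
    """Check conditions (i) and (ii) of the PFC definition for `nodes`.

    adj        : dict mapping node -> set of neighbours (undirected, assumed symmetric)
    attr       : dict mapping node -> its attribute (assumed one attribute per node)
    attributes : iterable of all attributes appearing in the graph
    k, lam     : minimum clique size and proportion threshold
    """
    s = set(nodes)
    if not s or len(s) < k:
        return False
    # (i) every pair of nodes in S must be adjacent
    for u, v in combinations(s, 2):
        if v not in adj.get(u, set()):
            return False
    # (ii) each attribute must account for at least a lam share of S
    for a in attributes:
        count = sum(1 for v in s if attr[v] == a)
        if count < lam * len(s):
            return False
    return True


# Hypothetical toy graph: a 4-clique with two nodes per attribute.
adj = {1: {2, 3, 4}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {1, 2, 3}}
attr = {1: "a", 2: "a", 3: "b", 4: "b"}
print(is_proportional_fair_clique(adj, attr, {1, 2, 3, 4}, {"a", "b"}, 3, 0.5))
print(is_proportional_fair_clique(adj, attr, {1, 2, 3}, {"a", "b"}, 3, 0.5))
```

In the toy graph, the full node set is a clique with each attribute holding half the nodes, while the three-node subset leaves attribute "b" with only a one-third share.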
Scalable and Load-Balanced Full-Graph GNN Training on Multiple GPUs
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-08 DOI: 10.1109/TKDE.2025.3558641
Qiange Wang;Yao Chen;Weng-Fai Wong;Bingsheng He
Abstract: While full-graph training is effective for graph learning, it typically demands substantial memory resources. Existing multi-GPU training frameworks struggle with scalability because they require retaining data for each layer within GPU memory. In this work, we present HongTu, a memory-efficient system that supports out-of-memory full-graph GNN training on GPUs. HongTu offloads vertex data to CPU memory and employs partition parallelism training that splits and assigns large graphs to multiple GPUs. To reduce runtime memory consumption with optimal performance, HongTu utilizes a hybrid solution combining recomputation, caching, and computation-reordering, enabling efficient layer-wise intermediate data management. To address the increased communication caused by duplicated neighbor access among partitions, HongTu employs a deduplicated communication framework that converts host-GPU transfers into more efficient inter/intra-GPU data access. Additionally, HongTu tackles the load-imbalance issues in out-of-memory full-graph training, featuring a multi-objective graph partition algorithm that balances memory consumption and data transfer and maximizes the effectiveness of communication deduplication. Experiments on a 4× A100 GPU server show that HongTu can effectively train graphs with billion edges while reducing host-GPU data communication by 25% to 71%. Compared to the full-graph GNN system running on 16 CPU nodes, HongTu achieves speedups ranging from 11.4× to 21.3×.
Volume 37, Issue 7, pp. 4239-4253
Citations: 0
Tricolore: Multi-Behavior User Profiling for Enhanced Candidate Generation in Recommender Systems
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-07 DOI: 10.1109/TKDE.2025.3558503
Xiao Zhou;Zhongxiang Zhao;Hanze Guo
Abstract: Online platforms aggregate extensive user feedback across diverse behaviors, providing a rich source for enhancing user engagement. Traditional recommender systems, however, typically optimize for a single target behavior and represent user preferences with a single vector, limiting their ability to handle multiple important behaviors or optimization objectives. This conventional approach also struggles to capture the full spectrum of user interests, resulting in a narrow item pool during candidate generation. To address these limitations, we present Tricolore, a versatile multi-vector learning framework that uncovers connections between different behavior types for more robust candidate generation. Tricolore's adaptive multi-task structure is also customizable to specific platform needs. To manage the variability in sparsity across behavior types, we incorporate a behavior-wise multi-view fusion module that dynamically enhances learning. Moreover, a popularity-balanced strategy ensures the recommendation list balances accuracy with item popularity, fostering diversity and improving overall performance. Extensive experiments on public datasets demonstrate Tricolore's effectiveness across various recommendation scenarios, from short video platforms to e-commerce. By leveraging a shared base embedding strategy, Tricolore also significantly improves the performance for cold-start users.
Volume 37, Issue 7, pp. 4349-4360
Citations: 0
Large-Scale Clustering With Anchor-Based Constrained Laplacian Rank
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-07 DOI: 10.1109/TKDE.2025.3557718
Zhenyu Ma;Jingyu Wang;Feiping Nie;Xuelong Li
Abstract: Graph-based clustering technique has garnered significant attention due to precise information characterization by pairwise graph similarity. Nevertheless, the post-processing step in traditional methods often limits clustering effects because of crucial information loss. Therefore, the Constrained Laplacian Rank (CLR) theory emerges to directly obtain discrete labels from optimally structural graph, achieving desirable outcomes. However, CLR suffers from substantial time overhead, making it infeasible for large-scale data analysis. To overcome this issue, we propose Anchor-based CLR (ACLR), a simple yet effective method for efficient large-scale clustering. The ACLR method comprises four stages: (1) anchors that roughly cover original data are opted to prepare bipartite graph construction; (2) a novel two-step probability transition (TSPT) strategy initializes a small-scale graph with random walk probability among anchors; (3) the main ACLR model alternately optimizes the graph connected structure and directly produces discrete anchor labels, achieving a time complexity independent of the number of samples due to dramatically reduced graph scale; and (4) labels are propagated from anchors to samples using $K$-NN algorithm. Extensive experiments demonstrate that ACLR yields superior accuracy and efficiency, particularly when applied to large-scale data.
Volume 37, Issue 7, pp. 4144-4158
Citations: 0
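Stage (4) of the abstract above, propagating labels from anchors to samples with K-NN, might look roughly like the sketch below. The abstract does not specify the voting rule, so this assumes a plain majority vote over the K nearest anchors in Euclidean distance. The function name and the toy data are hypothetical, and stages (1) to (3) are not reproduced here.

```python
import math
from collections import Counter


def propagate_labels(samples, anchors, anchor_labels, k=3):
    """Give each sample the majority label among its k nearest anchors.

    Assumption: Euclidean distance and an unweighted majority vote; the
    paper's actual propagation rule may differ.
    """
    labels = []
    for x in samples:
        # indices of the k anchors closest to x
        nearest = sorted(range(len(anchors)),
                         key=lambda j: math.dist(x, anchors[j]))[:k]
        votes = Counter(anchor_labels[j] for j in nearest)
        labels.append(votes.most_common(1)[0][0])
    return labels


# Hypothetical toy data: two well-separated anchor groups.
anchors = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
anchor_labels = [0, 0, 1, 1]
samples = [(0.2, 0.3), (9.0, 10.0)]
print(propagate_labels(samples, anchors, anchor_labels, k=3))
```

Because the vote only touches the anchors, the cost per sample depends on the number of anchors rather than on the number of samples, which matches the motivation given in the abstract for working on a small anchor graph.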
Uncertainty Calibration for Counterfactual Propensity Estimation in Recommendation
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-07 DOI: 10.1109/TKDE.2025.3552658
Wenbo Hu;Xin Sun;Qiang Liu;Le Wu;Liang Wang
Abstract: Post-click conversion rate (CVR) is a reliable indicator of online customers’ preferences, making it crucial for developing recommender systems. A major challenge in predicting CVR is severe selection bias, arising from users’ inherent self-selection behavior and the system’s item selection process. To mitigate this issue, the inverse propensity score (IPS) is employed to weight the prediction error of each observed instance. However, current propensity score estimations are unreliable due to the lack of a quality measure. To address this, we evaluate the quality of propensity scores from the perspective of uncertainty calibration, proposing the use of Expected Calibration Error (ECE) as a measure of propensity-score quality, which quantifies the extent to which predicted probabilities are overconfident by assessing the difference between predicted probabilities and actual observed frequencies. Miscalibrated propensity scores can lead to distorted IPS weights, thereby compromising the debiasing process in CVR prediction. In this paper, we introduce a model-agnostic calibration framework for propensity-based debiasing of CVR predictions. Theoretical analysis on bias and generalization bounds demonstrates the superiority of calibrated propensity estimates over uncalibrated ones. Experiments conducted on the Coat, Yahoo and KuaiRand datasets show improved uncertainty calibration, as evidenced by lower ECE values, leading to enhanced CVR prediction outcomes.
Volume 37, Issue 6, pp. 3781-3793
Citations: 0
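The abstract above describes ECE as the gap between predicted probabilities and observed frequencies. A minimal sketch of the commonly used equal-width binned estimator follows, applied to hypothetical propensity scores and binary observation indicators. The bin count, function name, and sample values are assumptions, and this is not the paper's calibration framework itself.

```python
def expected_calibration_error(probs, outcomes, n_bins=10):
    """Equal-width binned ECE for binary outcomes.

    probs    : predicted probabilities in [0, 1] (e.g. propensity scores)
    outcomes : 0/1 indicators of whether the event was actually observed
    """
    n = len(probs)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # p == 1.0 would index past the end, so clamp to the last bin
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_p = sum(p for p, _ in b) / len(b)   # average predicted probability
        freq = sum(y for _, y in b) / len(b)     # observed frequency in the bin
        ece += len(b) / n * abs(mean_p - freq)
    return ece


# Hypothetical scores: predicted 0.9 four times, event observed three times.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A lower value means the predicted probabilities track the observed frequencies more closely within each bin, which is the sense in which the abstract reports lower ECE as improved calibration.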
A Data-Level Augmentation Framework for Time Series Forecasting With Ambiguously Related Source Data
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-04-04 DOI: 10.1109/TKDE.2025.3555530
Rui Ye;Qun Dai
Abstract: Many practical time series forecasting (TSF) tasks are plagued by data limitations. To alleviate this challenge, we design a data-level augmentation framework. It involves a time series generation (TSG) module and a source data selection (Sel-src) module. TSG aims to achieve better generation results by considering both the global profile and temporal dynamics of series. However, when only few target data is available, TSG module may tend to simulate the limited target samples, leading to poor generalization performance. A natural idea for this problem is to seek help from related source domain, which can provide additional useful information for TSG module. Here we consider a more complex situation, where the relevance between source and target domains is ambiguous. That is, irrelevant samples may exist in the source domain. Blindly using all the source data may lead to counterproductive results. To meet this challenge, Sel-src module is designed to select effective source samples by Inter-Representation Learning (Inter-RL) and Intra-Representation Learning (Intra-RL). Effectiveness of this algorithm is underpinned from two aspects: the quality of the augmented data and the accuracy improvement upon the augmentation.
Volume 37, Issue 7, pp. 3855-3868
Citations: 0