{"title":"Correlating Time Series With Interpretable Convolutional Kernels","authors":"Xinyu Chen;HanQin Cai;Fuqiang Liu;Jinhua Zhao","doi":"10.1109/TKDE.2025.3550877","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3550877","url":null,"abstract":"This study addresses the problem of convolutional kernel learning in univariate, multivariate, and multidimensional time series data, which is crucial for interpreting temporal patterns in time series and supporting downstream machine learning tasks. First, we propose formulating convolutional kernel learning for univariate time series as a sparse regression problem with a non-negative constraint, leveraging the properties of circular convolution and circulant matrices. Second, to generalize this approach to multivariate and multidimensional time series data, we use tensor computations, reformulating the convolutional kernel learning problem in the form of tensors. This is further converted into a standard sparse regression problem through vectorization and tensor unfolding operations. In the proposed methodology, the optimization problem is addressed using the existing non-negative subspace pursuit method, enabling the convolutional kernel to capture temporal correlations and patterns. To evaluate the proposed model, we apply it to several real-world time series datasets. On the multidimensional ridesharing and taxi trip data from New York City and Chicago, the convolutional kernels reveal interpretable local correlations and cyclical patterns, such as weekly seasonality. For the monthly temperature time series data in North America, the proposed model can quantify the yearly seasonality and make it comparable across different decades. In the context of multidimensional fluid flow data, both local and nonlocal correlations captured by the convolutional kernels can reinforce tensor factorization, leading to performance improvements in fluid flow reconstruction tasks. 
Thus, this study lays an insightful foundation for automatically learning convolutional kernels from time series data, with an emphasis on interpretability through sparsity and non-negativity constraints.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3272-3283"},"PeriodicalIF":8.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey of Imbalanced Learning on Graphs: Problems, Techniques, and Future Directions","authors":"Zemin Liu;Yuan Li;Nan Chen;Qian Wang;Bryan Hooi;Bingsheng He","doi":"10.1109/TKDE.2025.3549299","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3549299","url":null,"abstract":"Graphs represent interconnected structures prevalent in a myriad of real-world scenarios. Effective graph analytics, such as graph learning methods, enables users to gain profound insights from graph data, underpinning various tasks including node classification and link prediction. However, these methods often suffer from data imbalance, a common issue in graph data where certain segments possess abundant data while others are scarce, thereby leading to biased learning outcomes. This necessitates the emerging field of imbalanced learning on graphs, which aims to correct these data distribution skews for more accurate and representative learning outcomes. In this survey, we embark on a comprehensive review of the literature on imbalanced learning on graphs. We begin by providing a definitive understanding of the concept and related terminologies, establishing a strong foundational understanding for readers. Following this, we propose two comprehensive taxonomies: (1) the <italic>problem taxonomy</i>, which describes the forms of imbalance we consider, the associated tasks, and potential solutions and (2) the <italic>technique taxonomy</i>, which details key strategies for addressing these imbalances, and aids readers in their method selection process. 
Finally, we suggest prospective future directions for both problems and techniques within the sphere of imbalanced learning on graphs, fostering further innovation in this critical area.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3132-3152"},"PeriodicalIF":8.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Local Community Detection in Multi-Attributed Road-Social Networks","authors":"Li Ni;Qiuyu Li;Yiwen Zhang;Wenjian Luo;Victor S. Sheng","doi":"10.1109/TKDE.2025.3550476","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3550476","url":null,"abstract":"The information available in multi-attributed road-social networks includes network structure, location information, and numerical attributes. Most studies mainly focus on mining communities by combining structure with attributes or structure with location, which do not consider structure, attributes, and location simultaneously. Therefore, we propose a parameter-free algorithm, called LCDMRS, to mine local communities in multi-attributed road-social networks. LCDMRS extracts a sub-network surrounding the given node and embeds it to generate the vector representations of nodes, which incorporates both structural and attributed information. Based on the vector representations of nodes, the average cosine similarity between nodes is designed to ensure both the structural and attributed cohesiveness of the community, while the community node density is designed to ensure the spatial cohesiveness of the community. Targeting the community node density and cosine similarity of nodes, LCDMRS takes the given node as the starting node and employs the community dominance relation to expand the community outward. 
Experimental results on multiple real-world datasets demonstrate LCDMRS outperforms comparison algorithms.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3514-3527"},"PeriodicalIF":8.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Complementary Learning Subnetworks Towards Parameter-Efficient Class-Incremental Learning","authors":"Depeng Li;Zhigang Zeng;Wei Dai;Ponnuthurai Nagaratnam Suganthan","doi":"10.1109/TKDE.2025.3550809","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3550809","url":null,"abstract":"In the scenario of class-incremental learning (CIL), deep neural networks have to adapt their model parameters to non-stationary data distributions, e.g., the emergence of new classes over time. To mitigate the catastrophic forgetting phenomenon, typical CIL methods either cumulatively store exemplars of old classes for retraining model parameters from scratch or progressively expand model size as new classes arrive, which, however, compromises their practical value due to little attention paid to <italic>parameter efficiency</i>. In this paper, we contribute a novel solution, effective control of the parameters of a well-trained model, by the synergy between two complementary learning subnetworks. Specifically, we integrate one plastic feature extractor and one analytical feed-forward classifier into a unified framework amenable to streaming data. In each CIL session, it achieves non-overwritten parameter updates in a cost-effective manner, neither revisiting old task data nor extending previously learned networks; Instead, it accommodates new tasks by attaching a tiny set of declarative parameters to its backbone, in which only one matrix per task or one vector per class is kept for knowledge retention. Experimental results on a variety of task sequences demonstrate that our method achieves competitive results against state-of-the-art CIL approaches, especially in accuracy gain, knowledge transfer, training efficiency, and task-order robustness. 
Furthermore, a graceful forgetting implementation on previously learned trivial tasks is empirically investigated to make its non-growing backbone (i.e., a model with limited network capacity) suffice to train on more incoming tasks.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3240-3252"},"PeriodicalIF":8.9,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RayE-Sub: Countering Subgraph Degradation via Perfect Reconstruction","authors":"Kuo Yang;Zhengyang Zhou;Xu Wang;Pengkun Wang;Limin Li;Yang Wang","doi":"10.1109/TKDE.2025.3544696","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3544696","url":null,"abstract":"Subgraph learning has dominated most practices of improving the expressive power of Message Passing Neural Networks (MPNNs). Existing subgraph discovery policies can be classified into node-based and partition-based, which both achieve impressive performance in most scenarios. However, both mainstream solutions still face a subgraph degradation trap. Subgraph degradation is reflected in the phenomenon that the subgraph-level methods fail to offer any benefits over node-level MPNNs. In this work, we empirically investigate the existence of the subgraph degradation issue and introduce a unified perspective, perfect reconstruction, to provide insights for improving two lines of methods. We further propose a subgraph learning strategy guided by the principle of perfect reconstruction. To achieve this, two major issues should be well-addressed, i.e., <italic>(i) how to ensure the subgraphs to possess with ‘perfect’ information? (ii) how to guarantee the ‘reconstruction’ power of obtained subgraphs?</i> First, we propose a subgraph partition strategy <italic>Rayleigh-resistance</i> to extract non-overlap subgraphs by leveraging the graph spectral theory. Second, we put forward a <italic>Query</i> mechanism to achieve subgraph-level equivariant learning, which guarantees subgraph reconstruction ability. These two parts, <italic>perfect subgraph partition</i> and <italic>equivariant subgraph learning</i> are seamlessly unified as a novel <italic><u>Ray</u>leigh-resistance <u>E</u>quivariant <u>Sub</u>graph learning</i> architecture (<italic><b>RayE-Sub</b></i>). 
Comprehensive experiments on both synthetic and real datasets demonstrate that our approach can consistently outperform previous subgraph learning architectures.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3684-3699"},"PeriodicalIF":8.9,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pattern-Wise Transparent Sequential Recommendation","authors":"Kun Ma;Cong Xu;Zeyuan Chen;Wei Zhang","doi":"10.1109/TKDE.2025.3549032","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3549032","url":null,"abstract":"A transparent decision-making process is essential for developing reliable and trustworthy recommender systems. For sequential recommendation, it means that the model can identify key items that account for its recommendation results. However, achieving both interpretability and recommendation performance simultaneously is challenging, especially for models that take the entire sequence of items as input without screening. In this paper, we propose an interpretable framework (named PTSR) that enables a pattern-wise transparent decision-making process without extra features. It breaks the sequence of items into multi-level patterns that serve as atomic units throughout the recommendation process. The contribution of each pattern to the outcome is quantified in the probability space. With a carefully designed score correction mechanism, the pattern contribution can be implicitly learned in the absence of ground-truth key patterns. The final recommended items are those that most key patterns strongly endorse. 
Extensive experiments on five public datasets demonstrate remarkable recommendation performance, while statistical analysis and case studies validate the model interpretability.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3627-3640"},"PeriodicalIF":8.9,"publicationDate":"2025-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structured Graph-Based Ensemble Clustering","authors":"Xuan Zheng;Yihang Lu;Rong Wang;Feiping Nie;Xuelong Li","doi":"10.1109/TKDE.2025.3546502","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3546502","url":null,"abstract":"Ensemble clustering can utilize the complementary information among multiple base clusterings, and obtain a clustering model with better performance and more robustness. Despite its great success, there are still two problems in the current ensemble clustering methods. First, most ensemble clustering methods often treat all base clusterings equally. Second, the final ensemble clustering result often relies on <inline-formula><tex-math>$k$</tex-math></inline-formula>-means or other discretization procedures to uncover the clustering indicators, thus obtaining unsatisfactory results. To address these issues, we proposed a novel ensemble clustering method based on structured graph learning, which can directly extract clustering indicators from the obtained similarity matrix. Moreover, our methods take sufficient consideration of correlation among the base clusterings and can effectively reduce the redundancy among them. Extensive experiments on artificial and real-world datasets demonstrate the efficiency and effectiveness of our methods.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3728-3738"},"PeriodicalIF":8.9,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LIOF: Make the Learned Index Learn Faster With Higher Accuracy","authors":"Tao Ji;Kai Zhong;Luming Sun;Yiyan Li;Cuiping Li;Hong Chen","doi":"10.1109/TKDE.2025.3548298","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3548298","url":null,"abstract":"Learned indexes, emerging as a promising alternative to traditional indexes like B+Tree, utilize machine learning models to enhance query performance and reduce memory usage. However, the widespread adoption of learned indexes is limited by their expensive training cost and the need for high accuracy of internal models. Although some studies attempt to optimize the building process of these learned indexes, existing methods are restrictive in scope and applicability. They are usually tailored to specific index types and heavily rely on pre-trained model knowledge, making deployment a challenging task. In this work, we introduce the Learned Index Optimization Framework (LIOF), a general and easily integrated solution aimed at expediting the training process and improving the accuracy of index model for one-dimensional and multi-dimensional learned indexes. The optimization of LIOF for the learned indexes is intuitive, directly providing optimized parameters for index models based on the distribution of node data. By leveraging the correlation between key distribution and node model parameters, LIOF significantly reduces the training epochs required for each node model. Initially, we introduce an optimization strategy inspired by optimization-based meta-learning to train the LIOF to generate optimized initial parameters for index node models. Subsequently, we present a data-driven encoder and a parameter-centric decoder network, which adaptively translate key distribution into a latent variable representation and decode it into optimized node model initialization. 
Additionally, to further utilize characteristics of key distribution, we propose a monotonic regularizer and focal loss, guiding LIOF training towards efficiency and precision. Through extensive experimentation on real-world and synthetic datasets, we demonstrate that LIOF provides substantial enhancements in both training efficiency and the predictive accuracy for learned indexes.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3499-3513"},"PeriodicalIF":8.9,"publicationDate":"2025-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Valid Coverage Oriented Item Perspective Recommendation","authors":"Ruijia Ma;Yahong Lian;Rongbo Qi;Chunyao Song;Tingjian Ge","doi":"10.1109/TKDE.2025.3547968","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3547968","url":null,"abstract":"Today, mainstream recommendation systems have achieved remarkable success in recommending items that align with user interests. However, limited attention has been paid to the perspective of item providers. Content providers often desire that all their offerings, including unpopular or cold items, are <italic>displayed and appreciated by users</i>. To tackle the challenges of <italic>unfair exhibition and limited item acceptance coverage</i>, we introduce a novel recommendation perspective that enables items to “select” their most relevant users. We further introduce ItemRec, a straightforward plug-and-play approach that leverages mutual scores calculated by any model. The goal is to maximize the recommendation and acceptance of items by users. Through extensive experiments on three real-world datasets, we demonstrate that ItemRec can enhance valid coverage by up to 38.5% while maintaining comparable or superior recommendation quality. This improvement comes with only a minor increase in model inference time, ranging from 1.5% to 5%. Furthermore, when compared to thirteen state-of-the-art recommendation methods across accuracy, fairness, and diversity, ItemRec exhibits significant advantages as well. 
Specifically, ItemRec achieves an optimal balance between precision and valid coverage, showcasing an efficiency gain ranging from 1.8 to 45 times compared to other fairness-oriented methodologies.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3810-3823"},"PeriodicalIF":8.9,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143902678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Flexible Diffusion Convolution for Graph Neural Networks","authors":"Songwei Zhao;Bo Yu;Kang Yang;Sinuo Zhang;Jifeng Hu;Yuan Jiang;Philip S. Yu;Hechang Chen","doi":"10.1109/TKDE.2025.3547817","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3547817","url":null,"abstract":"Graph Neural Networks (GNNs) have been gaining more attention due to their excellent performance in modeling various graph-structured data. However, most of the current GNNs only consider fixed-neighbor discrete message-passing, disregarding the importance of the local structure of different nodes and the implicit information between nodes for smoothing features. Previous approaches either focus on adaptive selection for aggregation structures or treat discrete graph convolution as a continuous diffusion process, but none of them comprehensively considered the above issues, significantly limiting the model's performance. To this end, we present a novel approach called Flexible Diffusion Convolution (Flexi-DC), which exploits the neighborhood information of nodes to set a particular continuous diffusion for each node to smooth features. Specifically, Flexi-DC first extracts the local structure knowledge based on the degrees of nodes in the graph data and then injects it into the diffusion convolution module to smooth features. Additionally, we utilize the extracted knowledge to smooth labels. Flexi-DC is an efficient framework that can significantly improve the performance of most GNN architectures. 
Experimental results demonstrate that Flexi-DC outperforms their vanilla implementations by an average accuracy of 13.24% (GCN), 16.37% (JKNet), and 11.98% (ARMA) on nine graph datasets with different homophily ratios.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3118-3131"},"PeriodicalIF":8.9,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}