Dezhi Sun , Jiwei Qin , Zihao Zhang , Xizhong Qin , Huiguo Zhang
{"title":"MRLCD-A: Lag-aware alignment for multivariate time series forecasting in multiple scenarios","authors":"Dezhi Sun , Jiwei Qin , Zihao Zhang , Xizhong Qin , Huiguo Zhang","doi":"10.1016/j.ipm.2025.104191","DOIUrl":"10.1016/j.ipm.2025.104191","url":null,"abstract":"<div><div>In multivariate time series forecasting tasks, the varying degrees of lag relationships among multivariate data significantly increase the complexity of accurate predictions. A model must effectively capture long-term dependencies and address intricate lag correlations to achieve reliable long-term forecasting. This paper proposes a novel Multivariate Rolling Lag Correlation Detection-Alignment (MRLCD-A) method to tackle these challenges. The method identifies rolling correlations, calculates lag distances in multivariate sequence inputs, and aligns the lagged variables accordingly. Multivariate Time Series (MTS) forecasting uses a Channel Dependency (CD) approach. Experiments on time series datasets across various scenarios, including electricity, weather, exchange rates, and atmospheric carbon concentrations, demonstrate that the proposed method outperforms state-of-the-art models in forecasting general multivariate time series and predicting long-term time series data in real-world environments.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104191"},"PeriodicalIF":7.4,"publicationDate":"2025-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143877463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Samuel Ojuri , The Anh Han , Raymond Chiong , Alessandro Di Stefano
{"title":"Optimizing text-to-SQL conversion techniques through the integration of intelligent agents and large language models","authors":"Samuel Ojuri , The Anh Han , Raymond Chiong , Alessandro Di Stefano","doi":"10.1016/j.ipm.2025.104136","DOIUrl":"10.1016/j.ipm.2025.104136","url":null,"abstract":"<div><div>In many organizations, retrieving valuable information from complex databases has traditionally required specialized technical skills, often leaving non-technical professionals dependent on others for timely insights. This study presents an approach that allows anyone, even without knowledge of query languages, to directly interact with databases by asking questions in everyday language. We achieve this by combining advanced generative language models, such as a high-capacity Generative Pre-trained Transformer (GPT) model, with intelligent software agents that translate natural language queries into precise SQL statements. Our evaluation compares different strategies, including models specifically trained on a particular database domain versus those guided by only a handful of examples. The results show that training a model with tailored examples yields more accurate and reliable database queries than relying solely on minimal guidance for the given use case. This work highlights the practical value of refining model complexity and balancing computational costs to empower business users with easy, direct access to data. 
By reducing reliance on technical teams, organizations can enable faster, more informed decision-making and foster a more inclusive environment where everyone can uncover data-driven insights on their own.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104136"},"PeriodicalIF":7.4,"publicationDate":"2025-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143877462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yanchao Liu, Qing Song, Wenchao Song, Pengzhou Zhang, Chi Zhang
{"title":"GTHNN: Graph Transformer and Sequential Hypergraph Neural Network with dynamic aggregation mechanism for multi-task prediction","authors":"Yanchao Liu, Qing Song, Wenchao Song, Pengzhou Zhang, Chi Zhang","doi":"10.1016/j.ipm.2025.104180","DOIUrl":"10.1016/j.ipm.2025.104180","url":null,"abstract":"<div><div>Information cascade multitask prediction, which encompasses information diffusion prediction and information popularity prediction, is crucial for understanding how information items spread on social networks and has a wide range of real-world applications. Existing works primarily focus on either information diffusion prediction or information popularity prediction using sequential or graph-structured models, which can cause over-smoothing and over-squashing when aggregating user preference features, resulting in suboptimal performance. In this paper, we propose a novel <u><strong>G</strong></u>raph <u><strong>T</strong></u>ransformer and Sequential <u><strong>H</strong></u>ypergraph <u><strong>N</strong></u>eural <u><strong>N</strong></u>etwork with dynamic aggregation mechanism framework (<u><strong>GTHNN</strong></u>), which is specifically tailored for multitask prediction. Specifically, to mitigate over-squashing of user dynamic features, we construct a sequential hypergraph neural network with the dynamic aggregation mechanism to directly aggregate user dynamic preferences across global periods. To reduce over-smoothing of user static features, the graph transformer architecture is designed to explore the potential high-level social homogeneity among users. To improve the expressive ability of user features, we further build a structure-enhanced self-attention mechanism with an exponential decay factor to uncover the complex dependencies among users. Finally, the prediction layer is applied to simultaneously predict the next infected user and information popularity. 
Extensive experiments on four real-world datasets demonstrate that our model outperforms advanced methods. The study is beneficial for gaining a better understanding of multitask prediction and revealing the potential of the graph transformer architecture.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104180"},"PeriodicalIF":7.4,"publicationDate":"2025-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143877464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"False information sources detecting based on an epidemic diffusion model","authors":"Wei Liu , Yu Shao , Yixin Chen , Ling Chen","doi":"10.1016/j.ipm.2025.104197","DOIUrl":"10.1016/j.ipm.2025.104197","url":null,"abstract":"<div><div>Epidemic-based diffusion models are commonly used to locate false information sources. However, most existing models overlook the topological structure of social networks, assuming that each infected person has an equal opportunity to contact all healthy individuals. Additionally, most current source locating algorithms only consider information from infected nodes, neglecting uninfected ones. As a result, these methods often produce unsatisfactory multi-source detection results. To address these shortcomings, we propose a new epidemic diffusion model, Networked-SNIR, which incorporates topological information to more accurately describe influence propagation. We analyze the properties of information propagation under the Networked-SNIR model. To reduce computational time for extensive information propagation simulations, we present an efficient algorithm to estimate the likelihood of nodes being in different states and their infection times. We also propose a likelihood maximization-based algorithm to detect multiple sources of false information. Experimental results on real-world data show that the proposed Networked-SNIR model more accurately reflects the spread of infectious diseases in social networks compared to other models. 
Furthermore, experiments on seven real-world and two synthetic datasets demonstrate that, compared to baseline algorithms, the sources detected by our algorithm not only influence more observed nodes but also do so at more precise times.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104197"},"PeriodicalIF":7.4,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zihao Li , Jiaxin Yang , Xianghan Wang, Jun Lei, Shuohao Li, Jun Zhang
{"title":"Uncertainty-aware disentangled representation learning for multimodal fake news detection","authors":"Zihao Li , Jiaxin Yang , Xianghan Wang, Jun Lei, Shuohao Li, Jun Zhang","doi":"10.1016/j.ipm.2025.104190","DOIUrl":"10.1016/j.ipm.2025.104190","url":null,"abstract":"<div><div>The proliferation of fake news on social media platforms poses serious societal risks, such as eroding public trust, inciting panic, and influencing policy decisions. While automated multimodal fake news detection has emerged as a promising approach, existing methods face three critical limitations: (1) they often fail to capture uncertainty within multimodal data, (2) they struggle with modality heterogeneity, and (3) they lack a balanced focus on both modality-private veracity and cross-modal inconsistencies. In this work, we propose a novel <strong>U</strong>ncertainty-aware <strong>D</strong>isentangled <strong>R</strong>epresentation <strong>L</strong>earning (UDRL) framework that addresses these limitations in three key ways. First, our probabilistic representation module models multimodal information as Gaussian distributions, effectively capturing uncertainty and ambiguity. Second, we introduce a disentangled representation learning framework that separates shared and private modality information, enhancing robustness and discrimination. Finally, our uncertainty-aware fusion module dynamically adjusts modality importance based on uncertainty, facilitating more accurate cross-modal interactions. 
Experimental results on three benchmark datasets demonstrate that UDRL achieves competitive and consistent performance across datasets, validating its effectiveness in multimodal fake news detection.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104190"},"PeriodicalIF":7.4,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhiyuan Wen , Rui Wang , Qianlong Wang , Lin Gui , Yunfei Long , Shiwei Chen , Bin Liang , Min Yang , Ruifeng Xu
{"title":"FGVIrony: A Chinese Dataset of Fine-grained Verbal Irony","authors":"Zhiyuan Wen , Rui Wang , Qianlong Wang , Lin Gui , Yunfei Long , Shiwei Chen , Bin Liang , Min Yang , Ruifeng Xu","doi":"10.1016/j.ipm.2025.104169","DOIUrl":"10.1016/j.ipm.2025.104169","url":null,"abstract":"<div><div>Verbal irony, identified as an incongruity between a speaker’s intended meaning and their explicit linguistic expression, often manifests in nuanced forms such as irony, sarcasm, and satire. Current research often fails to differentiate among these fine-grained categories of verbal irony, primarily focusing on generic detection in texts. Therefore, in this work, we introduce a new task for fine-grained verbal irony recognition, which aims not only to identify the presence of verbal irony but also to distinguish among its various types. In addition, a notable gap in existing research is the lack of datasets tailored to fine-grained verbal irony, particularly in the context of the Chinese language. To tackle this issue, we have developed the <em>FGVIrony</em> dataset, which comprises 10,252 samples, including 6,790 non-ironic and 3,462 verbally ironic instances, the latter further classified into 1,796 instances of irony, 362 of sarcasm, 577 of satire, 192 overstatements, 79 understatements, and 456 rhetorical questions. On the <em>FGVIrony</em> dataset, we explore the challenges of accurately identifying fine-grained verbal irony. Additionally, to investigate the limitations inherent in current methodologies, we propose a cascaded multi-prompt learning approach, <em>CMP</em>, designed to enhance recognition accuracy. 
The <em>FGVIrony</em> dataset is available at .</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104169"},"PeriodicalIF":7.4,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaobo Li , Xiaodi Hou , Fanjun Meng , Xiaokun Zhang , Mingyu Lu , Hongfei Lin , Yijia Zhang
{"title":"Knowledge enhanced representation learning network for drug recommendation","authors":"Xiaobo Li , Xiaodi Hou , Fanjun Meng , Xiaokun Zhang , Mingyu Lu , Hongfei Lin , Yijia Zhang","doi":"10.1016/j.ipm.2025.104164","DOIUrl":"10.1016/j.ipm.2025.104164","url":null,"abstract":"<div><div>Drug recommendation systems, which aim to deliver personalized and efficacious drug combinations tailored to patients’ clinical records, have attracted considerable attention in healthcare. Through extensive investigation, we identify two key issues with existing methods: (1) imbalanced class distribution, where common diseases occur more frequently than rare ones, resulting in biased and insufficient patient representations; and (2) inadequate modelling of historical medications, where information in historical drugs that is valuable for current medical treatment is often overlooked. In this paper, we propose a Knowledge Enhanced Representation Learning (KERL) network for drug recommendation. To address the first issue, we introduce external medical knowledge, using disease entities of different granularities to enhance patient representation. Meanwhile, we construct multiple medical knowledge graphs based on extracted entities and design a graph knowledge enhancement mechanism to integrate global clinical information, alleviating the imbalanced distribution of medical entities. To address the second issue, we design a dual-path drug representation network to model longitudinal historical information from both visit-level and drug-level perspectives. Extensive experiments on two real-world datasets, MIMIC-III and MIMIC-IV, demonstrate the effectiveness of the proposed KERL in the drug recommendation task. 
Specifically, our KERL achieves improvements of 2.15%, 2.09%, 1.92% and 2.09%, 2.74%, 1.96% over current state-of-the-art methods in terms of F1-score, PRAUC, and Jaccard, respectively.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104164"},"PeriodicalIF":7.4,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143844893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuai Zhao , Heyan Huang , Xinge Li , Xiaokang Chen , Rui Wang
{"title":"SST: Self-training with self-adaptive thresholding for semi-supervised learning","authors":"Shuai Zhao , Heyan Huang , Xinge Li , Xiaokang Chen , Rui Wang","doi":"10.1016/j.ipm.2025.104158","DOIUrl":"10.1016/j.ipm.2025.104158","url":null,"abstract":"<div><div>Neural networks have demonstrated exceptional performance in supervised learning, benefiting from abundant high-quality annotated data. However, obtaining such data in real-world scenarios is costly and labor-intensive. Semi-supervised learning (SSL) offers a solution to this problem by utilizing a small amount of labeled data along with a large volume of unlabeled data. Recent studies, such as Semi-ViT and Noisy Student, which employ consistency regularization or pseudo-labeling, have demonstrated significant achievements. However, they still face challenges, particularly in accurately selecting sufficient high-quality pseudo-labels due to their reliance on fixed thresholds. Recent methods such as FlexMatch and FreeMatch have introduced flexible or self-adaptive thresholding techniques, greatly advancing SSL research. Nonetheless, their process of updating thresholds at each iteration is deemed time-consuming, computationally intensive, and potentially unnecessary. To address these issues, we propose Self-training with Self-adaptive Thresholding (SST), a novel, effective, and efficient SSL framework. SST integrates with both supervised (Super-SST) and semi-supervised (Semi-SST) learning. SST introduces an innovative Self-Adaptive Thresholding (SAT) mechanism that adaptively adjusts class-specific thresholds based on the model’s learning progress. SAT ensures the selection of high-quality pseudo-labeled data, mitigating the risks of inaccurate pseudo-labels and confirmation bias (where models reinforce their own mistakes during training). Specifically, SAT prevents the model from prematurely incorporating low-confidence pseudo-labels, reducing error reinforcement and enhancing model performance. 
Extensive experiments demonstrate that SST achieves state-of-the-art performance with remarkable efficiency, generalization, and scalability across various architectures and datasets. Notably, Semi-SST-ViT-Huge achieves the best results on competitive ImageNet-1K SSL benchmarks (no external data), with 80.7%/84.9% Top-1 accuracy using only 1%/10% labeled data. Compared to the fully-supervised DeiT-III-ViT-Huge, which achieves 84.8% Top-1 accuracy using 100% labeled data, our method demonstrates superior performance using only 10% labeled data. This indicates a tenfold reduction in human annotation costs, significantly narrowing the performance disparity between semi-supervised and fully-supervised methods. These advancements pave the way for further innovations in SSL and practical applications where obtaining labeled data is either challenging or costly.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104158"},"PeriodicalIF":7.4,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143844981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Preference-driven Conjugate Denoising Method for sequential recommendation with side information","authors":"Xiaofei Zhu , Minqin Li , Zhou Yang","doi":"10.1016/j.ipm.2025.104174","DOIUrl":"10.1016/j.ipm.2025.104174","url":null,"abstract":"<div><div>Sequential recommendation with side information aims to predict users’ preferred items based on user behavior sequences. Previous methods utilize attention mechanisms to capture user preferences from behavior sequences but often neglect individual behavioral variations, which introduce varying levels of frequency and random noise, thus compromising preference identification and integration. To address this issue, we propose a Preference-driven Conjugate Denoising Method (PCDM) for sequential recommendation with side information. The method employs a conjugate denoising transformer, consisting of a Fourier denoising module for frequency noise elimination and a variational inference module for random noise reduction, followed by a conjugate transformer that learns the user preference representations. Subsequently, it utilizes a preference-driven denoised fusion module to integrate the learned representations, aligning them with true user preferences while minimizing mixed noise interference. 
Experiments on four datasets, including Amazon Beauty, Sports, Toys, and Yelp, report average gains of 8.39% in Recall@10, 9.16% in Recall@20, 6.14% in NDCG@10, and 6.94% in NDCG@20 compared to the latest models.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104174"},"PeriodicalIF":7.4,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143839181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Junfeng Zhou , Zuyong Wang , Yuting Tan , Ming Du , Ziyang Chen , Xian Tang
{"title":"Efficient processing of k-hop reachability queries on temporal bipartite graphs","authors":"Junfeng Zhou , Zuyong Wang , Yuting Tan , Ming Du , Ziyang Chen , Xian Tang","doi":"10.1016/j.ipm.2025.104178","DOIUrl":"10.1016/j.ipm.2025.104178","url":null,"abstract":"<div><div>Given a temporal bipartite graph, the <span><math><mi>k</mi></math></span>-hop reachability query is used to determine whether there exists a path between two vertices in the graph that satisfies both time and length constraints. <span><math><mi>k</mi></math></span>-hop reachability queries on temporal bipartite graphs can facilitate data analysis in various scenarios, such as epidemic prevention and control and information dissemination. For processing <span><math><mi>k</mi></math></span>-hop reachability queries on temporal bipartite graphs, existing methods suffer from two problems: (1) the false-negative problem, which means that for some reachable queries, existing approaches return unreachable results; (2) lack of support for the length constraint. To tackle these problems, we first analyze the root causes of the false-negative problem and propose a traversal-based strategy to avoid it. To improve efficiency, we propose a graph-transformation-based approach to reduce the cost of graph traversal. We then construct a compact index on the transformed graph that covers both the time and length constraints of all vertex pairs, thereby avoiding expensive graph traversal. We further propose efficient algorithms to update the index when the temporal bipartite graph changes. Finally, we conduct extensive experiments on real-world datasets. 
The experimental results show that our methods completely avoid the false-negative problem, and that our index-based method answers queries more than three orders of magnitude faster than the online approach.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104178"},"PeriodicalIF":7.4,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}