Chengcheng Xu , Tianfeng Wang , Man Chen , Jun Chen , Wei Li , Zhisong Pan
{"title":"GRAIL: Graph contrastive learning with balanced negative sampling","authors":"Chengcheng Xu , Tianfeng Wang , Man Chen , Jun Chen , Wei Li , Zhisong Pan","doi":"10.1016/j.ipm.2025.104211","DOIUrl":"10.1016/j.ipm.2025.104211","url":null,"abstract":"<div><div>Currently, some graph contrastive learning methods mitigate the class imbalance by balancing the number of anchors, overlooking the crucial role of negative samples in forming a regular simplex. Moreover, existing strategies select a limited number of positive samples with poor quality, causing the model to erroneously push away nodes with similar semantics. To address these issues, we propose a <strong>g</strong>raph cont<strong>r</strong>astive learning method with b<strong>a</strong>lanced negat<strong>i</strong>ve samp<strong>l</strong>ing, named GRAIL. Specifically, GRAIL introduces a multi-head similarity metric that leverages mixed probability distributions related to dimensional elements to adaptively select an equal number of hard negative samples within each non-anchor cluster. As a result, GRAIL not only promotes the formation of a regular simplex by balancing the gradient contributions of different negative classes but also selects the most informative hard negative samples to improve the distinguishing ability of minority classes while minimizing the impact on majority classes. Furthermore, GRAIL selects multiple positive samples with a high correct ratio using structural similarity and feature similarity, thereby enabling the model to learn trustworthy node representations. Since traditional contrastive loss focuses on the majority class while neglecting the minority class, a balanced contrastive loss is introduced to optimize node representations. Experiments on node classification, node clustering, and link prediction tasks across six imbalanced graph datasets demonstrate that GRAIL outperforms existing state-of-the-art methods. 
The source code is available at https://github.com/xushucheng-coder/GRAIL/tree/master.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104211"},"PeriodicalIF":7.4,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xunlian Wu, Jingqi Hu, Yining Quan, Qiguang Miao, Peng Gang Sun
{"title":"Motif-based Contrastive Graph Clustering with clustering-oriented prompt","authors":"Xunlian Wu, Jingqi Hu, Yining Quan, Qiguang Miao, Peng Gang Sun","doi":"10.1016/j.ipm.2025.104208","DOIUrl":"10.1016/j.ipm.2025.104208","url":null,"abstract":"<div><div>Graph contrastive learning has shown significant promise in graph clustering, yet prevalent approaches face two limitations: (1) most existing methods primarily capture lower-order adjacency structures, overlooking high-order motifs that are essential building blocks of the network; (2) most of them do not address false-negative pairs and lack cluster-oriented guidance, potentially embedding irrelevant information in the node representations. To overcome these issues, we introduce a novel Motif-based Contrastive Graph Clustering approach with Clustering-Oriented Prompt (MCGC). Firstly, MCGC employs a specialized Siamese encoder network to obtain both lower-order and higher-order node embeddings. The encoder processes two views of the graph: one based on lower-order adjacency and the other on higher-order motif structures, where higher-order motifs (such as triangles) are extracted using motif adjacency matrices. Then, structural contrastive learning is used to ensure cross-view structural consistency. Furthermore, node-level contrastive learning is designed to enhance the discriminative capability of node embeddings, while interactions between samples and centroids provide clustering-oriented prompts. Finally, a parameter-shared MLP aligns embeddings in a unified clustering space, refined by cluster-level contrastive learning. These contrastive learning strategies ensure better-defined cluster boundaries and improve the quality of node representations. The approach is versatile and can be applied in recommendation systems, where clustering similar users enhances personalized recommendations, and in anomaly detection, where it helps identify unusual patterns or outliers in transaction or social networks. 
Experimental results on six datasets demonstrate that MCGC outperforms state-of-the-art algorithms. For example, on the EAT dataset, MCGC achieves 58.68% in ACC, surpassing the runner-up (CCGC) by 4.71%, demonstrating the effectiveness of motif-based contrastive learning in improving clustering quality. The source code is available at: https://github.com/CSLab208/MCGC-Motif-based.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104208"},"PeriodicalIF":7.4,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xin Qi, Yujun Wen, Junpeng Gong, Pengzhou Zhang, Yao Zheng
{"title":"Multimodal disentanglement implicit distillation for speech emotion recognition","authors":"Xin Qi, Yujun Wen, Junpeng Gong, Pengzhou Zhang, Yao Zheng","doi":"10.1016/j.ipm.2025.104213","DOIUrl":"10.1016/j.ipm.2025.104213","url":null,"abstract":"<div><div>Audio signals are generally utilized with textual data for speech emotion recognition. Nevertheless, cross-modal interactions suffer from distribution discrepancy and information redundancy, leading to an inaccurate multimodal representation. Hence, this paper proposes a multimodal disentanglement implicit distillation model (MDID) that excavates and exploits each modality’s sentiment and specific characteristics. Specifically, the pre-trained models extract high-level acoustic and textual features and align them via an attention mechanism. Then, each modality is disentangled into modality-sentiment and modality-specific features. Subsequently, feature-level and logit-level distillation distill the purified modality-specific feature into the modality-sentiment feature. Compared to the adaptive fusion feature, solely employing the refined modality-sentiment feature yields superior performance for emotion recognition. Comprehensive experiments on the IEMOCAP and RAVDESS datasets indicate that MDID outperforms state-of-the-art approaches.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104213"},"PeriodicalIF":7.4,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144124721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wei Zhang , Lingling Song , Jianfang Liu , Peihua Luo , Zhixin Li , Zhongwei Gong
{"title":"A novel framework for deep knowledge tracing via a dual-state joint interaction mechanism","authors":"Wei Zhang , Lingling Song , Jianfang Liu , Peihua Luo , Zhixin Li , Zhongwei Gong","doi":"10.1016/j.ipm.2025.104210","DOIUrl":"10.1016/j.ipm.2025.104210","url":null,"abstract":"<div><div>Although deep learning-based knowledge tracing (DLKT) models have shown promising results, they typically attribute student performance solely to knowledge states, neglecting the influence of students’ test-taking psychological states. Moreover, the complex interactions between knowledge states and test-taking psychological states remain underexplored, limiting the potential for further advances in these models. To address this, we propose a novel framework, termed the <strong>D</strong>ual-state <strong>J</strong>oint <strong>I</strong>nteraction <strong>M</strong>echanism for deep <strong>K</strong>nowledge <strong>T</strong>racing (DJIM-KT), which models the interactions between students’ knowledge states and test-taking psychological states, with the aim of further enhancing the performance of existing DLKT models. In DJIM-KT, DLKT models are first employed to model students’ knowledge states by extracting interaction information between students and exercises. Simultaneously, guided by behaviorist theory, students’ test-taking psychological states are modeled by capturing higher-order relations between exercises and their answering behaviors. Subsequently, we design the dual-state joint interaction mechanism (DJIM), which precisely quantifies the interactions between knowledge states and test-taking psychological states, and leverages reinforcement learning to analyze students’ real-time feedback in different exercises, thereby dynamically adjusting the prediction weights of the two states. This adaptive DJIM enables DJIM-KT to effectively capture individualized student information. 
Extensive experiments on three real-world datasets demonstrate that DJIM-KT significantly enhances the prediction accuracy and explainability of DLKT models. Specifically, the two representative DLKT models, deep knowledge tracing (DKT) and separated self-attentive neural knowledge tracing (SAINT), achieve average improvements of 17.46% in AUC and 10.37% in ACC with the help of DJIM-KT.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104210"},"PeriodicalIF":7.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144116484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hanbing Zhang , Yinan Jing , Fei Zhang , Zhixin Li , X. Sean Wang , Zhenqiang Chen , Cheng Lv
{"title":"TabTransGAN: A hybrid approach integrating GAN and transformer architectures for tabular data synthesis","authors":"Hanbing Zhang , Yinan Jing , Fei Zhang , Zhixin Li , X. Sean Wang , Zhenqiang Chen , Cheng Lv","doi":"10.1016/j.ipm.2025.104220","DOIUrl":"10.1016/j.ipm.2025.104220","url":null,"abstract":"<div><div>While generative adversarial networks (GANs) have made significant advancements in the fields of image and text generation, their application to tabular data synthesis faces distinct challenges since they fail to effectively capture tabular data semantics, which leads to suboptimal performance. To address this challenge, we propose <em>TabTransGAN</em>, a novel architecture that combines the power of Transformer models and GANs to recognize the semantic integrity and attribute information of tabular data more accurately. TabTransGAN also introduces position encoding for each column to improve dimension recognition and facilitate correlation capture. Experimental results on five real-world datasets show that TabTransGAN outperforms existing methods in various aspects such as synthesis quality, machine learning performance, and privacy preservation.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104220"},"PeriodicalIF":7.4,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144108010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on biomedical automatic text summarization with large language models","authors":"Zhenyu Huang , Xianlai Chen , Yunbo Wang , Jincai Huang , Xing Zhao","doi":"10.1016/j.ipm.2025.104216","DOIUrl":"10.1016/j.ipm.2025.104216","url":null,"abstract":"<div><div>Automatic text summarization in the biomedical field can support efficient literature screening, medical knowledge management, and innovative medical research. In recent years, Large Language Models (LLMs), as a disruptive technology in natural language processing, have shown great potential for Biomedical Automatic Text Summarization (BATS). This technology helps to better understand the terminology of biomedical texts, track medical hotspots, and generate personalized diagnoses and treatment plans. This paper provides an in-depth discussion on the development of BATS, and the opportunities as well as challenges brought by applying LLMs to biomedical automatic text summarization. Firstly, the development of BATS is reviewed, where traditional text summarization, neural network-based summarization, and LLMs-based summarization are analyzed systematically. Meanwhile, the applications of various LLMs (e.g., BERT and GPT series) in three types of BATS are presented in detail, including extractive summarization, abstractive summarization, and hybrid summarization. Next, the relevant datasets are introduced, such as PubMed, COVID-19 and MIMIC-Ⅲ. Then, traditional, emerging, and auxiliary metrics for evaluating the performance of BATS are shown, and the performance evaluation of different models is elaborated. 
Finally, the opportunities brought by applying LLMs to BATS are described, and the potential challenges along with the corresponding solutions are discussed.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104216"},"PeriodicalIF":7.4,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144090723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qing Li , Zhijun Huang , Jianwen Sun , Xin Yuan , Shengyingjie Liu , Zhonghua Yan
{"title":"HKT: Hierarchical structure-based knowledge tracing","authors":"Qing Li , Zhijun Huang , Jianwen Sun , Xin Yuan , Shengyingjie Liu , Zhonghua Yan","doi":"10.1016/j.ipm.2025.104206","DOIUrl":"10.1016/j.ipm.2025.104206","url":null,"abstract":"<div><div>Knowledge tracing (KT) is a fundamental task in Intelligent Tutoring Systems, aiming to predict learners’ performance on specific questions and trace their evolving knowledge state. With the advancement of deep learning in this field, various methods have been applied to model the relations between knowledge. However, most existing knowledge tracing methods focus on modeling knowledge at a single level, neglecting the inherent hierarchical structure of knowledge, which limits their ability to capture complex relations. In this paper, we propose a novel hierarchical knowledge tracing model (HKT), which integrates influences of multiple knowledge levels to predict learners’ performance. Specifically, we construct different types of hierarchical graphs to capture both intra-hierarchy dependencies and cross-hierarchy relations. To effectively combine information from multiple levels, we design weight allocation networks that dynamically assign weights to different knowledge levels, thereby synthesizing their effects for accurate performance prediction. 
Experimental results demonstrate that HKT outperforms baseline methods on multiple benchmark datasets, validating the effectiveness of integrating knowledge across all levels compared to single-level models.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104206"},"PeriodicalIF":7.4,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shu Zhou , Xin Wang , Jingwen Qiu , Xiaomin Li , Bin Shi , Hao Wang
{"title":"LOSDF: A logical optimization and semantic decoupling framework for question answering in multi-party conversations","authors":"Shu Zhou , Xin Wang , Jingwen Qiu , Xiaomin Li , Bin Shi , Hao Wang","doi":"10.1016/j.ipm.2025.104200","DOIUrl":"10.1016/j.ipm.2025.104200","url":null,"abstract":"<div><div>Multi-party conversations (MPC) bring unprecedented challenges due to the complex scenarios involving multiple speakers and crisscrossed utterance relationships. Existing models for MPC face several key challenges: firstly, they often ignore the logical structures in dialogs, compromising intent understanding. Secondly, these models overlook individual differences in speakers’ linguistic styles, potentially leading to inconsistent responses. Lastly, the frequent changes in utterance content and constant shifts in speakers significantly increase the difficulty of extracting key information from background noise. To address these challenges, we designed the <strong>L</strong>ogical <strong>O</strong>ptimization and <strong>S</strong>emantic <strong>D</strong>ecoupling <strong>F</strong>ramework (LOSDF). Our framework utilizes multi-party attention to manage the contextual information of different speakers over time, effectively handling complex information flow. By rewriting speakers’ utterances, we reduce semantic errors and enhance consistency. Additionally, our information decoupling module distinguishes semantic intent from noise, improving logical reasoning. Evaluations on the Molweni, FriendsQA and DailyDialog datasets show our method outperforms existing models, improving F1 scores by 2.1%, 1.2% and 2.3% respectively. 
Extensive ablation studies further validate the effectiveness.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104200"},"PeriodicalIF":7.4,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144070620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weiyue Li , Bowei Chen , Ming Gao , Jingmin An , Hao Dong , Cheng Chen , Weiguo Fan , Zhiguo Zhu
{"title":"Collaborative local–global context modeling for session-based recommendation","authors":"Weiyue Li , Bowei Chen , Ming Gao , Jingmin An , Hao Dong , Cheng Chen , Weiguo Fan , Zhiguo Zhu","doi":"10.1016/j.ipm.2025.104196","DOIUrl":"10.1016/j.ipm.2025.104196","url":null,"abstract":"<div><div>Session-based recommendation systems (SBRSs) predict the next item in a session by analyzing user interactions. While current methods emphasize sequential item relationships, they often overlook temporal information that highlights subtle shifts in user preferences. This gap can limit their ability to adapt to dynamic user behavior, and recent advances have yet to effectively integrate both sequential and non-sequential item transitions, which may lead to biased modeling. To address these limitations, this paper introduces Coase, a novel SBRS model that unifies local and global context modeling to capture fine-grained dynamic user preferences. Coase transforms session sequences into session star graphs, employing a Bi-Gated Graph Self-Attention Network for local context modeling, and introduces SudokuFormer to model time-aware sequential transitions within a global session context through disentangled attention and stable feature fusion. A triple attention mechanism is then utilized to fully integrate local and global contextual features. Comprehensive experiments conducted on four publicly available datasets demonstrate that Coase improves Recall by 1.71%–1.83%, Mean Reciprocal Rank (MRR) by 2.73%–2.80%, and Normalized Discounted Cumulative Gain (NDCG) by 2.32%–2.43% across the top 5, 10, 15, and 20 items. Ablation studies validate the framework and components of Coase, while additional analyses examine the effect of session length, and visualization studies illustrate diverse attention patterns. 
This research contributes a novel approach to SBRS, offering promising advancements in recommendation accuracy and user experience.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104196"},"PeriodicalIF":7.4,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144068279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distinguishing AI-generated versus real tourism photos: Visual differences, human judgment, and deep learning detection","authors":"Lei Hou, Yu Min, Xue Pan, Zaiwu Gong","doi":"10.1016/j.ipm.2025.104218","DOIUrl":"10.1016/j.ipm.2025.104218","url":null,"abstract":"<div><div>The widespread use of AI-generated photos in marketing raises concerns about authenticity, trust, and misinformation. Addressing these challenges requires understanding the differences between AI-generated and human-captured (real) photos and developing effective detection methods. In this study, we compiled a database of AI and real coastal-related tourism photos to analyze their visual differences and the detection accuracy of both humans and deep learning models. While real photos display diverse color schemes and richer textures, AI photos exhibit enhanced brightness and simplified textures. Despite such significant differences in color and texture features, human judgment largely fails to distinguish AI from real photos, resulting in an average accuracy of only 67.7%. To address this limitation, a hybrid deep learning model was developed, combining a CNN module for image processing and a dense module for integrating explicit visual features. While a single CNN module achieved an accuracy of 92.9%, largely outperforming human judgment, the inclusion of explicit features further improves the model’s accuracy to 96.1%, highlighting the importance of multimodal feature integration. 
The study contributes to understanding the implications of generative AI in marketing, underscores the importance of transparency for online content, and advances methodological approaches for detecting AI-generated visuals.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104218"},"PeriodicalIF":7.4,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144068277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}