{"title":"PaperEval: A universal, quantitative, and explainable paper evaluation method powered by a multi-agent system","authors":"Shengzhi Huang , Qicong Wang , Wei Lu , Lingyu Liu , Zhenzhen Xu , Yong Huang","doi":"10.1016/j.ipm.2025.104225","DOIUrl":"10.1016/j.ipm.2025.104225","url":null,"abstract":"<div><div>The immediate and efficient evaluation of scientific papers is crucial for advancing scientific progress. However, traditional peer review faces numerous challenges, including reviewer bias, limited expertise, and an overwhelming volume of publications. Recent advancements in large language models (LLMs) suggest their potential as promising evaluators, capable of approximating human cognition and understanding both ordinary and scientific language. In this study, we propose a novel AI-empowered paper evaluation method, PaperEval (PE), which utilizes a multi-agent system powered by LLMs to design evaluation criteria, assess paper quality along different dimensions, and generate explainable scores. We also introduce two variants of PE, Multi-round PaperEval (MPE) and Self-correcting PaperEval (SPE), which produce comparable scores and iteratively refine the evaluation criteria, respectively. To test our methods, we conducted a comprehensive analysis of three curated datasets, encompassing about 66,000 target papers of varying quality across the fields of mathematics, physics, chemistry, and medicine. The results show that our methods can effectively distinguish between high- and low-quality papers based on scores derived along four dimensions: Question, Method, Result, and Conclusion. Moreover, the results highlight the evaluation’s stability over time, the impact of comparative papers, the advantages of the multi-round evaluation strategy, and the varying correlation between AI ratings and scientific impact across different disciplines. 
Our method can seamlessly integrate into the existing scientific evaluation system, offering valuable insights for the development of AI-driven scientific evaluation.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 6","pages":"Article 104225"},"PeriodicalIF":7.4,"publicationDate":"2025-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144168132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keyframes selection from multiscene videos for stress detection","authors":"Junrui Tian , Zexi Lin , Yi Dai , Yang Ding , Jinlei Liu , Lei Cao , Ling Feng","doi":"10.1016/j.ipm.2025.104215","DOIUrl":"10.1016/j.ipm.2025.104215","url":null,"abstract":"<div><div>In the modern world, stress is a rising global issue that impacts both human physical and mental health. Early stress detection is vital for timely intervention and prevention of health decline. Although video cameras widely deployed in our surroundings offer a contact-free channel for stress detection, the computational cost is far higher than that of methods based on physiological and linguistic signals. To use multiscene videos cost-efficiently, we propose a fine-grained two-stage keyframe selection framework for efficient stress detection. The first, emotion-oriented keyframe selection stage aims to discard the irrelevant and redundant frames per video that result from the high frame rate. The second, stress-oriented keyframe selection stage aims to capture the emotion dynamics that reflect one’s stressful states, achieving strong performance with fewer frames through peer-attended, collaborative deep reinforcement learning. The performance analysis on the developed dataset highlights the benefits of our two-stage multiscene collaborative keyframe selection process for stress detection, achieving an accuracy of 83.61% and an F1-score of 83.48% in three-labeled stress detection and an accuracy of 71.80% and an F1-score of 66.78% in five-labeled stress detection, with a frame selection rate of 0.14% per video. 
Implications and further possible improvements are discussed at the end of the paper.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104215"},"PeriodicalIF":7.4,"publicationDate":"2025-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144147426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diversified perturbation guided by optimal target code for cross-modal adversarial attack","authors":"Wenhui Li , Bo Li , Weizhi Nie , Lanjun Wang , An-An Liu","doi":"10.1016/j.ipm.2025.104214","DOIUrl":"10.1016/j.ipm.2025.104214","url":null,"abstract":"<div><div>Cross-modal retrieval models are vulnerable to adversarial samples; thus, exploring efficient attack methods can help researchers understand the essence of adversarial attacks, evaluate the robustness of models, and promote the development of more reliable models. Although existing adversarial attack methods have achieved promising results, how to further improve the transferability of adversarial examples remains an open question. In this paper, we propose a novel transferable targeted attack method. First, we introduce an optimal target code optimization strategy to obtain representative target codes. Subsequently, when generating adversarial examples, we propose a random perturbation strategy to diversify the potential input patterns of perturbations by introducing randomness, thus automatically enhancing the generalization of samples and overcoming the limitations of existing methods that depend on specific image augmentation techniques. 
Experiments show that this framework can generate highly transferable adversarial samples, for example, when transferring attacks from VGG-F to ResNet50, the proposed method outperforms the SOTA by 10.41% in the I2T task on the MS-COCO dataset; in addition, samples generated by small models can also successfully attack large models, which will help researchers study the existence of adversarial attacks from a new perspective.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104214"},"PeriodicalIF":7.4,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GRAIL: Graph contrastive learning with balanced negative sampling","authors":"Chengcheng Xu , Tianfeng Wang , Man Chen , Jun Chen , Wei Li , Zhisong Pan","doi":"10.1016/j.ipm.2025.104211","DOIUrl":"10.1016/j.ipm.2025.104211","url":null,"abstract":"<div><div>Currently, some graph contrastive learning methods mitigate the class imbalance by balancing the number of anchors, overlooking the crucial role of negative samples in forming a regular simplex. Moreover, existing strategies select a limited number of positive samples with poor quality, causing the model to erroneously push away nodes with similar semantics. To address these issues, we propose a <strong>g</strong>raph cont<strong>r</strong>astive learning method with b<strong>a</strong>lanced negat<strong>i</strong>ve samp<strong>l</strong>ing, named GRAIL. Specifically, GRAIL introduces a multi-head similarity metric that leverages mixed probability distributions related to dimensional elements to adaptively select an equal number of hard negative samples within each non-anchor cluster. As a result, GRAIL not only promotes the formation of a regular simplex by balancing the gradient contributions of different negative classes but also selects the most informative hard negative samples to improve the distinguishing ability of minority classes while minimizing the impact on majority classes. Furthermore, GRAIL selects multiple positive samples with a high correct ratio using structural similarity and feature similarity, thereby enabling the model to learn trustworthy node representations. Since traditional contrastive loss focuses on the majority class while neglecting the minority class, a balanced contrastive loss is introduced to optimize node representations. Experiments on node classification, node clustering, and link prediction tasks across six imbalanced graph datasets demonstrate that GRAIL outperforms existing state-of-the-art methods. 
The source code is available at <span><span>https://github.com/xushucheng-coder/GRAIL/tree/master</span></span>.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104211"},"PeriodicalIF":7.4,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motif-based Contrastive Graph Clustering with clustering-oriented prompt","authors":"Xunlian Wu, Jingqi Hu, Yining Quan, Qiguang Miao, Peng Gang Sun","doi":"10.1016/j.ipm.2025.104208","DOIUrl":"10.1016/j.ipm.2025.104208","url":null,"abstract":"<div><div>Graph contrastive learning has shown significant promise in graph clustering, yet prevalent approaches face two limitations: (1) most existing methods primarily capture lower-order adjacency structures, overlooking high-order motifs that are essential building blocks of the network; (2) most of them do not address false-negative pairs and lack cluster-oriented guidance, potentially embedding irrelevant information in the node representations. To overcome these issues, we introduce a novel Motif-based Contrastive Graph Clustering approach with Clustering-Oriented Prompt (MCGC). Firstly, MCGC employs a specialized Siamese encoder network to obtain both lower-order and higher-order node embeddings. The encoder processes two views of the graph: one based on lower-order adjacency and the other on higher-order motif structures, where higher-order motifs (such as triangles) are extracted using motif adjacency matrices. Then, structural contrastive learning is used to ensure cross-view structural consistency. Furthermore, node-level contrastive learning is designed to enhance the discriminative capability of node embeddings, while interactions between samples and centroids provide clustering-oriented prompts. Finally, a parameter-shared MLP aligns embeddings in a unified clustering space, refined by cluster-level contrastive learning. These contrastive learning strategies ensure better-defined cluster boundaries and improve the quality of node representations. The approach is versatile and can be applied in recommendation systems, where clustering similar users enhances personalized recommendations, and in anomaly detection, where it helps identify unusual patterns or outliers in transaction or social networks. 
Experimental results on six datasets demonstrate that MCGC outperforms state-of-the-art algorithms. For example, on the EAT dataset, MCGC achieves 58.68% in ACC, surpassing the runner-up (CCGC) by 4.71%, demonstrating the effectiveness of motif-based contrastive learning in improving clustering quality. The source code is available at: <span><span>https://github.com/CSLab208/MCGC-Motif-based</span></span>.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104208"},"PeriodicalIF":7.4,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal disentanglement implicit distillation for speech emotion recognition","authors":"Xin Qi, Yujun Wen, Junpeng Gong, Pengzhou Zhang, Yao Zheng","doi":"10.1016/j.ipm.2025.104213","DOIUrl":"10.1016/j.ipm.2025.104213","url":null,"abstract":"<div><div>Audio signals are generally utilized with textual data for speech emotion recognition. Nevertheless, cross-modal interactions suffer from distribution discrepancy and information redundancy, leading to an inaccurate multimodal representation. Hence, this paper proposes a multimodal disentanglement implicit distillation model (MDID) that excavates and exploits each modality’s sentiment and specific characteristics. Specifically, the pre-trained models extract high-level acoustic and textual features and align them via an attention mechanism. Then, each modality is disentangled into modality-sentiment and modality-specific features. Subsequently, feature-level and logit-level distillation distill the purified modality-specific feature into the modality-sentiment feature. Compared to the adaptive fusion feature, solely employing the refined modality-sentiment feature yields superior performance for emotion recognition. Comprehensive experiments on the IEMOCAP and RAVDESS datasets indicate that MDID outperforms state-of-the-art approaches.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104213"},"PeriodicalIF":7.4,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144124721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel framework for deep knowledge tracing via a dual-state joint interaction mechanism","authors":"Wei Zhang , Lingling Song , Jianfang Liu , Peihua Luo , Zhixin Li , Zhongwei Gong","doi":"10.1016/j.ipm.2025.104210","DOIUrl":"10.1016/j.ipm.2025.104210","url":null,"abstract":"<div><div>Although deep learning-based knowledge tracing (DLKT) models have shown promising results, they typically attribute student performance solely to knowledge states, neglecting the influence of students’ test-taking psychological states. Moreover, the complex interactions between knowledge states and test-taking psychological states remain underexplored, limiting the potential for further advances in these models. To address this, we propose a novel framework, termed the <strong>D</strong>ual-state <strong>J</strong>oint <strong>I</strong>nteraction <strong>M</strong>echanism for deep <strong>K</strong>nowledge <strong>T</strong>racing (DJIM-KT), which models the interactions between students’ knowledge states and test-taking psychological states, with the aim of further enhancing the performance of existing DLKT models. In DJIM-KT, DLKT models are first employed to model students’ knowledge states by extracting interaction information between students and exercises. Simultaneously, guided by behaviorist theory, students’ test-taking psychological states are modeled by capturing higher-order relations between exercises and their answering behaviors. Subsequently, we design the dual-state joint interaction mechanism (DJIM), which precisely quantifies the interactions between knowledge states and test-taking psychological states, and leverages reinforcement learning to analyze students’ real-time feedback in different exercises, thereby dynamically adjusting the prediction weights of the two states. This adaptive DJIM enables DJIM-KT to effectively capture individualized student information. 
Extensive experiments on three real-world datasets demonstrate that DJIM-KT significantly enhances the prediction accuracy and explainability of DLKT models. Specifically, the two representative DLKT models, deep knowledge tracing (DKT) and separated self-attentive neural knowledge tracing (SAINT), achieve average improvements of 17.46% in AUC and 10.37% in ACC with the help of DJIM-KT.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104210"},"PeriodicalIF":7.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144116484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TabTransGAN: A hybrid approach integrating GAN and transformer architectures for tabular data synthesis","authors":"Hanbing Zhang , Yinan Jing , Fei Zhang , Zhixin Li , X. Sean Wang , Zhenqiang Chen , Cheng Lv","doi":"10.1016/j.ipm.2025.104220","DOIUrl":"10.1016/j.ipm.2025.104220","url":null,"abstract":"<div><div>While generative adversarial networks (GANs) have made significant advancements in the fields of image and text generation, their application to tabular data synthesis faces distinct challenges since they fail to effectively capture tabular data semantics, which leads to suboptimal performance. To address this challenge, we propose <em>TabTransGAN</em>, a novel architecture that combines the power of Transformer models and GANs to more accurately capture the semantic integrity and attribute information of tabular data. TabTransGAN also introduces position encoding for each column to improve dimension recognition and facilitate correlation capture. Experimental results on five real-world datasets show that TabTransGAN outperforms existing methods in various aspects such as synthesis quality, machine learning performance, and privacy preservation.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104220"},"PeriodicalIF":7.4,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144108010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on biomedical automatic text summarization with large language models","authors":"Zhenyu Huang , Xianlai Chen , Yunbo Wang , Jincai Huang , Xing Zhao","doi":"10.1016/j.ipm.2025.104216","DOIUrl":"10.1016/j.ipm.2025.104216","url":null,"abstract":"<div><div>Automatic text summarization in the biomedical field can support efficient literature screening, medical knowledge management, and innovative medical research. In recent years, Large Language Models (LLMs), as a disruptive technology in natural language processing, have shown great potential for Biomedical Automatic Text Summarization (BATS). This technology helps to better understand the terminology of biomedical texts, track medical hotspots, and generate personalized diagnoses and treatment plans. This paper provides an in-depth discussion on the development of BATS, and the opportunities as well as challenges brought by applying LLMs to biomedical automatic text summarization. Firstly, the development of BATS is reviewed, where traditional text summarization, neural network-based summarization, and LLMs-based summarization are analyzed systematically. Meanwhile, the applications of various LLMs (e.g., BERT and GPT series) in three types of BATS are presented in detail, including extractive summarization, abstractive summarization, and hybrid summarization. Next, the relevant datasets are introduced, such as PubMed, COVID-19 and MIMIC-III. Then, traditional, emerging, and auxiliary metrics for evaluating the performance of BATS are shown, and the performance evaluation of different models is elaborated. 
Finally, the opportunities brought by applying LLMs to BATS are described, and the potential challenges along with the corresponding solutions are discussed.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104216"},"PeriodicalIF":7.4,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144090723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HKT: Hierarchical structure-based knowledge tracing","authors":"Qing Li , Zhijun Huang , Jianwen Sun , Xin Yuan , Shengyingjie Liu , Zhonghua Yan","doi":"10.1016/j.ipm.2025.104206","DOIUrl":"10.1016/j.ipm.2025.104206","url":null,"abstract":"<div><div>Knowledge tracing (KT) is a fundamental task in Intelligent Tutoring Systems, aiming to predict learners’ performance on specific questions and trace their evolving knowledge state. With the advancement of deep learning in this field, various methods have been applied to model the relations between knowledge. However, most existing knowledge tracing methods focus on modeling knowledge at a single level, neglecting the inherent hierarchical structure of knowledge, which limits their ability to capture complex relations. In this paper, we propose a novel hierarchical knowledge tracing model (HKT), which integrates influences of multiple knowledge levels to predict learners’ performance. Specifically, we construct different types of hierarchical graphs to capture both intra-hierarchy dependencies and cross-hierarchy relations. To effectively combine information from multiple levels, we design weight allocation networks that dynamically assign weights to different knowledge levels, thereby synthesizing their effects for accurate performance prediction. 
Experimental results demonstrate that HKT outperforms baseline methods on multiple benchmark datasets, validating the effectiveness of integrating knowledge across all levels compared to single-level models.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 5","pages":"Article 104206"},"PeriodicalIF":7.4,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144084250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}