{"title":"QFAS-KE: Query focused answer summarization using keyword extraction","authors":"Rupali Goyal , Parteek Kumar , V.P. Singh","doi":"10.1016/j.ipm.2025.104104","DOIUrl":"10.1016/j.ipm.2025.104104","url":null,"abstract":"<div><div>Question answering (QA) portals like Quora, Stack Overflow, AskUbuntu, Yahoo! Answers, Reddit, and Wiki Answers have emerged as hubs of curiosity, highlighting the rising demands for easily accessible information and are drawing focus to hundreds of millions of questions. The efficient utilization of these questions and associated answers has become significantly vital for these QA websites. The similarity-based information retrieval methods provide a ranked list of potentially relevant questions, and the users have to spend significant time sifting through the results to discover the best answer. This paper aims to provide a precise, comprehensive, summarized answer to the user asked query using extracted keywords that offer valuable insights into relevant content. The research work presents a Query focused Answer Summarization framework using Keyword Extraction (QFAS-KE). It is a four-stage framework, including query question pre-processing, semantic question search (utilizing SBERT and FAISS vector database), answer retrieval and re-ranking (utilizing BERT-based bi-encoder and cross-encoder), and answer summary generation (using fine-tuned transformers such as BART, PEGASUS, T5) with keyword guidance (using a keyword extractor such as KeyBERT). The results conceptualize the efficacy of the proposed framework on task-specific datasets (CNN/DailyMail and MS-MARCO) over the ROUGE metric. 
The model outperformed existing baseline models on CNN/DailyMail dataset with a value of 47.5 (PEGASUS), 46.2 (BART), and 45.1 (T5) in terms of ROUGE-1 and on MS-MARCO dataset with a value of 75.18 (PEGASUS), 79.02 (BART), and 74.69 (T5) in terms of ROUGE-L.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104104"},"PeriodicalIF":7.4,"publicationDate":"2025-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143487693","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effectively detecting and diagnosing distributed multivariate time series anomalies via Unsupervised Federated Hypernetwork","authors":"Junfeng Hao, Peng Chen, Juan Chen, Xi Li","doi":"10.1016/j.ipm.2025.104107","DOIUrl":"10.1016/j.ipm.2025.104107","url":null,"abstract":"<div><div>Distributed multivariate time series anomaly detection is widely-used in industrial equipment monitoring, financial risk management, and smart cities. Although Federated learning (FL) has garnered significant interest and achieved decent performance in various scenarios, most existing FL-based distributed anomaly detection methods still face challenges including: inadequate detection performance in global model, insufficient essential features extraction caused by the fragmentation of local time series, and lack for practical anomaly localization. To address these challenges, we propose an Unsupervised Federated Hypernetwork Method for Distributed Multivariate Time Series Anomaly Detection and Diagnosis (uFedHy-DisMTSADD). Specifically, we introduce a federated hypernetwork architecture that effectively mitigates the heterogeneity and fluctuations in distributed environments while protecting client data privacy. Then, we adopt the Series Conversion Normalization Transformer (SC Nor-Transformer) to tackle the timing bias due to model aggregation through series conversion. Series normalization improves the temporal dependence of capturing subsequences. Finally, uFedHy-DisMTSADD simultaneously localizes the root cause of the anomaly by reconstructing the anomaly scores obtained from each subsequence. We performed an extensive evaluation on nine datasets, in which uFedHy-DisMTSADD outperformed the existing state-of-the-art baseline average F1 score by 9.19% and the average AUROC by 2.41%. Moreover, the average localization fault accuracy of uFedHy-DisMTSADD is 9.23% higher than that of the optimal baseline method. 
Code is available at this repository: https://github.com/Hjfyoyo/uFedHy-DisMTSADD.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104107"},"PeriodicalIF":7.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mitigating collusive manipulation of reviews in e-commerce platforms: Evolutionary game and strategy simulation","authors":"Xiaoxia Xu, Ruguo Fan, Dongxue Wang, Xiao Xie, Kang Du","doi":"10.1016/j.ipm.2025.104080","DOIUrl":"10.1016/j.ipm.2025.104080","url":null,"abstract":"<div><div>The growing review manipulation has seriously hampered credit regulation on e-commerce platforms, yet few studies have explored its complex dynamics. Unlike current research centering on merchants creating various management strategies, this study examines the collusion between merchants and consumers. By integrating evolutionary game theory and a system dynamics approach, this study offers meaningful conclusions for platform credit management. First, our findings indicate that merchants can maintain honesty regardless of the regulatory strategy implemented. For positive regulation, platforms can impose higher penalties; for negative regulation, maintaining lower exposure is feasible. Second, our analysis illustrates the necessity of breaking the collusion between merchants and consumers. Under positive regulation, platforms can amplify penalties or enhance the regulatory impact on platform revenues. Conversely, negative regulation allows for reducing the short-term financial impact of reviews or adjusting cashback. Third, we uncover that dynamic punishment strategies are not always optimal. 
In some cases, static punishment strategies outperform linear dynamic punishment strategies, highlighting the importance of carefully evaluating the effectiveness of different regulatory approaches in various contexts.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104080"},"PeriodicalIF":7.4,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143474786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expert-level policy style measurement via knowledge distillation with large language model collaboration","authors":"Yujie Zhang , Biao Huang , Weikang Yuan , Zhuoren Jiang , Longsheng Peng , Shuai Chen , Jie-Sheng Tan-Soo","doi":"10.1016/j.ipm.2025.104090","DOIUrl":"10.1016/j.ipm.2025.104090","url":null,"abstract":"<div><div>Policy style is a crucial concept in policy science that reflects persistent patterns in the policy process across different governance settings. Despite its importance, policy style measurement faces issues of complexity, subjectivity, data sparseness, and computational cost. To overcome these obstacles, we propose <strong>KOALA</strong>, a novel <strong><u>K</u></strong>n<strong><u>O</u></strong>wledge distillation framework based on large l<strong><u>A</u></strong>nguage mode<strong><u>L</u></strong> coll<strong><u>A</u></strong>boration. It transforms the weak scoring abilities of LLMs into a pairwise ranking problem, employs a small set of expert-annotated samples for non-parametric learning, and utilizes knowledge distillation to transfer insights from LLMs to a smaller, more efficient model. The framework incorporates multiple LLM-based agents (Prompter, Ranker, and Analyst) collaborating to comprehend complex measurement standards and self-explain policy style definitions. We validate KOALA on 4,572 Chinese government work reports (1954–2019) from central, provincial, and municipal levels, with a focus on the imposition dimension of policy style. Extensive experiments demonstrate KOALA’s effectiveness in measuring the intensity of policy style, highlighting its superiority over state-of-the-art methods. While GPT-4 achieves only 66% accuracy in pairwise ranking of policy styles, KOALA, despite being based on GPT-3.5, achieves a remarkable 85% accuracy, highlighting significant performance improvement. 
This framework offers a transferable approach for quantifying complex social science concepts in textual data, bridging computational techniques with social science research.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104090"},"PeriodicalIF":7.4,"publicationDate":"2025-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143471177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond boundaries: Exploring the interaction between science and technology in fusion knowledge communities","authors":"Jiajie Wang , Wanfang Hou , Yue Li , Jianjun Sun , Lele Kang","doi":"10.1016/j.ipm.2025.104102","DOIUrl":"10.1016/j.ipm.2025.104102","url":null,"abstract":"<div><div>Interaction between science and technology (S&T) is a vital mechanism for generating significant innovative breakthroughs. Prior studies have utilized indicators such as semantic similarity or citation analysis to measure the relationships between scientific communities and technological communities represented by papers and patents. However, shifts in innovation paradigms have progressively blurred the boundaries between S&T, leading to the formation of fusion knowledge communities (FKCs) that encompass both scientific and technological knowledge. Therefore, this study proposes a novel approach to exploring the S&T interaction within FKCs. To achieve this, we integrate semantic and citation information by combining BERT and Graph Auto-Encoder algorithms, and employ the Louvain algorithm for FKCs detection. We then conduct a two-step analysis. First, we quantify the strength of S&T interactions over different periods by defining an interaction intensity metric based on the coupling of keywords, and assess the knowledge depth. Second, we analyze the evolution of S&T interactions by measuring knowledge transfer, transmission direction, and degree, which involves computing knowledge similarity between papers and patents and constructing citation networks to highlight key transfer channels over time. We apply this approach to the field of Genetically Engineered Vaccines (GEV), analyzing 1,937 patents and 4,393 papers from 1980 to 2020. The results demonstrate that our method effectively reveals the fusion knowledge community structures between S&T and provides a detailed analysis of interaction patterns and their evolution within FKCs. 
This study advances the methodology for exploring S&T interactions within FKCs, offering a fine-grained analytical perspective for innovation management research.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104102"},"PeriodicalIF":7.4,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GNN-transformer contrastive learning explores homophily","authors":"Yangding Li , Yangyang Zeng , Xiangchao Zhao , Jiawei Chai , Hao Feng , Shaobin Fu , Cui Ye , Shichao Zhang","doi":"10.1016/j.ipm.2025.104103","DOIUrl":"10.1016/j.ipm.2025.104103","url":null,"abstract":"<div><div>Graph Contrastive Learning (GCL) leverages graph structure and node feature information to learn powerful node representations in a self-supervised manner, attracting significant attention from researchers. Most GCL frameworks typically use Graph Neural Networks (GNNs) as their foundational encoders. Still, GNN methods have inherent drawbacks: local GNNs struggle to capture long-range dependencies, and deep GNNs face the oversmoothing problem. Moreover, existing GCL methods do not adequately model node feature information, relying on topology to learn neighbor features. In this paper, we introduce a novel contrastive learning mechanism that employs transformers to capture long-range dependency information while integrating the strong perceptual capabilities of GNNs for local topology, resulting in a GCL architecture that is highly robust across different levels of homophily. Specifically, we design three views: the original view, the long-range information view, and the feature view. By jointly contrasting these three views, the model effectively acquires rich information from the graph. 
Experimental results on seven real-world datasets with varying levels of homophily demonstrate that the proposed method significantly outperforms other baseline models, validating its effectiveness and rationality.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104103"},"PeriodicalIF":7.4,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143444553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal-spatial hierarchical contrastive learning for misinformation detection: A public-behavior perspective","authors":"Gang Ren , Li Jiang , Tingting Huang , Ying Yang , Taeho Hong","doi":"10.1016/j.ipm.2025.104108","DOIUrl":"10.1016/j.ipm.2025.104108","url":null,"abstract":"<div><div>The widespread dissemination of misinformation on social media platforms significantly affects public security. Current methods for detecting misinformation predominantly rely on semantic information and social context features. However, they often neglect the intricate noise issues and unreliable information interactions resulting from diverse public behaviors, such as cognitive biases, user prejudices, and bot activity. To tackle these challenges, we propose an approach named TSHCL (temporal-spatial hierarchical contrastive learning) for automatic misinformation detection from the public-behavior perspective. First, the integration of a graph convolutional network (GCN)-based autoencoder architecture with a hybrid augmentation method is designed to model typical public behaviors. Next, node-level contrastive learning is designed to maintain the heterogeneity of comments in the spatial view under the influence of complex public behaviors. Finally, cross-view graph-level contrastive learning is designed to promote collaborative learning between the temporal sequence view of events and the spatial propagation structure view. By conducting temporal-spatial hierarchical contrastive learning, the model effectively retains crucial node information and facilitates the interaction of temporal-spatial information. Extensive experiments conducted on real datasets from MCFEND and Weibo demonstrate that our model surpasses the state-of-the-art models. 
Our proposed model can effectively alleviate the noise and unreliable information interaction caused by public behavior, and enrich the research perspective of misinformation detection.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104108"},"PeriodicalIF":7.4,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143454402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving event representation learning via generating and utilizing synthetic data","authors":"Yubo Feng, Lishuang Li, Xueyang Qin, Beibei Zhang","doi":"10.1016/j.ipm.2025.104083","DOIUrl":"10.1016/j.ipm.2025.104083","url":null,"abstract":"<div><div>Representations of events are important in various event-related tasks. Recent advances in event representation learning have focused on Contrastive Learning (CL) resulting in remarkable progress. However, solely using <em>dropout</em> as the data augmentation technique in CL methods may cause the model to become sensitive to length differences between event pairs. Moreover, CL methods ignore the evidence that the similarities between positive pairs are different, and the encoder-aware similarities also change dynamically as training progresses. It may cause the event encoder to learn the alignment of positive pairs at a coarse-grained level. In this paper, we propose <strong>LLM-CL</strong>: a <strong>L</strong>arge <strong>L</strong>anguage <strong>M</strong>odels-driven self-adaptive <strong>C</strong>ontrastive <strong>L</strong>earning framework for event representation learning. Specifically, we present an event knowledge graph-augmented synthetic data generation method designed to alleviate the sensitivity of CL-based models to length differences between event pairs. This method generates large-scale, high-quality event pairs with equivalent semantics, little lexical overlap, and varying text lengths. Additionally, we propose a novel CL method called self-adaptive contrastive learning to help the event encoder effectively and efficiently learn the alignment of synthetic data at fine-grained levels. This method dynamically estimates encoder-aware similarities and scales the CL losses accordingly. Experimental results show that LLM-CL outperforms strong baselines in both intrinsic and extrinsic evaluations. 
Our code is publicly available at https://github.com/YuboFeng2023/LLM-CL.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104083"},"PeriodicalIF":7.4,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143444552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robust rank aggregation framework for collusive disturbance based on community detection","authors":"Dongmei Chen , Yu Xiao , Jun Wu , Ignacio Javier Pérez , Enrique Herrera-Viedma","doi":"10.1016/j.ipm.2025.104096","DOIUrl":"10.1016/j.ipm.2025.104096","url":null,"abstract":"<div><div>Rank aggregation plays a crucial role in diverse fields of science, economy, and society. Unfortunately, some users are driven by huge interests to disrupt the aggregated ranking. It may turn out to be more detrimental when such users collude to behave dishonestly as they can rank in an organized manner and take control of the results. Here, we propose a novel and general rank aggregation framework to combat collusive disturbance. This framework is inspired by the idea that collusive users follow the same/similar behavioral patterns, while normal users do not have such obvious patterns. Specifically, it first analyzes the behavioral similarities between users and constructs a user graph based on this. Second, a community detection algorithm is introduced to divide all users into closely related groups. Third, it assigns each group a weight corresponding to its collusiveness, so that groups comprising collusive users achieve low weight, and vice versa. Finally, we apply this framework to different rank aggregation algorithms, thereby improving their ability to combat collusive disturbance. 
Extensive experiments highlight that our proposed framework markedly enhances the accuracy and robustness of existing rank aggregation methods, especially for Competition graph method, e.g., it can achieve a relative Kendall tau distance of 0.8283, 0.4394, and 0.2653 on real data.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104096"},"PeriodicalIF":7.4,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143444551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical chat-based strategies with MLLMs for Spatio-temporal action detection","authors":"Xuyang Zhou , Ye Wang , Fei Tao , Hong Yu , Qun Liu","doi":"10.1016/j.ipm.2025.104094","DOIUrl":"10.1016/j.ipm.2025.104094","url":null,"abstract":"<div><div>Spatio-temporal action detection (STAD) in football matches is challenging due to the subtle, fast-paced actions involving multiple participants. Multimodal large language models (MLLMs) often fail to capture these nuances with standard prompts, producing results lacking the detailed descriptions needed to improve visual features. To address this issue, we propose a prompt strategy called Hierarchical Chat-Based Strategies (HCBS). Specifically, this strategy enables MLLMs to form a chain of thought (CoT), gradually generating content with increasingly detailed information. We conduct extensive experiments on three datasets: 126 videos from Multisports, 43 videos from J-HMDB, and 147 videos from UCF101-24, all focus on the football sections. Compared to baseline tasks, our method improves performance by 30.3%, 26.1%, and 25.5% on these three datasets, respectively. Through the experiment of Hierarchy Verification, we demonstrate that HCBS effectively guides MLLMs in generating hierarchical descriptions. Additionally, using HCBS to guide MLLMs in content generation, we create a frame-level description dataset with 120,511 frame descriptions across the three datasets. 
Our code and dataset are available at the following link: https://github.com/TristanAlkaid/HCBS/.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 4","pages":"Article 104094"},"PeriodicalIF":7.4,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143430309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}