Information Processing & Management: Latest Articles

QFAS-KE: Query focused answer summarization using keyword extraction
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-26 DOI: 10.1016/j.ipm.2025.104104
Rupali Goyal, Parteek Kumar, V.P. Singh
Abstract: Question answering (QA) portals like Quora, Stack Overflow, AskUbuntu, Yahoo! Answers, Reddit, and Wiki Answers have emerged as hubs of curiosity, highlighting the rising demands for easily accessible information and are drawing focus to hundreds of millions of questions. The efficient utilization of these questions and associated answers has become significantly vital for these QA websites. The similarity-based information retrieval methods provide a ranked list of potentially relevant questions, and the users have to spend significant time sifting through the results to discover the best answer. This paper aims to provide a precise, comprehensive, summarized answer to the user asked query using extracted keywords that offer valuable insights into relevant content. The research work presents a Query focused Answer Summarization framework using Keyword Extraction (QFAS-KE). It is a four-stage framework, including query question pre-processing, semantic question search (utilizing SBERT and FAISS vector database), answer retrieval and re-ranking (utilizing BERT-based bi-encoder and cross-encoder), and answer summary generation (using fine-tuned transformers such as BART, PEGASUS, T5) with keyword guidance (using a keyword extractor such as KeyBERT). The results conceptualize the efficacy of the proposed framework on task-specific datasets (CNN/DailyMail and MS-MARCO) over the ROUGE metric. The model outperformed existing baseline models on CNN/DailyMail dataset with a value of 47.5 (PEGASUS), 46.2 (BART), and 45.1 (T5) in terms of ROUGE-1 and on MS-MARCO dataset with a value of 75.18 (PEGASUS), 79.02 (BART), and 74.69 (T5) in terms of ROUGE-L.
Vol. 62, No. 4, Article 104104
Citations: 0
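The abstract above reports its results in ROUGE-L. As a rough illustration of what that metric measures, here is a minimal sketch of a sentence-level ROUGE-L F1 score built on the longest common subsequence. It assumes lower-cased whitespace tokenisation and a plain F1 (no stemming, no recall-weighted F-beta), which may differ from the scoring script the authors used; the function names are illustrative only.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 for two strings (assumption: whitespace tokens, lower-cased)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For example, `rouge_l_f1("the cat sat on the mat", "the cat lay on the mat")` shares the five-token subsequence "the cat on the mat" out of six tokens on each side, so precision, recall, and F1 are all 5/6.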
Effectively detecting and diagnosing distributed multivariate time series anomalies via Unsupervised Federated Hypernetwork
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-24 DOI: 10.1016/j.ipm.2025.104107
Junfeng Hao, Peng Chen, Juan Chen, Xi Li
Abstract: Distributed multivariate time series anomaly detection is widely-used in industrial equipment monitoring, financial risk management, and smart cities. Although Federated learning (FL) has garnered significant interest and achieved decent performance in various scenarios, most existing FL-based distributed anomaly detection methods still face challenges including: inadequate detection performance in global model, insufficient essential features extraction caused by the fragmentation of local time series, and lack for practical anomaly localization. To address these challenges, we propose an Unsupervised Federated Hypernetwork Method for Distributed Multivariate Time Series Anomaly Detection and Diagnosis (uFedHy-DisMTSADD). Specifically, we introduce a federated hypernetwork architecture that effectively mitigates the heterogeneity and fluctuations in distributed environments while protecting client data privacy. Then, we adopt the Series Conversion Normalization Transformer (SC Nor-Transformer) to tackle the timing bias due to model aggregation through series conversion. Series normalization improves the temporal dependence of capturing subsequences. Finally, uFedHy-DisMTSADD simultaneously localizes the root cause of the anomaly by reconstructing the anomaly scores obtained from each subsequence. We performed an extensive evaluation on nine datasets, in which uFedHy-DisMTSADD outperformed the existing state-of-the-art baseline average F1 score by 9.19% and the average AUROC by 2.41%. Moreover, the average localization fault accuracy of uFedHy-DisMTSADD is 9.23% higher than that of the optimal baseline method. Code is available at this repository: https://github.com/Hjfyoyo/uFedHy-DisMTSADD.
Vol. 62, No. 4, Article 104107
Citations: 0
Mitigating collusive manipulation of reviews in e-commerce platforms: Evolutionary game and strategy simulation
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-24 DOI: 10.1016/j.ipm.2025.104080
Xiaoxia Xu, Ruguo Fan, Dongxue Wang, Xiao Xie, Kang Du
Abstract: The growing review manipulation has seriously hampered credit regulation on e-commerce platforms, yet few studies have explored its complex dynamics. Unlike current research centering on merchants creating various management strategies, this study examines the collusion between merchants and consumers. By integrating evolutionary game theory and a system dynamics approach, this study offers meaningful conclusions for platform credit management. First, our findings indicate that merchants can maintain honesty regardless of the regulatory strategy implemented. For positive regulation, platforms can impose higher penalties; for negative regulation, maintaining lower exposure is feasible. Second, our analysis illustrates the necessity of breaking the collusion between merchants and consumers. Under positive regulation, platforms can amplify penalties or enhance the regulatory impact on platform revenues. Conversely, negative regulation allows for reducing the short-term financial impact of reviews or adjusting cashback. Third, we uncover that dynamic punishment strategies are not always optimal. In some cases, static punishment strategies outperform linear dynamic punishment strategies, highlighting the importance of carefully evaluating the effectiveness of different regulatory approaches in various contexts.
Vol. 62, No. 4, Article 104080
Citations: 0
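To give a feel for the kind of evolutionary-game simulation the abstract above refers to, the following is a minimal sketch of two-population replicator dynamics integrated with Euler steps. It is not the paper's model: the two populations (merchants and platform), the payoff parameters `gain`, `penalty`, `loss`, and `cost`, and the static penalty are all hypothetical simplifications chosen for illustration, and the consumer side of the collusion is omitted.

```python
def simulate(x0, y0, gain, penalty, loss, cost, dt=0.01, steps=5000):
    """Euler integration of a toy two-population replicator system.

    x: share of honest merchants, y: share of strict platform regulation.
    All payoff parameters are hypothetical, not taken from the paper:
      gain    - extra profit a merchant gets from manipulating reviews
      penalty - fine paid by a manipulating merchant under strict regulation
      loss    - harm to the platform from undetected manipulation
      cost    - platform's cost of regulating strictly
    """
    x, y = x0, y0
    for _ in range(steps):
        # Honest minus manipulating payoff for merchants.
        dx = x * (1 - x) * (y * penalty - gain)
        # Strict minus lax payoff for the platform.
        dy = y * (1 - y) * ((1 - x) * (penalty + loss) - cost)
        # Clamp to keep the shares valid despite discretisation error.
        x = min(1.0, max(0.0, x + dt * dx))
        y = min(1.0, max(0.0, y + dt * dy))
    return x, y
```

Under these assumptions, a high penalty with free regulation (`gain=1, penalty=5, loss=1, cost=0`) drives the honest share toward 1 from a 50/50 start, while setting `penalty=0` removes any incentive for honesty and the honest share decays toward 0.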
Expert-level policy style measurement via knowledge distillation with large language model collaboration
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-22 DOI: 10.1016/j.ipm.2025.104090
Yujie Zhang, Biao Huang, Weikang Yuan, Zhuoren Jiang, Longsheng Peng, Shuai Chen, Jie-Sheng Tan-Soo
Abstract: Policy style is a crucial concept in policy science that reflects persistent patterns in the policy process across different governance settings. Despite its importance, policy style measurement faces issues of complexity, subjectivity, data sparseness, and computational cost. To overcome these obstacles, we propose KOALA, a novel KnOwledge distillation framework based on large lAnguage modeL collAboration. It transforms the weak scoring abilities of LLMs into a pairwise ranking problem, employs a small set of expert-annotated samples for non-parametric learning, and utilizes knowledge distillation to transfer insights from LLMs to a smaller, more efficient model. The framework incorporates multiple LLM-based agents (Prompter, Ranker, and Analyst) collaborating to comprehend complex measurement standards and self-explain policy style definitions. We validate KOALA on 4,572 Chinese government work reports (1954–2019) from central, provincial, and municipal levels, with a focus on the imposition dimension of policy style. Extensive experiments demonstrate KOALA’s effectiveness in measuring the intensity of policy style, highlighting its superiority over state-of-the-art methods. While GPT-4 achieves only 66% accuracy in pairwise ranking of policy styles, KOALA, despite being based on GPT-3.5, achieves a remarkable 85% accuracy, highlighting significant performance improvement. This framework offers a transferable approach for quantifying complex social science concepts in textual data, bridging computational techniques with social science research.
Vol. 62, No. 4, Article 104090
Citations: 0
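The abstract above recasts scoring as a pairwise ranking problem. The sketch below shows one simple way to turn pairwise judgements into an ordering, by counting wins over all pairs (a Copeland-style tally). It does not reproduce KOALA's agents, non-parametric learning, or distillation; the `prefer` callback is a hypothetical stand-in for whatever judge (an LLM ranker, an expert) decides which of two texts is stronger.

```python
from itertools import combinations


def rank_by_pairwise_wins(items, prefer):
    """Order items from strongest to weakest using only pairwise judgements.

    items  - unique, hashable items (for example, policy sentences)
    prefer - hypothetical judge: prefer(a, b) is True when a outranks b
    """
    wins = {item: 0 for item in items}
    for a, b in combinations(items, 2):
        winner = a if prefer(a, b) else b
        wins[winner] += 1
    # sorted() is stable, so tied items keep their input order.
    return sorted(items, key=lambda item: wins[item], reverse=True)
```

With a toy judge such as `lambda a, b: a.count("must") > b.count("must")`, texts containing more occurrences of "must" are ranked first. A win tally needs a judgement for every pair, so it scales quadratically with the number of items.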
Beyond boundaries: Exploring the interaction between science and technology in fusion knowledge communities
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-21 DOI: 10.1016/j.ipm.2025.104102
Jiajie Wang, Wanfang Hou, Yue Li, Jianjun Sun, Lele Kang
Abstract: Interaction between science and technology (S&T) is a vital mechanism for generating significant innovative breakthroughs. Prior studies have utilized indicators such as semantic similarity or citation analysis to measure the relationships between scientific communities and technological communities represented by papers and patents. However, shifts in innovation paradigms have progressively blurred the boundaries between S&T, leading to the formation of fusion knowledge communities (FKCs) that encompass both scientific and technological knowledge. Therefore, this study proposes a novel approach to exploring the S&T interaction within FKCs. To achieve this, we integrate semantic and citation information by combining BERT and Graph Auto-Encoder algorithms, and employ the Louvain algorithm for FKCs detection. We then conduct a two-step analysis. First, we quantify the strength of S&T interactions over different periods by defining an interaction intensity metric based on the coupling of keywords, and assess the knowledge depth. Second, we analyze the evolution of S&T interactions by measuring knowledge transfer, transmission direction, and degree, which involves computing knowledge similarity between papers and patents and constructing citation networks to highlight key transfer channels over time. We apply this approach to the field of Genetically Engineered Vaccines (GEV), analyzing 1,937 patents and 4,393 papers from 1980 to 2020. The results demonstrate that our method effectively reveals the fusion knowledge community structures between S&T and provides a detailed analysis of interaction patterns and their evolution within FKCs. This study advances the methodology for exploring S&T interactions within FKCs, offering a fine-grained analytical perspective for innovation management research.
Vol. 62, No. 4, Article 104102
Citations: 0
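The abstract above mentions an interaction intensity metric based on keyword coupling but does not give its formula. Purely as an illustration of the general idea, the sketch below measures keyword overlap between the papers and the patents of one community with a Jaccard index. The choice of Jaccard, the case-insensitive matching, and the function name are assumptions, not the authors' definition.

```python
def keyword_coupling(paper_keywords, patent_keywords):
    """Jaccard overlap of two keyword collections (assumed metric, range 0 to 1)."""
    papers = {k.lower() for k in paper_keywords}
    patents = {k.lower() for k in patent_keywords}
    union = papers | patents
    if not union:
        return 0.0
    return len(papers & patents) / len(union)
```

For instance, papers tagged `["mRNA", "adjuvant", "vector"]` and patents tagged `["vector", "mrna", "plasmid", "antigen"]` share two of five distinct keywords, giving a coupling of 0.4. Computing this value per time window would give one possible view of how interaction strength changes over periods.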
GNN-transformer contrastive learning explores homophily
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-20 DOI: 10.1016/j.ipm.2025.104103
Yangding Li, Yangyang Zeng, Xiangchao Zhao, Jiawei Chai, Hao Feng, Shaobin Fu, Cui Ye, Shichao Zhang
Abstract: Graph Contrastive Learning (GCL) leverages graph structure and node feature information to learn powerful node representations in a self-supervised manner, attracting significant attention from researchers. Most GCL frameworks typically use Graph Neural Networks (GNNs) as their foundational encoders. Still, GNN methods have inherent drawbacks: local GNNs struggle to capture long-range dependencies, and deep GNNs face the oversmoothing problem. Moreover, existing GCL methods do not adequately model node feature information, relying on topology to learn neighbor features. In this paper, we introduce a novel contrastive learning mechanism that employs transformers to capture long-range dependency information while integrating the strong perceptual capabilities of GNNs for local topology, resulting in a GCL architecture that is highly robust across different levels of homophily. Specifically, we design three views: the original view, the long-range information view, and the feature view. By jointly contrasting these three views, the model effectively acquires rich information from the graph. Experimental results on seven real-world datasets with varying levels of homophily demonstrate that the proposed method significantly outperforms other baseline models, validating its effectiveness and rationality.
Vol. 62, No. 4, Article 104103
Citations: 0
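The abstract above evaluates on graphs with "varying levels of homophily". One common way to quantify that level is the edge homophily ratio, the fraction of edges whose two endpoints share a label. The sketch below computes it for an edge list; whether the paper uses this exact measure (rather than a node-level or class-adjusted variant) is an assumption.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    edges  - iterable of (u, v) node-index pairs
    labels - sequence or mapping from node index to class label
    """
    edges = list(edges)
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)
```

For the path graph with edges `[(0, 1), (1, 2), (2, 3)]` and labels `[0, 0, 1, 1]`, two of the three edges join same-label nodes, so the ratio is 2/3. Values near 1 indicate a homophilous graph and values near 0 a heterophilous one.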
Temporal-spatial hierarchical contrastive learning for misinformation detection: A public-behavior perspective
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-20 DOI: 10.1016/j.ipm.2025.104108
Gang Ren, Li Jiang, Tingting Huang, Ying Yang, Taeho Hong
Abstract: The widespread dissemination of misinformation on social media platforms significantly affects public security. Current methods for detecting misinformation predominantly rely on semantic information and social context features. However, they often neglect the intricate noise issues and unreliable information interactions resulting from diverse public behaviors, such as cognitive biases, user prejudices, and bot activity. To tackle these challenges, we propose an approach named TSHCL (temporal-spatial hierarchical contrastive learning) for automatic misinformation detection from the public-behavior perspective. First, the integration of a graph convolutional network (GCN)-based autoencoder architecture with a hybrid augmentation method is designed to model typical public behaviors. Next, node-level contrastive learning is designed to maintain the heterogeneity of comments in the spatial view under the influence of complex public behaviors. Finally, cross-view graph-level contrastive learning is designed to promote collaborative learning between the temporal sequence view of events and the spatial propagation structure view. By conducting temporal-spatial hierarchical contrastive learning, the model effectively retains crucial node information and facilitates the interaction of temporal-spatial information. Extensive experiments conducted on real datasets from MCFEND and Weibo demonstrate that our model surpasses the state-of-the-art models. Our proposed model can effectively alleviate the noise and unreliable information interaction caused by public behavior, and enrich the research perspective of misinformation detection.
Vol. 62, No. 4, Article 104108
Citations: 0
Improving event representation learning via generating and utilizing synthetic data
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-19 DOI: 10.1016/j.ipm.2025.104083
Yubo Feng, Lishuang Li, Xueyang Qin, Beibei Zhang
Abstract: Representations of events are important in various event-related tasks. Recent advances in event representation learning have focused on Contrastive Learning (CL) resulting in remarkable progress. However, solely using dropout as the data augmentation technique in CL methods may cause the model to become sensitive to length differences between event pairs. Moreover, CL methods ignore the evidence that the similarities between positive pairs are different, and the encoder-aware similarities also change dynamically as training progresses. It may cause the event encoder to learn the alignment of positive pairs at a coarse-grained level. In this paper, we propose LLM-CL: a Large Language Models-driven self-adaptive Contrastive Learning framework for event representation learning. Specifically, we present an event knowledge graph-augmented synthetic data generation method designed to alleviate the sensitivity of CL-based models to length differences between event pairs. This method generates large-scale, high-quality event pairs with equivalent semantics, little lexical overlap, and varying text lengths. Additionally, we propose a novel CL method called self-adaptive contrastive learning to help the event encoder effectively and efficiently learn the alignment of synthetic data at fine-grained levels. This method dynamically estimates encoder-aware similarities and scales the CL losses accordingly. Experimental results show that LLM-CL outperforms strong baselines in both intrinsic and extrinsic evaluations. Our code is publicly available at https://github.com/YuboFeng2023/LLM-CL.
Vol. 62, No. 4, Article 104083
Citations: 0
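The abstract above describes scaling contrastive losses per pair. As background, here is a minimal sketch of the standard InfoNCE contrastive loss for one anchor, with an optional multiplicative `weight`. The weight is only a hypothetical placeholder for a per-pair scaling factor; how LLM-CL actually estimates encoder-aware similarities and derives its scaling is not specified in the abstract and is not reproduced here. Vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1, weight=1.0):
    """InfoNCE loss for one anchor; `weight` is a hypothetical per-pair scale."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -weight * math.log(pos / (pos + neg))
```

The loss is close to zero when the positive is much more similar to the anchor than any negative, and grows when a negative is closer than the positive. Lowering `weight` for a pair reduces that pair's contribution to the total loss.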
A robust rank aggregation framework for collusive disturbance based on community detection
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-19 DOI: 10.1016/j.ipm.2025.104096
Dongmei Chen, Yu Xiao, Jun Wu, Ignacio Javier Pérez, Enrique Herrera-Viedma
Abstract: Rank aggregation plays a crucial role in diverse fields of science, economy, and society. Unfortunately, some users are driven by huge interests to disrupt the aggregated ranking. It may turn out to be more detrimental when such users collude to behave dishonestly as they can rank in an organized manner and take control of the results. Here, we propose a novel and general rank aggregation framework to combat collusive disturbance. This framework is inspired by the idea that collusive users follow the same/similar behavioral patterns, while normal users do not have such obvious patterns. Specifically, it first analyzes the behavioral similarities between users and constructs a user graph based on this. Second, a community detection algorithm is introduced to divide all users into closely related groups. Third, it assigns each group a weight corresponding to its collusiveness, so that groups comprising collusive users achieve low weight, and vice versa. Finally, we apply this framework to different rank aggregation algorithms, thereby improving their ability to combat collusive disturbance. Extensive experiments highlight that our proposed framework markedly enhances the accuracy and robustness of existing rank aggregation methods, especially for Competition graph method, e.g., it can achieve a relative Kendall tau distance of 0.8283, 0.4394, and 0.2653 on real data.
Vol. 62, No. 4, Article 104096
Citations: 0
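The last two steps of the framework above (down-weighting suspicious groups, then aggregating) and its Kendall tau evaluation can be sketched as follows. This is a simplified illustration: it uses a weighted Borda count as the aggregator instead of the Competition graph method named in the abstract, it takes the per-ranking weights as given rather than deriving them from community detection, and "normalised Kendall tau distance" here means the fraction of discordant item pairs, which may differ from the paper's "relative" variant.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs ordered differently by two rankings of the same items."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


def weighted_borda(rankings, weights):
    """Aggregate rankings with a Borda count scaled by per-ranking weights.

    weights are assumed inputs (for example, low for users flagged as collusive).
    """
    scores = {}
    for ranking, weight in zip(rankings, weights):
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0.0) + weight * (n - 1 - position)
    # Highest score first; ties broken by item name for determinism.
    return sorted(scores, key=lambda item: (-scores[item], str(item)))
```

With two honest users ranking `a > b > c` and three colluders ranking `c > b > a`, equal weights let the colluders win and yield `c, b, a`. Giving the colluders a weight of 0.1 restores `a, b, c`, whose Kendall tau distance to the honest ranking is 0.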
Hierarchical chat-based strategies with MLLMs for Spatio-temporal action detection
IF 7.4, Zone 1, Management
Information Processing & Management Pub Date: 2025-02-17 DOI: 10.1016/j.ipm.2025.104094
Xuyang Zhou, Ye Wang, Fei Tao, Hong Yu, Qun Liu
Abstract: Spatio-temporal action detection (STAD) in football matches is challenging due to the subtle, fast-paced actions involving multiple participants. Multimodal large language models (MLLMs) often fail to capture these nuances with standard prompts, producing results lacking the detailed descriptions needed to improve visual features. To address this issue, we propose a prompt strategy called Hierarchical Chat-Based Strategies (HCBS). Specifically, this strategy enables MLLMs to form a chain of thought (CoT), gradually generating content with increasingly detailed information. We conduct extensive experiments on three datasets: 126 videos from Multisports, 43 videos from J-HMDB, and 147 videos from UCF101-24, all focus on the football sections. Compared to baseline tasks, our method improves performance by 30.3%, 26.1%, and 25.5% on these three datasets, respectively. Through the experiment of Hierarchy Verification, we demonstrate that HCBS effectively guides MLLMs in generating hierarchical descriptions. Additionally, using HCBS to guide MLLMs in content generation, we create a frame-level description dataset with 120,511 frame descriptions across the three datasets. Our code and dataset are available at the following link: https://github.com/TristanAlkaid/HCBS/.
Vol. 62, No. 4, Article 104094
Citations: 0