Information Processing & Management: latest articles, page 4

Learning path recommendation based on forgetting factors and knowledge graph awareness

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-19, DOI: 10.1016/j.ipm.2025.104393

Yunxia Fan, Mingwen Tong, Duantengchuan Li

Volume 63, Issue 2, Article 104393.

Abstract: Learning path recommendation involves generating sequences of learning objects that are adapted to learners’ needs, goals, abilities, and other factors through recommendation algorithms. Reinforcement learning (RL) has become an important approach for this task; however, it primarily emphasizes recommending new knowledge concepts while neglecting the necessity of revisiting forgotten ones. To overcome this limitation, FKGRec is introduced as a learning path recommendation framework that incorporates forgetting factors and knowledge graph awareness. To address the forgetting problem, a novel method named MemGNN is proposed, which integrates forgetting and knowledge graph features and employs a graph neural network with a memory gate structure to predict both new and previously learned knowledge concepts at each learning step. To further optimize the sequencing of new and previously learned knowledge concepts, an action space is constructed based on knowledge concept prediction, taking learners’ cognitive states into account. An RL algorithm is then applied to recommend optimal learning paths by balancing new and previously learned knowledge concepts using a designed reward function. Experiments conducted on three datasets demonstrate that FKGRec surpasses existing state-of-the-art frameworks. A case analysis shows that the FKGRec framework can recommend learning paths that integrate new and previously learned knowledge concepts, aligned with learners’ current cognitive state and forgetting factors.
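
The trade-off this abstract describes, between introducing new concepts and revisiting forgotten ones, can be illustrated with a small sketch. It is not the FKGRec or MemGNN method: the exponential retention curve, the `stability` and `threshold` parameters, and all function names are assumptions chosen only for illustration.

```python
import math


def retention(elapsed_steps: float, stability: float) -> float:
    """Assumed exponential forgetting curve: 1.0 right after study, decaying toward 0."""
    return math.exp(-elapsed_steps / stability)


def next_concept(last_seen, step, new_concepts, stability=5.0, threshold=0.5):
    """Pick the next learning action.

    last_seen maps each learned concept to the step at which it was last studied.
    If the least-retained learned concept has fallen below `threshold`, review it;
    otherwise move on to the next unseen concept. Returns None when nothing is left.
    """
    scored = [(retention(step - t, stability), c) for c, t in last_seen.items()]
    if scored:
        lowest_retention, concept = min(scored)
        if lowest_retention < threshold:
            return ("review", concept)
    if new_concepts:
        return ("new", new_concepts[0])
    return None
```

With `last_seen = {"fractions": 0, "decimals": 8}` at step 10, the sketch chooses to review `fractions`, whose assumed retention has decayed the most; shortly after study it prefers a new concept instead.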

Citations: 0

Navigating the perceived credibility and adoption of AI-generated review summaries in online shopping: An affordance perspective

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-18, DOI: 10.1016/j.ipm.2025.104404

Mingxia Jia, Yuxiang Zhao Chris, Xiaoyu Zhang

Volume 63, Issue 2, Article 104404.

Abstract: As generative AI (GenAI) advances, e-commerce platforms increasingly leverage AI-generated review summaries to facilitate consumer decision-making. However, given the experience-driven nature of online review consumption, whether consumers perceive these summaries as credible, useful, and adoptable remains a key challenge to their effective implementation. Therefore, using affordance actualization theory, we conducted a scenario-based experiment and survey to analyze the quantitative data from 713 consumers (N_search product = 356, N_experience product = 357) regarding their perceptions of AI-generated review summaries. The findings show that functional affordances (algorithmic transparency, understandability, and convenience) and symbolic expressions (conveyed values and meanings) toward AI-generated review summaries play important roles in shaping consumers’ perceived credibility. Among them, algorithmic transparency, meaning conveyed, and understandability were identified as strong predictors. Perceived credibility further predicts perceived helpfulness, which, in turn, motivates users’ intentions to adopt AI-generated review summaries and contribute to consumer reviews. Interestingly, these influence pathways differ significantly depending on whether the product is a search or an experience product. This study provides an empirical investigation into the pathway from affordance to actualized belief and behavioral intention in the AI-generated review summaries context and offers practical insights for its effective application in AI-powered marketing.

Citations: 0

CADF: Real-time multi-biosignal modal recognition with causality-aware dimension fusion

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-18, DOI: 10.1016/j.ipm.2025.104378

Jing Tao, Zhuang Li, Lin Wang, Dahua Shou

Volume 63, Issue 2, Article 104378.

Abstract: Real-time analysis of multiple biological signals offers social media systems valuable insights into user engagement, but capturing the complex temporal dynamics and inter-signal relationships remains a challenge. This study introduces a novel framework, CADF (Causality-Aware Dimension Fusion), for real-time multi-biosignal modality recognition. CADF introduces a causality-aware temporal encoder that preserves temporal causality while effectively modeling long-term dependencies in one-dimensional signals. Additionally, the time series data is converted to extract 2D spatial masks. The bi-dimensional features are fused to identify modalities with the aid of a streamlined MultiHead mechanism. Extensive experiments on the DSADS, WESAD, and CAP datasets show that CADF reduces the number of parameters by at least 58% and improves the accuracy by 8% compared to the SOTA model. In particular, the accuracy of the three-class emotion recognition task reached 95%. These results emphasize the effectiveness and efficiency of CADF in real-time biosignal analysis, with important implications for user-centric applications.
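
The idea of preserving temporal causality in a one-dimensional signal can be illustrated with a causal convolution, in which each output sample depends only on the current and earlier inputs. This is a minimal sketch of that general technique, not CADF's encoder, whose design the abstract does not specify; the function name and the zero-padding choice are assumptions.

```python
def causal_conv1d(signal, kernel):
    """Causal 1D convolution: output[t] = sum over k of kernel[k] * signal[t - k].

    Samples before the start of the signal are treated as zero (assumed padding),
    so output[t] never depends on signal[t + 1] or any later sample.
    """
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            if t - k >= 0:
                acc += weight * signal[t - k]
        output.append(acc)
    return output
```

Because no future sample enters the sum, changing the last value of the input leaves every earlier output unchanged, which is the property a real-time (streaming) encoder needs.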

Citations: 0

A guard against ambiguous sentiment for multimodal aspect-level sentiment classification

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-17, DOI: 10.1016/j.ipm.2025.104375

Yanjing Wang, Kai Sun, Bin Shi, Hao Wu, Kaihao Zhang, Bo Dong

Volume 63, Issue 2, Article 104375.

Abstract: Recent advances in multimodal learning have achieved state-of-the-art results in aspect-level sentiment classification by leveraging both text and image data. However, images can sometimes contain contradictory sentiment cues or convey complex messages, making it difficult to accurately determine the sentiment expressed in the text. Intuitively, we should only use image data to complement the text if the latter contains ambiguous sentiment or leans toward the neutral polarity. Therefore, instead of trying to forcefully use images as done in prior work, we develop a Guard against Ambiguous Sentiment (GAS) for multimodal aspect-level sentiment classification (MALSC). Built on a pretrained language model, GAS is equipped with a novel “ambiguity learning” strategy that focuses on learning the degree of sentiment ambiguity within the input text. The sentiment ambiguity then serves to determine the extent to which image information should be utilized for accurate sentiment classification. In our experiments with two benchmark Twitter datasets, we found that GAS achieves a performance gain of up to 0.98% in macro-F1 score compared to recent methods in the task. Furthermore, we explore the efficacy of large language models (LLMs) in the MALSC task by employing the core ideas behind GAS to design tailored prompts. We show that multimodal LLMs such as LLaVA, when provided with GAS-principled prompts, yield a 2.4% improvement in macro-F1 score for few-shot learning on the MALSC task.
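
The gating idea in this abstract, using image information only to the extent that the text sentiment is ambiguous, can be sketched as follows. GAS learns its ambiguity score; this sketch instead assumes normalized entropy of the text-only prediction as a stand-in, and the function names and the linear mixing rule are likewise assumptions.

```python
import math


def ambiguity(probs):
    """Normalized entropy in [0, 1]: 0 for a one-hot prediction, 1 for a uniform one."""
    entropy = sum(p * math.log(1.0 / p) for p in probs if p > 0)
    return entropy / math.log(len(probs))


def gated_fusion(text_probs, image_probs):
    """Mix text and image sentiment distributions, weighting the image by text ambiguity."""
    gate = ambiguity(text_probs)
    mixed = [(1.0 - gate) * t + gate * i for t, i in zip(text_probs, image_probs)]
    total = sum(mixed)
    return [m / total for m in mixed]
```

When the text prediction is confident, for example `[1.0, 0.0, 0.0]`, the gate is zero and the image is ignored; when the text is maximally ambiguous, the image prediction is used in full.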

Citations: 0

HU-RNSP: Efficiently mining high-utility repeated negative sequential patterns

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-16, DOI: 10.1016/j.ipm.2025.104402

Ke Xiao, Ping Qiu, Dun Lan, Xiangjun Dong, Lei Guo, Yuhai Zhao, Yongshun Gong, Long Zhao

Volume 63, Issue 2, Article 104402.

Abstract: High-utility repeated negative sequential patterns (HURNSPs) mining plays a key role in behavioral analysis and user preference mining. However, existing HUSPM methods do not consider the importance of repeated negative sequential patterns (RNSPs) or high-utility negative sequential patterns (HUNSPs), which poses the following challenges for HURNSPs mining: (1) Lack of an effective method for calculating the utility of high-utility repeated positive sequential patterns (HURPSPs), (2) Lack of an effective method for calculating the utility value of high-utility repeated negative sequential candidate patterns (HURNSCs). To solve the above challenges, this paper proposes an effective algorithm, HU-RNSP, for mining HURNSPs. First, an algorithm, called HURSpan, is proposed to mine HURPSPs by integrating RNSP and HUSPM into the mining of HURNSPs. Second, an algorithm, NSPGwl, is proposed, which converts HURPSPs into HURNSCs, effectively calculates the utility of HURNSCs, and compares the utility of HURNSCs with a minimum utility threshold to obtain HURNSPs. Experimental results on nine datasets demonstrate that HU-RNSP is more effective than baseline methods in discovering HURNSPs. Additionally, we analyze the impact of data features on HURNSP mining. The results indicate that HU-RNSP demonstrates strong adaptability and computational efficiency across experiments on datasets with varying data factors.

Citations: 0

MCIRP: A multi-granularity cross-modal interaction model based on relational propagation for Multimodal Named Entity Recognition with multiple images

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-15, DOI: 10.1016/j.ipm.2025.104384

Yongheng Mu, Ziyu Guo, Xuewei Li, Lixu Shao, Shijun Liu, Feng Li, Guangxu Mei

Volume 63, Issue 2, Article 104384.

Abstract: Most existing Multimodal Named Entity Recognition (MNER) methods typically focus on processing textual content with a single image and fail to effectively handle content with multiple images. Therefore, MNER with multiple images presents significant research potential. However, current approaches for this task face two key limitations: (1) Treating all images equally without assessing their relevance to the text, which may introduce visual noise from unrelated images; (2) Relying solely on coarse-grained image features while disregarding fine-grained alignments between text and each image. To address the above limitations, this work introduces a novel Multi-granularity Cross-modal Interaction Model based on Relational Propagation (MCIRP), which effectively leverages information from multiple images. For the first limitation, we propose a text–image relation propagation strategy that calculates the correlation score between the text and each image, enabling selective utilization of relevant image information. For the second limitation, we propose a multi-granularity cross-modal interaction fusion technique to facilitate the fusion of text and visual features at different levels of granularity. To the best of our knowledge, this is the first study to explore text–image relation propagation for the MNER task with multiple images. The results show that MCIRP improves the F1 scores on two MNER public datasets with multiple images (MNER-MI and MNER-MI-Plus) by 3.65% and 0.56%, respectively, achieving SOTA performance among existing multi-image methods.
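
The first limitation discussed in this abstract, weighting each image by its relevance to the text rather than treating all images equally, can be sketched with cosine similarity and a softmax. This is only an illustration under assumptions: the abstract does not say how MCIRP computes its correlation score, and the embedding vectors, the `temperature` parameter, and the function names here are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def image_weights(text_vec, image_vecs, temperature=0.1):
    """Softmax over text-image similarities, so unrelated images receive little weight."""
    scores = [cosine(text_vec, v) / temperature for v in image_vecs]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

A lower `temperature` makes the weighting more selective; a higher one moves it back toward treating all images equally.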

Citations: 0

Propaganda by prompt: Tracing hidden linguistic strategies in large language models

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-12, DOI: 10.1016/j.ipm.2025.104403

Arash Barfar, Lee Sommerfeldt

Volume 63, Issue 2, Article 104403.

Abstract: As large language models become increasingly integrated into news production, concerns have grown over their potential to generate polarizing propaganda. This study introduces a scalable and flexible framework for systematically tracing the rhetorical strategies LLMs use to produce propaganda-style content. We apply the framework across three versions of GPT (GPT-3.5-Turbo, GPT-4o, and GPT-4.1), generating over 340,000 articles on selected politically divisive topics in the American news landscape. Supported by highly consistent distinctions (AUROC above 98%), our findings reveal that the persuasive strategies adopted by GPT are both coherent and evolving across model versions. All three models rely heavily on cognitive language to simulate deliberation and interpretive reasoning, combined with consistent use of moral framing. Each version layers this rhetorical core with distinct stylistic choices: GPT-3.5-Turbo emphasizes collective identity and narrative looseness; GPT-4o adopts reflective detachment through its use of impersonal pronouns and tentative language; and GPT-4.1 deploys lexical sophistication and definitive assertions to project authority. These differences reflect a rhetorical evolution driven by architectural refinements, training updates, and changes in safety guard behavior. A comparison with human-authored propaganda further shows that GPT is not simply reproducing prompt-induced rhetorical biases but appears to exhibit distinct generative tendencies beyond those present in the human-authored baselines. The framework developed here offers a practical reverse-engineering tool for researchers, policymakers, and developers to explain and audit the persuasive capabilities of LLMs. It contributes to broader efforts in AI transparency, content moderation, and the promotion of epistemic resilience in digital communication.

Citations: 0

CITR: Context-driven implicit triple reasoning for joint multimodal entity-relation extraction

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-12, DOI: 10.1016/j.ipm.2025.104388

Xinyu Liu, Guanglu Sun, Jing Jin, Fei Lang, Suxia Zhu

Volume 63, Issue 2, Article 104388.

Abstract: Joint Multimodal Entity-Relation Extraction (JMERE) jointly models Multimodal Named Entity Recognition (MNER) and Multimodal Relation Extraction (MRE), aiming to extract valuable structured information from multimodal input. However, existing JMERE methods struggle to fully leverage bidirectional semantic interactions between tasks. To this end, this paper proposes a context-driven implicit triple reasoning framework (CITR), which takes type triples composed of entity and relation types as the foundation. Specifically, CITR first uses context generated by large multimodal models (LMMs) as semantic guidance cues to enhance modality representations, and prevents excessive semantic bias through a constraint module. Subsequently, CITR models the complex dependencies of different type triples to iteratively refine the representations associated with implicit triples. Finally, this paper reformulates the JMERE task as a type triple-centric sequence labeling problem and designs a dual-sequence joint tagging scheme, which reduces the computational complexity and label sparsity compared to previous schemes. Experimental results show that CITR achieves an F1 score of 58.02% on the JMERE (Joint) task, significantly outperforming the state-of-the-art methods by 0.99%. Compared to methods with LMMs, CITR using LLaVA-1.5 achieves a superior F1 score of 58.49%.

Citations: 0

A survey of slow thinking-based reasoning LLMs using reinforcement learning and test-time scaling law

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-12, DOI: 10.1016/j.ipm.2025.104394

Qianjun Pan, Wenkai Ji, Yuyang Ding, Junsong Li, Shilian Chen, Junyi Wang, Jie Zhou, Qin Chen, Min Zhang, Yulan Wu, Liang He

Volume 63, Issue 2, Article 104394.

Abstract: This survey presents a focused and conceptually distinct framework for understanding recent advancements in reasoning large language models (LLMs) designed to emulate “slow thinking”, a deliberate, analytical mode of cognition analogous to System 2 in dual-process theory from cognitive psychology. While prior review works have surveyed reasoning LLMs through fragmented lenses, such as isolated technical paradigms (e.g., reinforcement learning or test-time scaling) or broad post-training taxonomies, this work uniquely integrates reinforcement learning and test-time scaling as synergistic mechanisms within a unified “slow thinking” paradigm. By synthesizing insights from over 200 studies, we identify three interdependent pillars that collectively enable advanced reasoning: (1) Test-time scaling, which dynamically allocates computational resources based on task complexity via search, adaptive computation, and verification; (2) Reinforcement learning, which refines reasoning trajectories through reward modeling, policy optimization, and self-improvement; and (3) Slow-thinking frameworks, which structure reasoning into stepwise, hierarchical, or hybrid processes such as long Chain-of-Thought and multi-agent deliberation. Unlike existing surveys, our framework is goal-oriented, centering on the cognitive objective of “slow thinking” as both a unifying principle and a design imperative. This perspective enables a systematic analysis of how diverse techniques converge toward human-like deep reasoning. The survey charts a trajectory toward next-generation LLMs that balance cognitive fidelity with computational efficiency, while also outlining key challenges and future directions. Advancing such reasoning capabilities is essential for deploying LLMs in high-stakes domains including scientific discovery, autonomous agents, and complex decision support systems.
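
One simple instance of spending more computation at test time is to sample several answers to the same question and keep the most frequent one, an approach often called self-consistency. The sketch below shows only that voting step; it is not taken from the survey, and `sample_answer` is a hypothetical stand-in for a call to a reasoning model.

```python
from collections import Counter
from typing import Callable


def self_consistent_answer(sample_answer: Callable[[int], str], n_samples: int) -> str:
    """Draw n_samples answers and return the most common one.

    sample_answer(i) is assumed to return the final answer of the i-th sampled
    reasoning chain. Ties go to the answer that was sampled first.
    """
    answers = [sample_answer(i) for i in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```

Raising `n_samples` trades extra inference cost for a more stable answer, which is the basic bargain behind test-time scaling.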

Citations: 0

DR-MIM: Zero-shot cross-lingual transfer via disentangled representation and mutual information maximization

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-09-11, DOI: 10.1016/j.ipm.2025.104389

Wenwen Zhao, Zhisheng Yang, Li Li

Volume 63, Issue 2, Article 104389.

Abstract: Multilingual models have made significant progress in cross-lingual transferability through large-scale pretraining. However, the generated global representations are often mixed with language-specific noise, limiting their effectiveness in low-resource language scenarios. This paper explores how to more efficiently utilize the representations learned by multilingual pretraining models by separating language-invariant features from language-specific ones. To this end, we propose a novel cross-lingual transfer framework, DR-MIM, which explicitly decouples universal and language-specific features, reduces noise interference, and improves model stability and accuracy. Additionally, we introduce a mutual information maximization mechanism to strengthen the correlation between universal features and model outputs, further optimizing the quality of semantic representations. We conducted a systematic evaluation of this method on three cross-lingual natural language understanding benchmark datasets. On the TyDiQA dataset, DR-MIM improved the F1 score by 1.7% and the EM score by 4.5% over the best baseline. To further validate the model’s generalization capability, we introduced two new tasks: paraphrase identification and natural language inference, and designed both within-language and cross-language analysis experiments. All experiments collectively covered 22 languages. Further ablation studies, generalization analysis, and visualization results all confirm the effectiveness and adaptability of our approach.
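
The quantity that a mutual information maximization objective pushes up can be made concrete on a small discrete example. The abstract does not say how DR-MIM estimates mutual information, so this sketch only evaluates the standard definition, I(X;Y) = Σ p(x,y) log [p(x,y) / (p(x) p(y))], on a joint probability table; the function name and table layout are assumptions.

```python
import math


def mutual_information(joint):
    """Mutual information in nats for a joint probability table.

    joint[i][j] is p(X = i, Y = j); the entries are assumed to sum to 1.
    """
    p_x = [sum(row) for row in joint]          # marginal over rows
    p_y = [sum(col) for col in zip(*joint)]    # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (p_x[i] * p_y[j]))
    return mi
```

Independent variables give a value of zero, while two perfectly correlated binary variables give log 2, so a larger value means the features carry more information about the outputs.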

Citations: 0