Information Processing & Management: Latest Articles, Page 10

Multi-resolution leak detection based on shared expert MoE forecasting for natural gas pipelines

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-20, DOI: 10.1016/j.ipm.2025.104353

Xuguang Li, Zhonglin Zuo, Zheng Dong, Hongke Zhao, Luanfei Wan, Hongfang Cheng

Abstract: Natural gas is a critical strategic energy resource, predominantly transported through extensive pipeline networks monitored by Supervisory Control and Data Acquisition (SCADA) systems. Developing accurate deep-learning models for pipeline leak detection using SCADA data is crucial for safeguarding this vital infrastructure. Reliable and timely leak detection remains challenging due to two inherent limitations: (1) severe sample imbalance caused by the rarity of leaks and (2) complex multi-resolution hydraulic patterns that complicate leak characterization. To address these challenges, we propose a novel Multi-Resolution Shared-Expert Mixture-of-Experts (MR-SEMoE) framework for leak detection. The framework employs multivariate time series forecasting, where deviations between predicted and observed sensor values trigger leak alarms through statistical thresholding. Two key innovations jointly enhance detection performance: (1) a shared-expert MoE architecture that improves generalization through cross-expert knowledge transfer, and (2) a multi-resolution analysis framework featuring parallel multi-head forecasters with resolution-specific feature extractors that enable hierarchical representation learning across different time resolutions. Comprehensive experimental evaluations on real-world natural gas pipeline datasets demonstrate that the proposed MR-SEMoE effectively identifies leaks under imbalanced data conditions. Compared with the previous state-of-the-art method, MR-SEMoE improves the F1-score by 1.67%, outperforming contemporary approaches to natural gas pipeline leak detection. To our knowledge, this work constitutes the first successful implementation of the MoE methodology in this domain, facilitating future deployment of large-scale models.

Volume 63, Issue 1, Article 104353

Citations: 0
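
The abstract above says that deviations between predicted and observed sensor values trigger alarms through statistical thresholding, without giving the rule. A minimal sketch of one common choice, assuming a threshold of mean plus k standard deviations over forecast errors from leak-free operation; the function names, the value of `k`, and the readings are hypothetical and are not taken from the paper:

```python
from statistics import mean, stdev


def fit_threshold(calibration_errors, k=3.0):
    """Return mean + k * std of absolute forecast errors seen in normal operation."""
    return mean(calibration_errors) + k * stdev(calibration_errors)


def leak_alarms(predicted, observed, threshold):
    """Flag time steps whose absolute forecast error exceeds the threshold."""
    return [abs(p - o) > threshold for p, o in zip(predicted, observed)]


# Hypothetical absolute errors on leak-free data (arbitrary pressure units).
calibration = [0.10, 0.12, 0.08, 0.11, 0.09]
threshold = fit_threshold(calibration, k=3.0)

# Hypothetical forecasts and sensor readings; the third step drops sharply.
predicted = [5.00, 5.01, 5.02, 5.00]
observed = [5.05, 4.95, 4.20, 5.08]
alarms = leak_alarms(predicted, observed, threshold)
```

Because the threshold is fitted only on normal-operation errors, a scheme of this kind needs no leak samples for calibration, which is consistent with the imbalance problem the abstract describes.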

Digital orientation, knowledge acquisition, and digitization in non-digital native firms

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-19, DOI: 10.1016/j.ipm.2025.104361

Lipeng Pan, Shuchun Liu, Yongqing Li

Abstract: Understanding how non-digital native firms achieve successful digital transformation remains a significant challenge in information management. This study investigates the digital transformation process from a knowledge-based perspective, using a panel dataset comprising 13,882 firm-year observations from 3,109 industrial firms in Zhejiang Province, China (2014–2022). Employing fixed-effects regression and mediation-moderation analyses, the study examines how digital orientation influences digital transformation across three core business processes: procurement, production, and sales. Key findings highlight distinct knowledge acquisition pathways: digital orientation directly enhances production digitization (β = 0.032, p < 0.05), whereas its influence on procurement and sales digitization operates indirectly through tactical (β = 0.058, p < 0.01) and strategic knowledge acquisition (β = 0.041, p < 0.001). Further analysis reveals that absorptive capacity strengthens the effect of tactical knowledge acquisition (β = 0.034, p < 0.1) but does not affect strategic knowledge pathways. The study underscores the importance of targeted knowledge management practices and suggests that firms optimize their digital investments and internal training resources to initiate and sustain digital transformation effectively. Moreover, clarifying strategic goals related to digitalization is essential for guiding firms toward consistent innovation. These insights provide practical guidelines for enhancing digital transformation strategies, ultimately assisting traditional firms in overcoming internal capability constraints and fostering sustainable digitization outcomes.

Volume 63, Issue 1, Article 104361

Citations: 0

MSPF: A multi-semantic prompting fusion framework for emotion-cause pair extraction in conversations

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-19, DOI: 10.1016/j.ipm.2025.104356

Bo Xie, Junhao Wang, Haixia Guo, Pengliang Chen, Hua Zhang, Bo Jiang, Ye Wang, Liwen Chen

Abstract: Emotion-cause pair extraction in conversations (ECPEC) has garnered increasing attention but struggles with modeling multi-turn utterance dependencies. While semantic prompting improves language understanding, its high computational cost hinders widespread ECPEC adoption. To overcome these constraints, we develop a multi-semantic prompting fusion (MSPF) framework by introducing a pair-oriented sampling strategy that focuses on candidate utterance pairs and transforms the ECPEC task into a pair verification problem. This shift enables us to incorporate three specialized semantic prompts (tagging, synonym, and causal claim prompts) designed to enrich the semantics of emotion sentiment and emotion-cause relationships. We further present a knowledge attention module for integrating tagging and synonym prompts, and a two-layer attention pooling module for merging tagging and dual causal claim prompts. Experimental results demonstrate that the proposed MSPF model outperforms the best existing baselines by 4.91%, 4.08%, and 2.86% in F1 score on the ConvECPE, ECPE-D-DD, and ECPE-D-IE (for the domain adaptation experiment) datasets, respectively, with further ablation analysis confirming the effectiveness of the framework.

Volume 63, Issue 1, Article 104356

Citations: 0

Information needs of South Sudanese refugees in North Western Uganda at Bidi Bidi refugee settlement

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-18, DOI: 10.1016/j.ipm.2025.104359

Isaac Mukungu, Patrick Ngulube

Abstract: This study was qualitative in nature and adopted a phenomenological research design. The study population included South Sudanese refugees in North Western Uganda (NWU) as well as refugee administrators at Bidi Bidi refugee settlement, which hosts over 240,000 refugees. A purposively selected sample of 60 participants was used in the study, comprising 50 refugees and 10 refugee administrators. Non-participatory observation and group and individual interviews were used to collect data. Findings from the study revealed that refugees had various information needs, which are presented under the themes of health, education, economic, peace and security, and faith and spiritual categories. Information needs included information on agriculture, employment, saving and investment, and markets, among others. Information needs on health and education were more visible among the refugees, while information needs on the spiritual being and faith of the refugees were the least visible. Adult male refugees were more conscious of security in the settlement and sought this information more than the other categories of refugees in the settlement. On the basis of the study findings, it is recommended that refugee service providers respond to refugees with a focus on their information needs to support their integration and sustenance.

Volume 63, Issue 1, Article 104359

Citations: 0

Evolvable psychology informed neural network for memory behavior modeling

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-16, DOI: 10.1016/j.ipm.2025.104312

Xiaoxuan Shen, Zhihai Hu, Qirong Chen, Pei Wang

Abstract: Memory behavior modeling is a fundamental issue in the fields of cognitive psychology and education. Classical theoretical models of memory are characterized by insufficient accuracy and ongoing controversies, while data-driven memory modeling methods often require large amounts of training data and lack interpretability, highlighting the need for new approaches to memory behavior modeling. This paper integrates classic psychological theories of memory to explore the feasibility of knowledge-driven neural networks in memory behavior modeling. It proposes the EPsyINN model, which combines temporal neural networks with sparse differential regression in a unified framework, enabling the joint optimization of neural networks and classical symbolic models. More specifically, to address the controversies in classical psychological theories and the ambiguity of descriptors, it proposes a descriptor evolution method based on differential operators to achieve precise descriptor characterization and advance the evolution of classical symbolic models. Additionally, it introduces a caching mechanism for regression coefficient matrices and an alternating iterative optimization method for multiple modules, effectively alleviating local optima in model optimization. On five large-scale real-world memory behavior datasets, the proposed method surpasses state-of-the-art memory modeling approaches in predictive accuracy, while the evolved classical symbolic models also achieve performance improvements. Ablation experiments validate the effectiveness of the proposed improvements, and application experiments demonstrate its potential to inspire psychological research. The code for the experiments is available at: https://github.com/hellowads/PsyINN.

Volume 63, Issue 1, Article 104312

Citations: 0

Multi-head divide-and-conquer residual-attention mechanism with pointer network for multimodal question summarization in healthcare

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-16, DOI: 10.1016/j.ipm.2025.104348

S. Priskilla Manonmani, S. Malathi

Abstract: In contemporary medicine, summaries of medical questions are vital for effective and precise patient care. Current techniques handle only text-based summarization without considering the merit of incorporating visual information. To address this gap, this research presents a multimodal summarization system that combines textual queries with medical images to support the extraction of meaningful details. The proposed system has three phases. In the first phase, a gradual fusion decoder bidirectional encoder representation from transformers with vision transformers is used to produce fine-grained feature maps and diagnose diseases. The Multi-Agent Contextualized Diffusion Model (MACDM) is then used to contextualize knowledge using cross-modal information. Lastly, a Multi-head Divide-and-Conquer Residual-Attention mechanism with Pointer Network (MDCRAPN) is used to provide brief and relevant summaries. Furthermore, the hermit crab shell exchange algorithm is integrated to optimize hyperparameters for improved performance. The experimental results indicate that the proposed approach performs better than existing approaches, with a Recall-Oriented Understudy for Gisting Evaluation (ROUGE-1) score of 48.11 on the Multimodal Medical Question Summarization (MMQS) dataset. This approach significantly enhances the identification and summarization of medical disorders, demonstrating the potential to enhance healthcare communication and decision-making.

Volume 63, Issue 1, Article 104348

Citations: 0
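
The ROUGE-1 figure reported in the abstract above is a unigram-overlap metric between a generated summary and a reference summary. A minimal sketch of that computation, assuming lowercase whitespace tokenization with no stemming (published ROUGE toolkits apply their own tokenization, so their numbers can differ); the example sentences are hypothetical and do not come from the MMQS dataset:

```python
from collections import Counter


def rouge1(candidate, reference):
    """Return (recall, precision, f1) from clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return recall, precision, f1


# Hypothetical question summaries.
reference = "what causes a red rash on the arm"
candidate = "what causes red rash on arm"
recall, precision, f1 = rouge1(candidate, reference)
```

Scores of this kind lie in [0, 1]; a value such as 48.11 is presumably the same quantity expressed on a 0 to 100 scale.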

Adaptive confidence-driven learning and cross-modal hard sample mining for unsupervised visible-infrared person re-identification

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-15, DOI: 10.1016/j.ipm.2025.104346

Yifeng Zhang, Canlong Zhang, Haifei Ma, Zhixin Li, Zhiwen Wang, Chunrong Wei

Abstract: This research addresses the critical challenges in Cross-modal Visible-Infrared Person Re-ID (VI-ReID), including significant modal differences, lack of cross-modal correspondence, and pseudo-label noise accumulation. To mitigate these issues, we propose a framework integrating an adaptive multidimensional enhanced clustering method and a confidence-driven dynamic label correction mechanism. Specifically, we design a dynamic clustering framework that leverages neighborhood consistency and intra-class distribution entropy to autonomously model data distributions. The confidence-driven dynamic label correction mechanism employs multi-prototype similarity probability models to filter pseudo-label noise effectively. Moreover, a cross-modal feature alignment strategy based on optimal transport theory addresses many-to-many feature matching between visible and infrared modalities. Additionally, a Hard Sample Aware Contrastive Learning (HCL) strategy is implemented to enhance feature learning in complex data distributions through dynamic feature storage. Extensive experiments conducted on the SYSU-MM01 and RegDB datasets, comprising 29,533 and 4,120 image pairs, respectively, demonstrate the framework’s effectiveness. The proposed method achieves a 3.9% mAP improvement on average compared to state-of-the-art methods, highlighting its advantages in cross-modal feature alignment and pseudo-label optimization.

Volume 63, Issue 1, Article 104346

Citations: 0
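
The abstract above mentions an optimal-transport alignment for many-to-many matching between visible and infrared features but does not name a solver. A minimal sketch using Sinkhorn iterations, a widely used entropy-regularized solver that is swapped in here as an assumption; the uniform marginals, the `epsilon` value, and the cost matrix are likewise hypothetical:

```python
import math


def sinkhorn(cost, epsilon=0.5, iterations=200):
    """Entropy-regularized optimal transport with uniform marginals.

    Returns a plan whose rows sum to 1/n and whose columns sum to 1/m.
    """
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iterations):
        # Alternately rescale rows and columns to match the target marginals.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical distances between 2 visible and 3 infrared cluster prototypes.
cost = [
    [0.1, 0.9, 0.5],
    [0.8, 0.2, 0.5],
]
plan = sinkhorn(cost, epsilon=0.5)
```

Each row of `plan` spreads one visible prototype's mass over several infrared prototypes, which is what makes the matching many-to-many rather than one-to-one; a smaller `epsilon` concentrates the mass on the cheapest pairs.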

Is generative AI reshaping academic practices worldwide? A survey of adoption, benefits, and concerns

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-15, DOI: 10.1016/j.ipm.2025.104350

Ehsan Mohammadi, Mike Thelwall, Yizhou Cai, Taylor Collier, Iman Tahamtan, Azar Eftekhar

Abstract: Although generative AI is transforming academic research and education, little is known about the role, gender, international, and disciplinary variations in uptake and use. This 20-country survey of publishing academics shows widespread awareness and adoption of generative AI tools in academia, but with substantial international and disciplinary differences, and some role and gender differences. In particular, females were 10% less likely to use generative AI frequently (daily or weekly) for research, which may exacerbate gender inequalities. Perhaps surprisingly, the highest adoption rates occurred in some non-Western nations, possibly because of a greater need for translation services. The highest awareness is in the social sciences, perhaps because of the greater need for text analysis. Across all groups, these tools were mainly used for academic writing rather than data analysis and support for critical thinking. Despite this, personalized instruction and problem-solving are among generative AI's most generally claimed benefits. However, participants in all groups were skeptical about the creativity, accuracy, and consistency of AI-generated content in academic contexts. The most significant concerns about using generative AI in academia were inaccuracy, plagiarism, discouraging critical thinking, a lack of transparency and explainability, intellectual property rights violations, and data privacy risks. For policymakers, the findings point to fields and countries that may need action to prevent falling behind, as well as the ongoing need to investigate and monitor the impacts of generative AI on research practices.

Volume 63, Issue 1, Article 104350

Citations: 0

Empowering Arabic diacritic restoration models with robustness, generalization, and minimal diacritization

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-13, DOI: 10.1016/j.ipm.2025.104345

Ruba Kharsa, Ashraf Elnagar, Sane Yagi

Abstract: Arabic diacritization is essential for ensuring accurate pronunciation, clarity, and disambiguation of texts. It is a vital task in Arabic natural language processing. Despite substantial progress in the field, existing models struggle to generalize across the diverse forms of Arabic and perform poorly in noisy, error-prone environments. These limitations may be tied to problems in training data and, more critically, to insufficient contextual understanding. To address these gaps, we present SukounBERT.v2, a BERT-based Arabic diacritization system that is built using a multi-phase approach. We refine the Arabic Diacritization (AD) dataset by correcting spelling mistakes, introducing a line-splitting mechanism, and injecting various forms of noise into the dataset, such as spelling errors, transliterated non-Arabic words, and nonsense tokens. Furthermore, we develop a context-aware training dataset that incorporates explicit diacritic markings and the diacritic naming of classical grammar treatises. Our work also introduces the Sukoun Corpus, a large-scale, diverse dataset comprising over 5.2 million lines and 71 million tokens sourced from Classical Arabic texts, Modern Standard Arabic writings, dictionaries, poetry, and purpose-built contextual sentences. Complementing this is a token-level mapping dictionary that enables minimal diacritization without sacrificing accuracy, a previously unreported feature in Arabic diacritization research. Trained on this enriched dataset, SukounBERT.v2 delivers state-of-the-art performance, with over 55% relative reduction in Diacritic Error Rate (DER) and Word Error Rate (WER) compared to leading models. These results underscore the impact of context-aware and noise-resilient modeling in advancing the field of Arabic text processing.

Volume 63, Issue 1, Article 104345

Citations: 0
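
The abstract above reports results as Diacritic Error Rate (DER) and Word Error Rate (WER). A minimal sketch of how the two are commonly computed, assuming that the reference and prediction share the same base letters, that every base letter counts toward DER, and that only the marks U+064B to U+0652 are treated as diacritics; evaluation conventions vary between papers, and the example sentence is hypothetical:

```python
DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}  # fathatan .. sukun


def split_diacritics(word):
    """Return a list of [base_char, attached_diacritics] pairs."""
    pairs = []
    for ch in word:
        if ch not in DIACRITICS:
            pairs.append([ch, ""])
        elif pairs:
            pairs[-1][1] += ch
    return pairs


def der_wer(reference, prediction):
    """Diacritic and word error rates for texts with identical base letters."""
    ref_words, pred_words = reference.split(), prediction.split()
    if len(ref_words) != len(pred_words):
        raise ValueError("texts must contain the same number of words")
    char_total = char_errors = word_errors = 0
    for ref_word, pred_word in zip(ref_words, pred_words):
        ref_pairs, pred_pairs = split_diacritics(ref_word), split_diacritics(pred_word)
        if [b for b, _ in ref_pairs] != [b for b, _ in pred_pairs]:
            raise ValueError("base letters differ")
        # Sorting makes the comparison independent of mark order (e.g. shadda + vowel).
        wrong = sum(sorted(r) != sorted(p) for (_, r), (_, p) in zip(ref_pairs, pred_pairs))
        char_total += len(ref_pairs)
        char_errors += wrong
        word_errors += 1 if wrong else 0
    return char_errors / char_total, word_errors / len(ref_words)


# "kataba al-waladu" with the final damma mispredicted as a fatha.
reference = "\u0643\u064E\u062A\u064E\u0628\u064E \u0627\u0644\u0648\u064E\u0644\u064E\u062F\u064F"
prediction = "\u0643\u064E\u062A\u064E\u0628\u064E \u0627\u0644\u0648\u064E\u0644\u064E\u062F\u064E"
der, wer = der_wer(reference, prediction)
```

A single wrong mark makes the whole word count as an error, which is why WER is typically much higher than DER for the same system.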

EASeg: Environmental adaptation for weakly-supervised autonomous driving semantic segmentation

IF 6.9, Zone 1 (Management)

Information Processing & Management, Pub Date: 2025-08-13, DOI: 10.1016/j.ipm.2025.104349

Yongqiang Li, Chuanping Hu, Kai Ren, Hao Xi, Jinhao Fan

Abstract: Weakly supervised semantic segmentation (WSSS) offers a promising solution to reduce annotation costs in autonomous driving perception systems. However, existing methods struggle with the complex environmental conditions inherent to real-world driving scenarios, including adverse weather, variable lighting, and challenging visibility conditions. To address these limitations, we introduce EASeg, a novel framework that enhances segmentation robustness across diverse environmental conditions while requiring only image-level supervision. Our approach introduces three key innovations: (1) a multi-scale feature module that captures objects at varying scales, followed by a boundary-aware enhancement component for precise delineation; (2) a dual-stream environmental adaptation mechanism that separately models global weather patterns and local illumination variations; and (3) a reliability-guided feature integration strategy that dynamically combines backbone features with foundation models based on their estimated reliability. Extensive experiments demonstrate that EASeg outperforms previous best methods, increasing mIoU by 24.5% on Cityscapes, 27.5% on CamVid, and 22.5% on WildDash2. Ablation studies confirm that our work represents a significant advancement toward practical, all-weather autonomous driving systems that enhance safety through improved segmentation of small objects and precise boundary delineation, while minimizing annotation requirements.

Volume 63, Issue 1, Article 104349

Citations: 0
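
The mIoU metric used in the abstract above is the mean, over classes, of the intersection-over-union between predicted and ground-truth pixel labels. A minimal sketch, assuming flattened label maps and that classes absent from both maps are skipped (benchmarks differ on this detail); the label values are hypothetical:

```python
def mean_iou(predicted, target, num_classes):
    """Average per-class intersection-over-union, skipping absent classes."""
    ious = []
    for cls in range(num_classes):
        intersection = sum(p == cls and t == cls for p, t in zip(predicted, target))
        union = sum(p == cls or t == cls for p, t in zip(predicted, target))
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps for a 2x3 image with classes 0, 1, 2.
target = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
score = mean_iou(predicted, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. Because every class contributes equally regardless of its pixel count, errors on small objects weigh as much as errors on large regions.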