Information Fusion: Latest Articles

Information-theoretic graph fusion with vision-language-action model for policy reasoning and dual robotic control
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-29 | DOI: 10.1016/j.inffus.2026.104193
Shunlei Li, Longsen Gao, Jin Wang, Chang Che, Xi Xiao, Jiuwen Cao, Yingbai Hu, Hamid Reza Karimi
Volume 131, Article 104193
Abstract: Teaching robots dexterous skills from human videos remains challenging due to the reliance on low-level trajectory imitation, which fails to generalize across object types, spatial layouts, and manipulator configurations. We propose Graph-Fused Vision-Language-Action (GF-VLA), a framework that enables dual-arm robotic systems to perform task-level reasoning and execution directly from RGB(-D) human demonstrations. GF-VLA first extracts Shannon-information-based cues to identify hands and objects with the highest task relevance, then encodes these cues into temporally ordered scene graphs that capture both hand-object and object-object interactions. These graphs are fused with a language-conditioned transformer that generates hierarchical behavior trees and interpretable Cartesian motion commands. To improve execution efficiency in bimanual settings, we further introduce a cross-hand selection policy that infers optimal gripper assignment without explicit geometric reasoning. We evaluate GF-VLA on four structured dual-arm block assembly tasks involving symbolic shape construction and spatial generalization. Experimental results show that the information-theoretic scene representation achieves over 95% graph accuracy and 93% subtask segmentation, supporting the LLM planner in generating reliable and human-readable task policies. When executed by the dual-arm robot, these policies yield 94% grasp success, 89% placement accuracy, and 90% overall task success across stacking, letter-building, and geometric reconfiguration scenarios, demonstrating strong generalization and robustness across diverse spatial and semantic variations.
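The abstract above does not say how the Shannon-information-based cues are computed. As a rough illustration only, the sketch below ranks scene entities by the Shannon entropy of a discretized per-frame state sequence, on the assumption that entities whose state changes more carry more task-relevant information. The `tracks` data, the entity names, and the ranking rule are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def rank_by_entropy(tracks):
    """Return entity names sorted by decreasing entropy of their state sequence."""
    return sorted(tracks, key=lambda name: shannon_entropy(tracks[name]), reverse=True)


# Hypothetical discretized states per frame (e.g. the grid cell an entity occupies).
tracks = {
    "table": ["c0", "c0", "c0", "c0"],       # never changes: 0 bits
    "red_block": ["c1", "c1", "c2", "c2"],   # two equally likely states: 1 bit
    "right_hand": ["c3", "c4", "c5", "c6"],  # four distinct states: 2 bits
}
ranking = rank_by_entropy(tracks)
```

In this toy setup the static table ranks last and the moving hand ranks first. A real pipeline would need a principled choice of state discretization, which the abstract does not describe.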
Citations: 0
Vision-language model with siamese bilateral difference network and text-guided image feature enhancement for acute ischemic stroke outcome prediction on CT angiography
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-29 | DOI: 10.1016/j.inffus.2026.104195
Hulin Kuang, Bin Hu, Shuai Yang, Dongcui Wang, Guanghua Luo, Weihua Liao, Wu Qiu, Shulin Liu, Jianxin Wang
Volume 131, Article 104195
Abstract: Acute ischemic stroke (AIS) outcome prediction is crucial for treatment decisions. However, AIS outcome prediction is challenging due to the combined influence of lesion characteristics, vascular status, and other health conditions. In this study, we introduce a vision-language model with a Siamese bilateral difference network and a text-guided image feature enhancement module for predicting AIS outcome (e.g., modified Rankin Scale, mRS) on CT angiography. In the Siamese bilateral difference network, based on fine-tuning the foundation model LVM-Med, we design an interactive Transformer fine-tuning encoder and a visual question answering-guided bilateral difference awareness module, which generates bilateral difference text via image-text pair question answering as a prompt to enhance the extracted brain vascular difference features. Additionally, in the text-guided image feature enhancement module, we propose a text feature extraction module to extract patient phrase-level and inter-phrase embeddings from clinical notes, and employ a multi-scale image-text interaction module to obtain fine-grained phrase-enhanced image attention features and coarse-grained phrase context-aware image attention features. We validate our model on the public ISLES2024 dataset, a private dataset A, and an external AIS dataset. It achieves accuracies of 81.11%, 83.05%, and 80.00% and AUCs of 80.06%, 85.48%, and 82.62% for 90-day mRS prediction on the three datasets, respectively, outperforming several state-of-the-art methods and demonstrating its generalization ability. Moreover, the proposed method can be effectively extended to glaucoma visual field progression prediction, which is also related to vascular differences and clinical notes.
Citations: 0
Multimodal spatio-temporal fusion: A generalizable GCN-LSTM with attention framework for urban application
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-20 | DOI: 10.1016/j.inffus.2026.104164
Yunfei Guo
Volume 131, Article 104164
Abstract: The proliferation of urban big data presents unprecedented opportunities for understanding cities, yet the analytical methods to harness this data are often fragmented and domain-specific. Existing predictive models in urban computing are typically highly specialized, creating analytical silos that inhibit knowledge transfer and are difficult to adapt across domains such as public safety, housing and transport. This paper confronts this critical gap by developing a generalizable, multimodal spatio-temporal deep learning framework engineered for both high predictive performance and interpretability, which is capable of mastering diverse urban prediction tasks without architectural modification. The hybrid architecture fuses a Multi-Head Graph Convolutional Network (GCN) for spatial diffusion, a Long Short-Term Memory (LSTM) network for temporal dynamics, and a learnable Gating Mechanism that weights the influence of spatial graph versus static external features. To validate this generalizability, the framework was tested on three distinct urban domains in London: crime forecasting, housing price estimation and transport network demand. The model outperformed traditional baselines (ARIMA, XGBoost) and state-of-the-art deep learning models (TabNet, TFT). Moreover, the framework moves beyond prediction to explanation by incorporating attention mechanisms and permutation feature importance analysis.
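The learnable gating mechanism mentioned in the abstract above can be pictured with a common gated-fusion pattern. The sketch below is an assumption about the general technique, not the paper's architecture: a sigmoid gate computed from both inputs blends a spatial (graph) feature vector with a static external feature vector. The function name `gated_fusion`, the parameters `W` and `b`, and the dimensions are hypothetical; in a trained model `W` and `b` would be learned rather than sampled.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(h_spatial, h_static, W, b):
    """Blend two feature vectors with a per-dimension gate in (0, 1).

    gate -> 1 favors the spatial (graph) features,
    gate -> 0 favors the static external features.
    """
    gate = sigmoid(W @ np.concatenate([h_spatial, h_static]) + b)
    fused = gate * h_spatial + (1.0 - gate) * h_static
    return fused, gate


rng = np.random.default_rng(0)
d = 4                                      # hypothetical feature dimension
h_spatial = rng.normal(size=d)             # stand-in for GCN output
h_static = rng.normal(size=d)              # stand-in for static external features
W = rng.normal(size=(d, 2 * d)) * 0.1      # would be learned in practice
b = np.zeros(d)

fused, gate = gated_fusion(h_spatial, h_static, W, b)
```

Because the output is a convex combination, each fused value lies between the corresponding spatial and static values, and inspecting `gate` gives a simple per-dimension reading of which source the model relied on.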
Citations: 0
On the security and privacy of federated learning: A survey with attacks, defenses, frameworks, applications, and future directions
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-16 | DOI: 10.1016/j.inffus.2026.104155
Daniel M. Jimenez-Gutierrez, Yelizaveta Falkouskaya, José L. Hernandez-Ramos, Aris Anagnostopoulos, Ioannis Chatzigiannakis, Andrea Vitaletti
Volume 131, Article 104155
Abstract: Federated Learning (FL) is an emerging distributed machine learning paradigm enabling multiple clients to train a global model collaboratively without sharing their raw data. While FL enhances data privacy by design, it remains vulnerable to various security and privacy threats. This survey provides a comprehensive overview of 203 papers regarding the state-of-the-art attacks and defense mechanisms developed to address these challenges, categorizing them into security-enhancing and privacy-preserving techniques. Security-enhancing methods aim to improve FL robustness against malicious behaviors such as Byzantine attacks, poisoning, and Sybil attacks. At the same time, privacy-preserving techniques focus on protecting sensitive data through cryptographic approaches, differential privacy, and secure aggregation. We critically analyze the strengths and limitations of existing methods, highlight the trade-offs between privacy, security, and model performance, and discuss the implications of non-IID data distributions on the effectiveness of these defenses. Furthermore, we identify open research challenges and future directions, including the need for scalable, adaptive, and energy-efficient solutions operating in dynamic and heterogeneous FL environments. Our survey aims to guide researchers and practitioners in developing robust and privacy-preserving FL systems, fostering advances that safeguard the integrity and confidentiality of collaborative learning frameworks.
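Secure aggregation, one of the privacy-preserving techniques named in the abstract above, is often explained through pairwise masking: each pair of clients shares a random mask that one adds and the other subtracts, so the server sees only masked updates yet recovers the exact sum. The sketch below is a toy illustration of that idea under simplifying assumptions: a single shared random generator stands in for pairwise key agreement, values are floats rather than finite-field elements, and client dropout is ignored. The function name `masked_updates` is hypothetical.

```python
import numpy as np


def masked_updates(updates, seed=0):
    """Add pairwise masks that cancel in the sum.

    For each pair (i, j) with i < j, client i adds a random mask and
    client j subtracts the same mask.
    """
    rng = np.random.default_rng(seed)
    n = len(updates)
    masked = [u.astype(float).copy() for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.normal(size=updates[i].shape)
            masked[i] += mask
            masked[j] -= mask
    return masked


# Hypothetical model updates from three clients.
updates = [np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0])]
masked = masked_updates(updates)

# The server only ever sums the masked vectors.
aggregate = sum(masked)
```

Each individual masked vector looks like noise, while `aggregate` matches the sum of the original updates up to floating-point error.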
Citations: 0
Data fusion for low-cost sensors: A systematic literature review
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-18 | DOI: 10.1016/j.inffus.2026.104124
Gabriel Oduori, Chaira Cocco, Payam Sajadi, Francesco Pilla
Volume 131, Article 104124
Abstract: Data fusion (DF) addresses the challenge of integrating heterogeneous data sources to improve decision-making and inference. Although DF has been widely explored, no prior systematic review has specifically focused on its application to low-cost sensor (LCS) data in environmental monitoring. To address this gap, we conduct a systematic literature review (SLR) following the PRISMA framework, synthesising findings from 82 peer-reviewed articles. The review addresses three key questions: (1) What fusion methodologies are employed in conjunction with LCS data? (2) In what environmental contexts are these methods applied? (3) What are the methodological challenges and research gaps? Our analysis reveals that geostatistical and machine learning approaches dominate current practice, with air quality monitoring emerging as the primary application domain. Additionally, artificial intelligence (AI)-based methods are increasingly used to integrate spatial, temporal, and multimodal data. However, limitations persist in uncertainty quantification, validation standards, and the generalisability of fusion frameworks. This review provides a comprehensive synthesis of current techniques and outlines key directions for future research, including the development of robust, uncertainty-aware fusion methods and broader application to less-studied environmental variables.
Citations: 0
Grading-inspired complementary enhancing for multimodal sentiment analysis
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.inffus.2026.104174
Zhijing Huang, Wen-Jue He, Baotian Hu, Zheng Zhang
Volume 131, Article 104174
Abstract: Due to its strong capacity for integrating heterogeneous multi-source information, multimodal sentiment analysis (MSA) has achieved remarkable progress in affective computing. However, existing methods typically adopt symmetric fusion strategies that treat all modalities equally, overlooking their inherent performance disparities: some modalities excel at discriminative representation, while others carry underutilized supportive cues. This limitation leads to insufficient exploration of cross-modal complementary correlations. To address this issue, we propose a novel Grading-Inspired Complementary Enhancing (GCE) framework for MSA, which is one of the first attempts to conduct dynamic assessment for knowledge transfer in progressive multimodal fusion and cooperation. Specifically, based on cross-modal interaction, a task-aware grading mechanism categorizes modality-pair associations into dominant (high-performing) and supplementary (low-performing) branches according to their task performance. Accordingly, a relation filtering module selectively identifies the trustworthy information from the dominant branch to enhance consistency exploration in supplementary modality pairs with minimized redundancy. Afterwards, a weight adaptation module is adopted to dynamically adjust the guiding weight of individual samples for adaptability and generalization. Extensive experiments conducted on three benchmark datasets show that our proposed GCE approach can outperform the state-of-the-art MSA methods. Our code is available at https://github.com/hka-7/GCEforMSA.
Citations: 0
Shape-aware osteoarthritis network: Bidirectional fusion of MRI and 3D point clouds for knee osteoarthritis diagnosis
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-29 | DOI: 10.1016/j.inffus.2026.104198
Dawei Zhang, Chenglin Sang, Tianyi Lyu
Volume 131, Article 104198
Abstract: Knee osteoarthritis (KOA) is a common degenerative joint disease, and accurate diagnosis and severity grading are crucial for effective treatment. Although deep learning techniques based on X-rays or magnetic resonance imaging (MRI) have greatly improved diagnostic accuracy, two-dimensional images often cannot fully capture the complex three-dimensional morphology and texture changes related to KOA. To address these challenges, we propose a shape-aware osteoarthritis diagnostic network, a novel bidirectional cross-modal fusion framework that integrates 3D point clouds and MRI sequences. The framework consists of three parts: (1) a local-relation-aware dynamic graph convolutional neural network (CNN) that extracts complex geometric features from point clouds representing the surfaces of knee joint bones and cartilage; (2) for MRI sequences, a sequence aggregation method that combines a 2D CNN for spatial feature extraction with a self-attention mechanism across slice sequences; and (3) a bidirectional cross-modal fusion module that performs in-depth interactive feature learning between the geometric domain of the point clouds and the texture-spatiotemporal domain of MRI, enabling the two modalities to refine and enhance each other’s representations. Extensive experiments on a large cohort from the Osteoarthritis Initiative (OAI) show that our model achieves state-of-the-art performance. Its accuracy in the challenging 5-level Kellgren-Lawrence (KL) classification is 0.73, a relative improvement of approximately 23.7% over the 0.59 achieved by using 3D shape features alone in the ShapeMed-Knee benchmark. Furthermore, its AUC in binary OA diagnosis is 0.95, significantly better than existing unimodal and multimodal baselines.
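The 23.7% figure in the abstract above is a relative gain, not a difference in accuracy points. A quick check using only the two accuracies the abstract reports (0.73 and 0.59) makes the distinction concrete; the function name `relative_improvement` is just an illustrative helper.

```python
def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline


new_accuracy = 0.73        # reported 5-level KL classification accuracy
baseline_accuracy = 0.59   # reported accuracy using 3D shape features alone

gain = relative_improvement(new_accuracy, baseline_accuracy)
gain_percent = round(100 * gain, 1)                                      # relative gain in percent
absolute_points = round(100 * (new_accuracy - baseline_accuracy), 1)     # gain in accuracy points
```

Here `gain_percent` comes out at about 23.7, matching the abstract, while the absolute gain is 14 accuracy points.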
Citations: 0
Code-driven programming prediction enhanced by LLM with a feature fusion approach
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-20 | DOI: 10.1016/j.inffus.2026.104165
Shengyingjie Liu, Jianxin Li, Qian Wan, Bo He, Zhijun Huang, Qing Li
Volume 131, Article 104165
Abstract: Programming education is essential for equipping individuals with digital literacy skills and developing the problem-solving abilities necessary for success in the modern workforce. In online programming tutoring systems, knowledge tracing (KT) techniques are crucial for programming prediction, as they monitor user performance and model user cognition. However, both universal and programming-specific KT methods depend on traditional state-driven paradigms that indirectly predict programming outcomes based on users’ knowledge states. This does not align with the core objective of programming prediction, which is to determine whether submitted code can solve the question. To address this, we present the code-driven feature fusion KT (CFKT), which integrates large language models (LLMs) and encoders for both individualized and common code features. It consists of two modules: pass prediction and code prediction. The pass prediction module leverages an LLM to incorporate semantic information from the question and code through embedding, extracting key features that determine code correctness through proxy tasks and effectively narrowing the solution space with vectorization. The code prediction module integrates user historical data and data from other users through feature fusion blocks, allowing for accurate predictions of submitted code and effectively mitigating the cold start problem. Experiments on multiple real-world public programming datasets demonstrate that CFKT significantly outperforms existing baseline methods.
Citations: 0
Crowdsourced federated learning with inconsistent label representation
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-30 | DOI: 10.1016/j.inffus.2026.104194
Yunlong He, Fei Chen, Hanlin Zhang, Jia Yu
Volume 131, Article 104194
Abstract: When personalized federated learning meets crowdsourced label annotation, it can potentially form a complete ecosystem from large-scale data labeling, through model training on massive numbers of devices, toward flexible service for diverse end users. In practice, most crowdsourced annotators can hardly follow a uniform annotation guideline and instead annotate in their own way. Even though they may share cognitive consistency in perception, the label annotation can still be expressed in various ways. This situation can be particularly serious in the federated learning scenario, in which the diverse label expressions are always kept locally in distributed clients for privacy reasons and can hardly be unified. In this work, we propose CrowdFed, a systematic solution for crowdsourced federated learning systems with an underlying label representation skew issue. Specifically, the global model is trained through federated learning for global categorical alignment, and the personalized layers are learned through an auxiliary network in each client for local representation alignment. Furthermore, a category-level similarity matching strategy is presented for the alignment of inconsistent label representations between the local category and the global category. Evaluated on four benchmark datasets, our proposed strategy shows its superiority in terms of system efficiency and cost.
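The abstract above mentions a category-level similarity matching strategy but gives no formula. One plausible reading, sketched here purely as an assumption, is to compare per-category prototype vectors (for example, mean feature embeddings) from a client against global prototypes using cosine similarity, and to map each local label to its most similar global category. The prototype values and the function name `match_categories` are hypothetical and not taken from the paper.

```python
import numpy as np


def match_categories(local_protos, global_protos):
    """Map each local category to its most cosine-similar global category.

    Both inputs have shape (num_categories, feature_dim).
    Returns the mapping (one global index per local category) and the
    full similarity matrix.
    """
    local_unit = local_protos / np.linalg.norm(local_protos, axis=1, keepdims=True)
    global_unit = global_protos / np.linalg.norm(global_protos, axis=1, keepdims=True)
    similarity = local_unit @ global_unit.T
    return similarity.argmax(axis=1), similarity


# Hypothetical global prototypes for three categories.
global_protos = np.array([[1.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0],
                          [0.0, 0.0, 1.0]])

# Hypothetical client whose label ids are a permutation of the global ones.
local_protos = np.array([[0.1, 0.9, 0.0],
                         [0.0, 0.2, 0.8],
                         [0.9, 0.0, 0.1]])

mapping, similarity = match_categories(local_protos, global_protos)
```

In this toy case local labels 0, 1, 2 map to global categories 1, 2, 0. A row-wise argmax does not guarantee a one-to-one mapping; if that is required, an assignment solver such as `scipy.optimize.linear_sum_assignment` applied to the negated similarity matrix would be one option.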
Citations: 0
An adaptive regularized topological segmentation network integrating inter-class relations and occlusion information for vehicle component recognition
IF 15.5, Zone 1, Computer Science
Information Fusion | Pub Date: 2026-07-01 | Epub Date: 2026-01-17 | DOI: 10.1016/j.inffus.2026.104157
Xunqi Zhou, Zhenqi Zhang, Zifeng Wu, Qianming Wang, Jing Teng, Jinlong Liu, Yongjie Zhai
Volume 131, Article 104157
Abstract: In intelligent vehicle damage assessment, component recognition faces challenges such as significant intra-class variability and minimal inter-class differences, which hinder detection, as well as occlusions and ambiguous boundaries, which complicate segmentation. We generalize these problems into three core aspects: inter-object relational modeling, semantic-detail information balancing, and occlusion-aware decoupling. To this end, we propose the Adaptive Regularized Topological Segmentation (ARTSeg) network, comprising three complementary modules: Inter-Class Graph Constraint (ICGC), Constrained Detail Feature Backtracking (CDFB), and Topological Decoupling Segmentation (TDS). Each module is purposefully designed, integrated in a progressive structure, and synergistically reinforces the others to enhance overall performance. Specifically, ICGC clusters intra-class features and establishes implicit topological constraints among categories during feature extraction, enabling the model to better capture inter-class relationships and improve detection representation. Subsequently, CDFB evaluates the impact of channel-wise feature information within each candidate region on segmentation accuracy and computational cost, dynamically selecting appropriate feature resolutions for individual instances while balancing the demands of detection and segmentation tasks. Finally, TDS introduces topological associations between occluded and occluding regions at the feature level and decouples them at the task level, explicitly modeling generalized occlusion regions and enhancing segmentation performance. We quantitatively and qualitatively evaluate ARTSeg on a 59-category vehicle component dataset constructed for insurance damage assessment, achieving notable improvements in addressing the aforementioned problems. Experiments on two public datasets, DSMLR and Carparts, further validate the generalization capability of the proposed method. Results indicate that ARTSeg provides practical guidance for component recognition in intelligent vehicle damage assessment.
Citations: 0