Latest Publications in Information Fusion

Map4comm: A map-aware collaborative perception framework with efficient-bandwidth information fusion
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-29, DOI: 10.1016/j.inffus.2025.103567
Huan Qiu, Jian Zhou, Bijun Li, Qin Zou, Youchen Tang, Man Luo
{"title":"Map4comm: A map-aware collaborative perception framework with efficient-bandwidth information fusion","authors":"Huan Qiu ,&nbsp;Jian Zhou ,&nbsp;Bijun Li ,&nbsp;Qin Zou ,&nbsp;Youchen Tang ,&nbsp;Man Luo","doi":"10.1016/j.inffus.2025.103567","DOIUrl":"10.1016/j.inffus.2025.103567","url":null,"abstract":"<div><div>V2I (Vehicle-to-Infrastructure) collaborative perception enhances the ability to perceive dynamic driving environments by sharing multi-viewpoint information from the same scene through communication, gradually becoming an essential part of intelligent transportation systems. However, it inevitably introduces an inherent trade-off between communication bandwidth and perception performance. To address this bottleneck, we introduce a map-mask precisely aligned with perceptual spatial features. This mask can accurately filter out the background of the real-time perceptual feature information so as to selectively extract the perceptually critical areas as communication content. Based on this novel map-mask, we propose Map4comm, a unified map-aware collaborative perception framework, to achieve an efficient balance between communication bandwidth and perception performance. In order to save communication bandwidth, Map4comm introduces a Local Communication Area Selection (LCAS) mechanism based on map-mask to optimize the communication area selection of the system. In terms of performance, Map4comm presents an Adaptive Covoxel Feature Alignment (ACFA) strategy to achieve coarse alignment of vehicle–infrastructure-map heterogeneous low-dimensional voxel features, which in turn improves the overall perceptual performance. Based on these two approaches, Map4comm realizes an efficient trade-off between communication bandwidth and perception performance. To evaluate Map4comm, we conducted mapping and testing on the large-scale vehicle–infrastructure collaborative sequential perception dataset V2X-Seq-SPD. The experimental results show that Map4comm outperforms all other collaborative perception methods in terms of perceptual performance while realizing the least communication transmission cost compared to the state-of-the-art collaborative perception methods.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103567"},"PeriodicalIF":15.5,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144724906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
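The bandwidth-saving idea behind LCAS, transmitting only the feature cells that a map marks as perceptually relevant, can be made concrete with a minimal sketch. All names and shapes here (`select_communication_regions`, a 64-channel BEV grid, the 10% foreground ratio) are hypothetical illustrations, not the paper's implementation:

```python
import numpy as np

def select_communication_regions(bev_feat: np.ndarray, map_mask: np.ndarray):
    """Keep only feature cells that the map marks as perceptually relevant.

    bev_feat: (C, H, W) bird's-eye-view feature map from the sender.
    map_mask: (H, W) binary mask, 1 = drivable/critical area, 0 = background.
    Returns the sparse payload: kept feature vectors and their (row, col) indices.
    """
    rows, cols = np.nonzero(map_mask)          # indices of foreground cells
    payload = bev_feat[:, rows, cols].T        # (N_kept, C) feature vectors
    return payload, np.stack([rows, cols], 1)  # transmit both to the receiver

def reassemble(payload, indices, shape):
    """Receiver side: scatter the sparse payload back onto an empty BEV grid."""
    bev = np.zeros(shape, dtype=payload.dtype)         # (C, H, W)
    bev[:, indices[:, 0], indices[:, 1]] = payload.T
    return bev

# Toy example: a 64-channel 100x100 BEV map where 10% of cells are foreground.
feat = np.random.randn(64, 100, 100).astype(np.float32)
mask = (np.random.rand(100, 100) < 0.10).astype(np.uint8)
payload, idx = select_communication_regions(feat, mask)
print(payload.nbytes / feat.nbytes)  # ~0.10: feature bytes sent vs. the full map
```

The receiver needs only the kept vectors plus their grid indices, which is where the bandwidth saving comes from; the real mechanism additionally has to align the mask with heterogeneous sensor features.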
Three-way decision oriented rational behavior multi-attribute decision-making
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-29, DOI: 10.1016/j.inffus.2025.103538
Tingquan Deng, Wenjie Wang, Chaoyue Wang, Jianming Zhan
{"title":"Three-way decision oriented rational behavior multi-attribute decision-making","authors":"Tingquan Deng ,&nbsp;Wenjie Wang ,&nbsp;Chaoyue Wang ,&nbsp;Jianming Zhan","doi":"10.1016/j.inffus.2025.103538","DOIUrl":"10.1016/j.inffus.2025.103538","url":null,"abstract":"<div><div>Three-way decision is a methodology that utilizes human cognitive thinking to handle multi-attribute decision-making with deferral decision-making. The key to achieving this goal is to reasonably fuse evaluation values from multiple attributes of alternatives to extract rules for classifying and ranking alternatives. There have been already lots of literature addressing this issue by considering the regret psychology of decision-makers. However, the regret theory oriented multi-attribute decision-making may lead to irrational or incomplete decision strategies. To tackle such a challenge, this paper introduces the idea of game theory into behavior three-way decision based multi-attribute decision-making (GBMADM) to weighted fuse regret and rejoice values of each alternative across all attributes to extract game rules for classifying and ranking alternatives. Firstly, a decision state set is fuzzified with the idea of TOPSIS and the weight of each attribute is determined based on the self-information of approximate accuracy from generalized fuzzy rough set theory. Secondly, a pair of utility functions is introduced to act as players to play in game. Two pairs of weighted regret functions and weighted rejoice functions are achieved by fusing the regret values and rejoice values of each alternative across all attributes, respectively. A payoff matrix for each alternative is then constructed by fusing the weighted regret value and weighted rejoice value. All decision rules for classification and ranking are thereafter extracted from the payoff matrix through optimal Nash equilibrium solutions. Finally, a practical example and simulation experiments on two benchmark datasets demonstrate the superiority of the proposed method compared to representative methods.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103538"},"PeriodicalIF":15.5,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
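To illustrate the regret/rejoice fusion step numerically, the sketch below uses the standard exponential regret function from regret theory and collapses the paper's payoff-matrix-and-Nash-equilibrium machinery into a simple net score for ranking. The function name, the toy data, and the delta coefficient are assumptions:

```python
import numpy as np

def regret_rejoice_fusion(V, w, delta=0.3):
    """Weighted fusion of regret and rejoice values (illustrative only).

    V: (n_alternatives, n_attributes) evaluation matrix, normalized to [0, 1].
    w: (n_attributes,) attribute weights summing to 1.
    delta: regret-aversion coefficient of the exponential regret function
           R(du) = 1 - exp(-delta * du), a standard form in regret theory.
    """
    best, worst = V.max(axis=0), V.min(axis=0)
    regret = -(1.0 - np.exp(-delta * (best - V)))   # vs. attribute-wise ideal
    rejoice = 1.0 - np.exp(-delta * (V - worst))    # vs. attribute-wise anti-ideal
    # Weighted fusion across attributes yields one (regret, rejoice) pair per
    # alternative; GBMADM would build a payoff matrix from such pairs and
    # extract rules via Nash equilibria, which this sketch replaces with a
    # plain net score for ranking.
    wr, wj = regret @ w, rejoice @ w
    return wj + wr                                   # net perceived utility

V = np.array([[0.8, 0.6, 0.9],
              [0.5, 0.9, 0.7],
              [0.9, 0.4, 0.6]])
w = np.array([0.5, 0.3, 0.2])
print(np.argsort(-regret_rejoice_fusion(V, w)))  # alternatives, best first
```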
Query-guided multimodal learning network for anomaly detection underneath high-speed trains
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-28, DOI: 10.1016/j.inffus.2025.103530
Wei Liu, Xiaobo Lu, Yun Wei, Zhidan Ran
{"title":"Query-guided multimodal learning network for anomaly detection underneath high-speed trains","authors":"Wei Liu ,&nbsp;Xiaobo Lu ,&nbsp;Yun Wei ,&nbsp;Zhidan Ran","doi":"10.1016/j.inffus.2025.103530","DOIUrl":"10.1016/j.inffus.2025.103530","url":null,"abstract":"<div><div>Anomaly detection on the bottom of high-speed trains is crucial for train safety. However, the complexity and variability of anomalies, along with the intricate environment in which they occur, pose significant challenges to timely detection. To address this issue, we propose the Query-guided Multimodal Learning Network (QMLNet) that exploits multimodal information to discover anomalies. Specifically, in QMLNet, the CNN-Transformer Feature Fusion (CFF) Module uses queries to guide the learning and fusion of multi-level visual features, enriching the expressiveness of features at each level for improved prediction of anomaly masks. The Attention-based Mask Refinement (AMR) module generates masks based on the attention mechanism to enhance the features, learns the features using queries, and obtains a global representation of the features at different levels, which will be used along with the textual features for better prediction of the anomaly categories. Compared to state-of-the-art methods, the proposed method achieves superior results on our anomaly dataset, leading by a significant margin. In addition, our method achieves the best performance on the public defect detection dataset, the VISION dataset, significantly outperforming most methods, which proves the generalization and robustness of our approach.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103530"},"PeriodicalIF":15.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144722019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
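A minimal sketch of what query-guided fusion of multi-level features can look like: a set of learnable queries cross-attends to each feature level, and the per-level summaries are pooled. The layer sizes, number of queries, and mean-based fusion are illustrative choices, not the CFF module's actual design:

```python
import torch
import torch.nn as nn

class QueryGuidedFusion(nn.Module):
    """Learnable queries cross-attend to flattened multi-level feature maps."""
    def __init__(self, dim=256, n_queries=16, n_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, dim))
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, feature_levels):
        # feature_levels: list of (B, C, H_i, W_i) maps from CNN/Transformer stages
        B = feature_levels[0].shape[0]
        q = self.queries.unsqueeze(0).expand(B, -1, -1)   # (B, Nq, C)
        per_level = []
        for f in feature_levels:
            kv = f.flatten(2).transpose(1, 2)             # (B, H*W, C)
            out, _ = self.attn(q, kv, kv)                 # queries attend to level
            per_level.append(out)
        return torch.stack(per_level).mean(0)             # (B, Nq, C) fused summary

fused = QueryGuidedFusion()([torch.randn(2, 256, 32, 32),
                             torch.randn(2, 256, 16, 16)])
print(fused.shape)  # torch.Size([2, 16, 256])
```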
LLaVA-based semantic feature modulation diffusion model for underwater image enhancement
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-28, DOI: 10.1016/j.inffus.2025.103566
Guodong Fan, Shengning Zhou, Zhen Hua, Jinjiang Li, Jingchun Zhou
{"title":"LLaVA-based semantic feature modulation diffusion model for underwater image enhancement","authors":"Guodong Fan ,&nbsp;Shengning Zhou ,&nbsp;Zhen Hua ,&nbsp;Jinjiang Li ,&nbsp;Jingchun Zhou","doi":"10.1016/j.inffus.2025.103566","DOIUrl":"10.1016/j.inffus.2025.103566","url":null,"abstract":"<div><div>Underwater Image Enhancement (UIE) is critical for numerous marine applications; however, existing methods often fall short in addressing severe color distortion, detail loss, and lack of semantic understanding, particularly under spatially varying degradation conditions. While Generative AI (GenAI), particularly diffusion models and multimodal large language models (MLLMs), offers new prospects for UIE, effectively leveraging their capabilities for fine-grained, semantic-aware enhancement remains a challenge. We proposed a LLaVA-based semantic feature modulation diffusion model (LSFM-Diff), which integrates multi-level semantic guidance collaboratively into the backbone network of the diffusion model. Specifically, an optimized prompt learning strategy is first employed to obtain concise, UIE-relevant textual descriptions from LLaVA. These semantics then guide the enhancement process in two key stages: (1) The windowed text-image fusion for condition refinement (WTIF-CR) module aligns and fuses textual semantics with local image features spatially, generating fine-grained external conditions that provide an initial spatially aware semantic blueprint for the diffusion model. (2) The semantic-guided deformable attention (SGDA) mechanism, leveraging a gradient-based image-text interaction to generate a semantic navigation map, guides the attention within the denoising network to focus on key semantic regions. Experiments conducted on several challenging benchmark datasets demonstrate that LSFM-Diff outperforms current state-of-the-art methods. Our work highlights the effectiveness of deep integration of multi-level semantic guidance fusion strategies in advancing GenAI-based UIE development.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103566"},"PeriodicalIF":15.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144724929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
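The windowed text-image idea, letting each spatial window of the image feature map attend to the text tokens so that conditioning can vary across space, can be sketched as follows. Window size, dimensions, and the residual injection are assumptions rather than the WTIF-CR design:

```python
import torch
import torch.nn as nn

class WindowedTextImageFusion(nn.Module):
    """Each non-overlapping image window cross-attends to the text tokens,
    so different regions can pick up different parts of the description."""
    def __init__(self, dim=128, n_heads=4, win=8):
        super().__init__()
        self.win = win
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, img_feat, text_tokens):
        # img_feat: (B, C, H, W) with H, W divisible by win
        # text_tokens: (B, T, C) embeddings of the LLaVA-derived description
        B, C, H, W = img_feat.shape
        w = self.win
        # (B, C, H, W) -> (B * num_windows, win*win, C)
        x = img_feat.reshape(B, C, H // w, w, W // w, w)
        x = x.permute(0, 2, 4, 3, 5, 1).reshape(-1, w * w, C)
        nwin = x.shape[0] // B
        txt = text_tokens.repeat_interleave(nwin, dim=0)  # one copy per window
        x = x + self.attn(x, txt, txt)[0]                 # residual text injection
        # invert the windowing back to (B, C, H, W)
        x = x.reshape(B, H // w, W // w, w, w, C)
        return x.permute(0, 5, 1, 3, 2, 4).reshape(B, C, H, W)

out = WindowedTextImageFusion()(torch.randn(2, 128, 32, 32), torch.randn(2, 12, 128))
print(out.shape)  # torch.Size([2, 128, 32, 32])
```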
InKrat: Interpretable diagnosis prediction models based on cross-modal knowledge graph semantic retrieval fusion
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-28, DOI: 10.1016/j.inffus.2025.103546
Qing Li, Zehao Li, Jingjing Song, Jianshuo Bao, Jin Yang, Zhuhong You
{"title":"InKrat: Interpretable diagnosis prediction models based on cross-modal knowledge graph semantic retrieval fusion","authors":"Qing Li,&nbsp;Zehao Li,&nbsp;Jingjing Song,&nbsp;Jianshuo Bao,&nbsp;Jin Yang,&nbsp;Zhuhong You","doi":"10.1016/j.inffus.2025.103546","DOIUrl":"10.1016/j.inffus.2025.103546","url":null,"abstract":"<div><div>Although deep learning models have made breakthrough achievements in many fields, they still face some challenges in diagnostic prediction tasks in healthcare. Existing methods either use graph structures or sequence structures one-sidedly or disjointedly, failing to obtain high-quality representations of EMR data. Some knowledge-enhanced methods rely on strategies based on name or identifier matching, lacking flexibility while introducing semantically mismatched noise. On the other hand, attention-based models for interpretable analysis can only provide the importance of different factors rather than intuitive and easily understandable natural language descriptions. To address the above issues, we propose InKrat, a new KG-enhanced method. Specifically, we designed a novel temporal graph structure that models the structure and temporal information in EMR by integrating anchor nodes as a bridge. We also developed a cross-modal semantic retrieval method, utilizing a large language model (LLM) to compute the semantic similarity between the KG and medical notes, filtering the knowledge accordingly. Finally, based on the knowledge prompts, the LLM generates interpretable descriptions of the prediction results. We have extensively validated the effectiveness of InKrat through experiments on two commonly used real-world datasets. The results demonstrate that our proposed method achieves state-of-the-art performance. Our code can be found at <span><span>https://github.com/lzh-nwpu/InKrat</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103546"},"PeriodicalIF":15.5,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
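The semantic retrieval step, scoring verbalized KG triples against the note text and keeping only close matches, can be sketched as below. Here `embed`, `top_k`, and `min_sim` are hypothetical stand-ins for the LLM-based similarity computation, and the toy embedder exists only to make the example runnable:

```python
import numpy as np

def filter_kg_triples(triples, note_text, embed, top_k=5, min_sim=0.4):
    """Keep only KG triples semantically close to the clinical note.

    `embed` is assumed to map a string to a unit-norm vector (e.g., an
    embedding endpoint of the LLM used in the paper). Verbalizing a triple
    as "head relation tail" is a common simplification of the retrieval step.
    """
    note_vec = embed(note_text)
    verbalized = [" ".join(t) for t in triples]        # (head, rel, tail) -> text
    sims = np.array([embed(v) @ note_vec for v in verbalized])  # cosine (unit norm)
    order = np.argsort(-sims)[:top_k]
    return [triples[i] for i in order if sims[i] >= min_sim]

def toy_embed(text, dim=64):
    """Stand-in embedder: sum seeded random word vectors, then normalize."""
    vecs = [np.random.default_rng(abs(hash(w)) % 2**32).standard_normal(dim)
            for w in text.lower().split()]
    v = np.sum(vecs, axis=0)
    return v / np.linalg.norm(v)

triples = [("metformin", "treats", "type 2 diabetes"),
           ("aspirin", "treats", "headache"),
           ("insulin", "regulates", "blood glucose")]
print(filter_kg_triples(triples, "patient with type 2 diabetes on metformin", toy_embed))
```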
Cross-modality-enhanced visual Scene Graph Generation
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-27, DOI: 10.1016/j.inffus.2025.103430
Fei Yu, Hui Ji, Yuehua Li
{"title":"Cross-modality-enhanced visual Scene Graph Generation","authors":"Fei Yu ,&nbsp;Hui Ji ,&nbsp;Yuehua Li","doi":"10.1016/j.inffus.2025.103430","DOIUrl":"10.1016/j.inffus.2025.103430","url":null,"abstract":"<div><div>Humans perceive scenes through multisensory cues, yet existing Scene Graph Generation (SGG) methods predominantly rely on visual input alone, neglecting the complementary information provided by auditory signals and cross-modal interactions. To overcome this limitation, we propose Audio-Enhanced Scene Graph Generation (AESGG), a novel framework that integrates audio cues to enhance both object detection and relation prediction. AESGG improves visual object proposals by incorporating aligned audio features, thereby reducing ambiguity in detection. It further employs a spatio-temporal transformer to model dynamic inter-object relationships over time. A self-supervised learning strategy is introduced to capture relation transitions across video frames effectively. To facilitate research in audio-visual scene understanding, we also present the VALM dataset. Experimental results demonstrate that AESGG consistently outperforms state-of-the-art baselines, achieving up to a 2.0 percentage point improvement in relation prediction metrics (R@50, PredCls, with constraints), reflecting its robust and generalizable performance gains.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103430"},"PeriodicalIF":15.5,"publicationDate":"2025-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144722020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
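One simple way to realize "improving visual object proposals by incorporating aligned audio features" is sigmoid gating over the concatenated evidence, sketched below. The gating form and all dimensions are assumptions, not AESGG's actual fusion:

```python
import torch
import torch.nn as nn

class AudioGatedProposalEnhancer(nn.Module):
    """Modulate each proposal feature by a gate computed from the joint
    visual-audio evidence, so sounding objects get reinforced."""
    def __init__(self, vis_dim=256, aud_dim=128):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(vis_dim + aud_dim, vis_dim),
                                  nn.Sigmoid())

    def forward(self, proposals, audio):
        # proposals: (B, N, vis_dim) per-object features from the detector
        # audio:     (B, aud_dim) clip-level audio embedding for the same frame
        a = audio.unsqueeze(1).expand(-1, proposals.shape[1], -1)
        g = self.gate(torch.cat([proposals, a], dim=-1))   # (B, N, vis_dim)
        return proposals * g                               # gated proposals

enh = AudioGatedProposalEnhancer()(torch.randn(2, 10, 256), torch.randn(2, 128))
print(enh.shape)  # torch.Size([2, 10, 256])
```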
Goal-oriented multi-robot collaborative source search with dynamic exploration-exploitation balance in large-scale constrained areas
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-26, DOI: 10.1016/j.inffus.2025.103539
Mengyu Yan, Zhengqiu Zhu, Yong Zhao, Bin Chen, Yatai Ji, Kai Xu, Shuohao Li
{"title":"Goal-oriented multi-robot collaborative source search with dynamic exploration–exploitation balance in large-scale constrained areas","authors":"Mengyu Yan ,&nbsp;Zhengqiu Zhu ,&nbsp;Yong Zhao ,&nbsp;Bin Chen ,&nbsp;Yatai Ji ,&nbsp;Kai Xu ,&nbsp;Shuohao Li","doi":"10.1016/j.inffus.2025.103539","DOIUrl":"10.1016/j.inffus.2025.103539","url":null,"abstract":"<div><div>Rapid localization of unknown gas leak sources in urban areas is critical for effective emergency response and impact mitigation. While deploying autonomous robots to assess and localize emission sources has proven effective, current approaches are inadequate in a large-scale, constrained area. To address this, we propose a <strong>G</strong>oal-oriented multi-<strong>R</strong>obot coll<strong>A</strong>borative <strong>S</strong>ource <strong>S</strong>earch (GRASS) framework for large-scale constrained environments. This framework employs a three-step coupled strategy—goal determination, allocation, and execution—leveraging the fusion of posterior probabilities and hybrid sensed information to achieve efficient and reliable gas source localization. Specifically, goal determination module defines two distinct goals to dynamically balance exploration (reducing estimation uncertainty through systematic coverage) and exploitation (directing robots toward estimated source locations via Gaussian Mixture Models). Moreover, goal allocation module adapts to source estimation reliability and local context, enabling robots to prioritize exploration during sparse data periods and shift to exploitation as estimations improve. In terms of goal execution, it resolves conflicts between goal pursuit and collision avoidance through designing adaptive path planning mechanisms that integrates a modified A* algorithm with a contour-tracing method. Finally, extensive simulations demonstrate that GRASS significantly outperforms two baseline methods, achieving higher success rates (increasing at least 9%) and requiring less search time (reducing at least 215.95 s) in various settings. These advantages are also confirmed by a real-world case study. Our work advances information fusion-driven environmental monitoring for resilient cities by providing an autonomous solution for source localization in complex urban environments.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103539"},"PeriodicalIF":15.5,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144721903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
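The exploration-exploitation switch in goal determination can be sketched with a Gaussian Mixture Model over posterior samples: explore while the fitted belief is still diffuse, exploit once it concentrates. The variance threshold and the nearest-frontier rule are illustrative stand-ins for the paper's reliability criteria:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def determine_goal(source_samples, frontier_cells, var_threshold=4.0):
    """Pick an exploration or exploitation goal from the current belief.

    source_samples: (N, 2) particles approximating the posterior over the
    source location. If the fitted mixture is still too spread out (high
    total variance), explore the frontier cell nearest the sample centroid;
    otherwise exploit by heading to the dominant component's mean.
    """
    gmm = GaussianMixture(n_components=2, covariance_type="full",
                          random_state=0).fit(source_samples)
    total_var = sum(np.trace(c) for c in gmm.covariances_)
    if total_var > var_threshold:                      # belief too uncertain
        centroid = source_samples.mean(axis=0)
        d = np.linalg.norm(frontier_cells - centroid, axis=1)
        return "explore", frontier_cells[np.argmin(d)]
    best = np.argmax(gmm.weights_)                     # dominant hypothesis
    return "exploit", gmm.means_[best]

rng = np.random.default_rng(0)
samples = rng.normal([50.0, 30.0], 0.5, size=(200, 2))    # concentrated belief
frontiers = np.array([[10.0, 10.0], [80.0, 20.0], [55.0, 70.0]])
print(determine_goal(samples, frontiers))                 # -> ("exploit", ...)
```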
M2WLLM: Multi-modal multi-task ultra-short-term wind power prediction algorithm based on large language model
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-26, DOI: 10.1016/j.inffus.2025.103541
Hang Fan, Mingxuan Li, Zuhan Zhang, Long Cheng, Yujian Ye, Weican Liu, Dunnan Liu
{"title":"M2WLLM: Multi-modal multi-task ultra-short-term wind power prediction algorithm based on large language model","authors":"Hang Fan ,&nbsp;Mingxuan Li ,&nbsp;Zuhan Zhang ,&nbsp;Long Cheng ,&nbsp;Yujian Ye ,&nbsp;Weican Liu ,&nbsp;Dunnan Liu","doi":"10.1016/j.inffus.2025.103541","DOIUrl":"10.1016/j.inffus.2025.103541","url":null,"abstract":"<div><div>The integration of wind energy into power grids necessitates accurate ultra-short-term wind power forecasting to ensure grid stability and optimize resource allocation. This study introduces M2WLLM, an innovative model that leverages the capabilities of Large Language Models (LLMs) for predicting wind power output at granular time intervals. M2WLLM overcomes the limitations of traditional and deep learning methods by seamlessly integrating textual information and temporal numerical data, significantly improving wind power forecasting accuracy through multi-modal data. Its architecture features a Prompt Embedder and a Data Embedder, enabling an effective fusion of textual prompts and numerical inputs within the LLMs framework. The Semantic Augmenter within the Data Embedder translates temporal data into a format that the LLMs can comprehend, enabling it to extract latent features and improve prediction accuracy. The empirical evaluations conducted on wind farm data from three Chinese provinces demonstrate that M2WLLM consistently outperforms existing methods, such as Generative Pre-trained Transforme for Time Series (GPT4TS), across various datasets and prediction horizons. The results highlight LLMs’ ability to enhance accuracy and robustness in ultra-short-term forecasting and showcase their strong few-shot learning capabilities.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103541"},"PeriodicalIF":15.5,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144721904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
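A minimal sketch of the Prompt Embedder / Data Embedder pattern: the numeric history is cut into patches and projected into the same token space as the prompt, and the concatenated sequence feeds a backbone (a small transformer encoder standing in for the frozen LLM). Patch length, hidden size, and the toy vocabulary are assumptions:

```python
import torch
import torch.nn as nn

class PromptAndDataEmbedder(nn.Module):
    """Fuse a textual prompt with patched wind-power history in one token stream."""
    def __init__(self, patch_len=16, d_model=64, horizon=24):
        super().__init__()
        self.patch_len = patch_len
        self.data_embed = nn.Linear(patch_len, d_model)     # Data Embedder
        self.prompt_embed = nn.Embedding(1000, d_model)     # Prompt Embedder (toy vocab)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)  # LLM stand-in
        self.head = nn.Linear(d_model, horizon)             # ultra-short-term output

    def forward(self, series, prompt_ids):
        # series: (B, L) power history, L divisible by patch_len
        # prompt_ids: (B, T) token ids of the weather/context description
        B, L = series.shape
        patches = series.reshape(B, L // self.patch_len, self.patch_len)
        tokens = torch.cat([self.prompt_embed(prompt_ids),
                            self.data_embed(patches)], dim=1)
        hidden = self.backbone(tokens)
        return self.head(hidden[:, -1])                     # (B, horizon) forecast

model = PromptAndDataEmbedder()
yhat = model(torch.randn(2, 96), torch.randint(0, 1000, (2, 8)))
print(yhat.shape)  # torch.Size([2, 24])
```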
MARNet: Multi-scale adaptive relational network for robust point cloud completion via cross-modal fusion
IF 14.7 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-26, DOI: 10.1016/j.inffus.2025.103505
Jinlong Xie, Liping Zhang, Long Cheng, Jian Yao, Pengjiang Qian, Binrong Zhu, Guiran Liu
{"title":"MARNet: Multi-scale adaptive relational network for robust point cloud completion via cross-modal fusion","authors":"Jinlong Xie ,&nbsp;Liping Zhang ,&nbsp;Long Cheng ,&nbsp;Jian Yao ,&nbsp;Pengjiang Qian ,&nbsp;Binrong Zhu ,&nbsp;Guiran Liu","doi":"10.1016/j.inffus.2025.103505","DOIUrl":"10.1016/j.inffus.2025.103505","url":null,"abstract":"<div><div>Point cloud completion, pivotal for enabling robust 3D understanding in autonomous systems and augmented reality, faces persistent challenges in structural fidelity preservation and detail adaptive restoration. This paper presents MARNet—a novel multi-scale adaptive relational network via cross-modal guidance. We first design a spatial-relational adaptive feature descriptor integrating space-feature adaptive downsampling with relation-aware weighted edge convolution. This integration effectively preserves structural integrity while suppressing outlier interference. We then introduce a hierarchical cross-modal fusion module establishing bidirectional feature interaction pathways between 3D point clouds and multi-view images through attention-guided mechanisms, which significantly enhances feature representation capacity. Additionally, our adaptive multi-resolution point generator dynamically adjusts upsampling stages based on each shape’s geometric complexity, restoring highly detailed structures while mitigating over- and under-completion issues. Extensive experiments demonstrate state-of-the-art performance with Chamfer Distance of <span><math><mrow><mn>6</mn><mo>.</mo><mn>36</mn><mo>×</mo><mn>1</mn><msup><mrow><mn>0</mn></mrow><mrow><mn>4</mn></mrow></msup></mrow></math></span> on Completion 3D and <span><math><mrow><mn>6</mn><mo>.</mo><mn>42</mn><mo>×</mo><mn>1</mn><msup><mrow><mn>0</mn></mrow><mrow><mn>3</mn></mrow></msup></mrow></math></span> on PCN datasets. Our method outperforms existing approaches in preserving fine details and global consistency, particularly for complex structures, while exhibiting robustness to noise and viewpoint variations. The code is available at <span><span>https://github.com/long-git22/MARNet.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103505"},"PeriodicalIF":14.7,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144712953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
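One plausible reading of "relation-aware weighted edge convolution" is an edge convolution whose neighbor aggregation is softly weighted by a learned relation score, sketched below. The kNN construction, MLP sizes, and softmax weighting are assumptions rather than MARNet's exact formulation:

```python
import torch
import torch.nn as nn

class RelationWeightedEdgeConv(nn.Module):
    """Edge features [x_i, x_j - x_i] over k nearest neighbors, aggregated
    with per-edge weights from a small relation MLP."""
    def __init__(self, in_dim=3, out_dim=64, k=8):
        super().__init__()
        self.k = k
        self.edge_mlp = nn.Sequential(nn.Linear(2 * in_dim, out_dim), nn.ReLU())
        self.rel_mlp = nn.Linear(2 * in_dim, 1)             # relation score per edge

    def forward(self, pts):
        # pts: (B, N, C) point features (e.g., xyz coordinates)
        d = torch.cdist(pts, pts)                           # (B, N, N) pair distances
        idx = d.topk(self.k + 1, largest=False).indices[..., 1:]   # skip self
        nbrs = torch.gather(pts.unsqueeze(1).expand(-1, pts.shape[1], -1, -1),
                            2, idx.unsqueeze(-1).expand(-1, -1, -1, pts.shape[2]))
        center = pts.unsqueeze(2).expand_as(nbrs)
        edge = torch.cat([center, nbrs - center], dim=-1)   # (B, N, k, 2C)
        w = torch.softmax(self.rel_mlp(edge), dim=2)        # relation weights over k
        return (w * self.edge_mlp(edge)).sum(dim=2)         # (B, N, out_dim)

feat = RelationWeightedEdgeConv()(torch.randn(2, 128, 3))
print(feat.shape)  # torch.Size([2, 128, 64])
```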
Improving the representational power of graph neural networks via mixed substructure learning
IF 15.5 · CAS Tier 1 · Computer Science
Information Fusion, Pub Date: 2025-07-26, DOI: 10.1016/j.inffus.2025.103558
Zhenpeng Wu, Jiamin Chen, Jianliang Gao
{"title":"Improving the representational power of graph neural networks via mixed substructure learning","authors":"Zhenpeng Wu ,&nbsp;Jiamin Chen ,&nbsp;Jianliang Gao","doi":"10.1016/j.inffus.2025.103558","DOIUrl":"10.1016/j.inffus.2025.103558","url":null,"abstract":"<div><div>The recent trend in graph representation learning is to use Graph Neural Networks (GNNs) to approximate specific functions to capture specific graph substructures when performing aggregation, achieving stronger representational power than the 1-dimensional Weisfeiler-Leman (1-WL) graph isomorphism test. However, different graph substructures have different contributions in various scenarios, such as the clique substructure for social networks. Moreover, these methods suffer from high computational costs in capturing graph substructures, making it impractical to directly count all graph substructures when performing aggregation. Therefore, adapting the optimal graph substructure for different scenarios is an obvious challenge. To address the above challenge, we propose a simple yet effective solution, MixSL, which is flexible enough to work with any GNN backbone. Based on theoretical analysis, we offer a straightforward strategy that restricts the information of all graph substructures to the input feature space in advance, rather than the aggregation process, thereby significantly reducing computational costs. Then, we apply mixed substructure learning to all graph substructures, so that the GNN backbone can automatically learn the sample distribution of graph substructures. Without changing the GNN backbone architecture and training settings, MixSL brings a consistent and significant performance improvement on multiple graph classification benchmarks from different scenarios.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 ","pages":"Article 103558"},"PeriodicalIF":15.5,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144721906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
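MixSL's key move, pushing substructure information into the input feature space so that any GNN backbone can consume it without modified aggregation, can be illustrated by precomputing per-node substructure statistics and concatenating them to the base features. The particular statistics below (triangle counts, degree, clustering coefficient) are an illustrative subset; the paper covers a broader substructure family and mixes them with a learned strategy:

```python
import networkx as nx
import numpy as np

def substructure_augmented_features(G: nx.Graph, base_feats: np.ndarray):
    """Append per-node substructure statistics to the input features.

    Because the substructure block lives in the input space, the downstream
    GNN's message passing stays completely unchanged.
    """
    n = G.number_of_nodes()
    tri = np.array([nx.triangles(G, v) for v in range(n)], dtype=float)
    deg = np.array([G.degree(v) for v in range(n)], dtype=float)
    clu = np.array([nx.clustering(G, v) for v in range(n)], dtype=float)
    extra = np.stack([tri, deg, clu], axis=1)          # (n, 3) substructure block
    return np.concatenate([base_feats, extra], axis=1) # feed this to the GNN

G = nx.karate_club_graph()                             # 34-node benchmark graph
X = np.eye(G.number_of_nodes())                        # one-hot base features
print(substructure_augmented_features(G, X).shape)     # (34, 37)
```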