Information Fusion | Pub Date: 2025-01-01 | DOI: 10.1016/j.inffus.2024.102907
Hong-Bo Zhang, Jia-Xin Hong, Jing-Hua Liu, Qing Lei, Ji-Xiang Du
{"title":"Images, normal maps and point clouds fusion decoder for 6D pose estimation","authors":"Hong-Bo Zhang, Jia-Xin Hong, Jing-Hua Liu, Qing Lei, Ji-Xiang Du","doi":"10.1016/j.inffus.2024.102907","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102907","url":null,"abstract":"6D pose estimation plays a crucial role in enabling intelligent robots to interact with their environment by understanding 3D scene information. This task is challenging due to factors such as texture-less objects, illumination variations, and scene occlusions. In this work, we present a novel approach that integrates feature fusion from multiple data modalities—specifically, RGB images, normal maps, and point clouds—to enhance the accuracy of 6D pose estimation. Unlike previous methods that rely solely on RGB-D data or focus on either shallow or deep feature fusion, the proposed method uniquely incorporates both shallow and deep feature fusion across heterogeneous modalities, compensating for the information often lost in point clouds. Specifically, the proposed method includes an adaptive feature fusion module designed to improve the communication and fusion of shallow features between RGB images and normal maps. Additionally, a multi-modal fusion decoder is implemented to facilitate cross-modal feature fusion between image and point cloud data. Experimental results demonstrate that the proposed method achieves state-of-the-art performance, with 6D pose estimation accuracy reaching 97.7% on the Linemod dataset, 71.5% on the Occlusion Linemod dataset, and 95.8% on the YCB-Video dataset. 
These results underline the robustness and effectiveness of the proposed approach in complex environments.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"126 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142929253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
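As background to the modalities fused above: a surface normal (the quantity a normal map stores per pixel) can be derived from point-cloud geometry via the cross product of two edge vectors. A minimal sketch of that standard construction, not the paper's actual pipeline:

```python
def triangle_normal(p0, p1, p2):
    """Unit surface normal of the triangle (p0, p1, p2), computed as the
    cross product of its two edge vectors and then normalized. This is
    generic geometry background, not the method proposed in the paper."""
    u = [p1[i] - p0[i] for i in range(3)]  # edge p0 -> p1
    v = [p2[i] - p0[i] for i in range(3)]  # edge p0 -> p2
    n = [u[1] * v[2] - u[2] * v[1],        # cross product u x v
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(c * c for c in n) ** 0.5
    return [c / length for c in n]
```

For example, a triangle lying in the xy-plane yields the normal (0, 0, 1).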
Information Fusion | Pub Date: 2024-12-31 | DOI: 10.1016/j.inffus.2024.102901
Binbin Sang, Lei Yang, Weihua Xu, Hongmei Chen, Tianrui Li, Wentao Li
{"title":"VCOS: Multi-scale information fusion to feature selection using fuzzy rough combination entropy","authors":"Binbin Sang, Lei Yang, Weihua Xu, Hongmei Chen, Tianrui Li, Wentao Li","doi":"10.1016/j.inffus.2024.102901","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102901","url":null,"abstract":"Multi-scale information fusion has attracted extensive attention in data mining, in which optimal scale combination principles and feature selection algorithms are two core issues. However, the traditional optimal scale combination is obtained by requiring the conditional feature scales to be consistent with the decision classification. This consistency principle is too strict and not fault-tolerant: it makes the knowledge granularity too fine, tends to degrade the performance of feature selection algorithms, and does not meet the needs of practical applications. Therefore, this paper develops a novel optimal scale combination selection method to fuse multi-scale information, establishes a new fuzzy rough set model, defines uncertainty measures, and designs a feature selection algorithm for Multi-scale Fuzzy Decision Systems (MsFDSs). First, the Variable-Consistency Optimal Scale (VCOS) selection principle is defined by introducing the variable-consistency rate. The VCOS-based fuzzy rough set model is then proposed, an uncertainty measure derived from this model is defined, and its related properties are proved. Then, the VCOS-based Fuzzy Rough Combination Entropy (VCOS-FRCE) is defined, and its monotonicity with respect to feature subsets and the variable-consistency rate is proved. Finally, we define the relative reduct principle and the significance of features based on VCOS-FRCE and design a forward greedy multi-scale feature selection algorithm. The proposed VCOS-based multi-scale fusion method can adjust the consistency degree between knowledge granules and the decision classification according to actual needs. 
This multi-scale information fusion method generalizes better and can be applied to various kinds of complex data, and the multi-scale feature selection method built on it also achieves further improved performance. Experiments are performed on twelve public datasets from UCI, and the proposed algorithm is compared with eight existing algorithms. The experimental results show that the proposed algorithm can effectively remove redundant features and improve classification performance.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"27 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142929206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
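The forward greedy feature selection loop described in the abstract can be sketched generically. Here `significance` stands in for an entropy-based subset measure such as VCOS-FRCE, and the stopping rule (stop when no candidate improves the score) is an assumption of this sketch, not the authors' exact algorithm:

```python
def forward_greedy_selection(features, significance, max_features=None):
    """Forward greedy selection: repeatedly add the feature whose inclusion
    maximizes `significance(subset)`, stopping when no candidate improves
    the current score. `significance` is caller-supplied (e.g. an
    entropy-based measure); this is a generic sketch of the scheme."""
    selected, remaining = [], list(features)
    best_score = significance(selected)
    while remaining and (max_features is None or len(selected) < max_features):
        # Evaluate every remaining feature as a candidate addition.
        gains = [(significance(selected + [f]), f) for f in remaining]
        score, f = max(gains, key=lambda t: t[0])
        if score <= best_score:   # no strict improvement: stop
            break
        selected.append(f)
        remaining.remove(f)
        best_score = score
    return selected
```

With a toy significance that only rewards features "a" and "b", the loop selects exactly those two and discards the redundant "c".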
Information Fusion | Pub Date: 2024-12-31 | DOI: 10.1016/j.inffus.2024.102908
Zhao Zhang, Senlin Luo, Yongxin Lu, Limin Pan
{"title":"Obfuscation-resilient detection of Android third-party libraries using multi-scale code dependency fusion","authors":"Zhao Zhang, Senlin Luo, Yongxin Lu, Limin Pan","doi":"10.1016/j.inffus.2024.102908","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102908","url":null,"abstract":"Third-Party Library (TPL) detection is a crucial aspect of Android application security assessment, but it faces significant challenges due to code obfuscation. Existing methods often rely on single-scale features, such as class dependencies or instruction opcodes. This reliance can overlook critical dependencies, leading to incomplete library representation and reduced detection recall. Furthermore, the high similarity between a TPL and its adjacent versions causes overlaps in the feature space, reducing the accuracy of version identification. To address these limitations, we propose LibMD, a multi-scale code dependency fusion approach for TPL detection in Android apps. LibMD enhances library code representation by combining class reference syntax augmentation, cross-scale function mapping, and control flow reconstruction of basic blocks. It also extracts metadata dependencies and constructs a library dependency graph that integrates app-code similarity with multiple libraries. By applying Bayes’ theorem to compute posterior probabilities, LibMD effectively evaluates the likelihood of TPL integration and improves the precision of library version identification. 
Experimental results demonstrate that LibMD outperforms state-of-the-art methods across diverse datasets, achieving robust TPL detection and accurate version identification, even under various obfuscation techniques.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"73 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142929202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
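The abstract's use of Bayes' theorem to score library presence can be illustrated with a toy posterior computation. Treating an observed app-to-library similarity directly as a likelihood is an assumption of this sketch, not LibMD's actual learned model:

```python
def tpl_posterior(similarity, prior=0.5):
    """Toy posterior P(library present | similarity) via Bayes' theorem.
    Assumption (not from the paper): the similarity score is used directly
    as P(observation | present), and its complement as P(observation | absent)."""
    p_obs_given_present = similarity
    p_obs_given_absent = 1.0 - similarity
    num = p_obs_given_present * prior
    den = num + p_obs_given_absent * (1.0 - prior)
    return num / den if den > 0 else 0.0
```

With a uniform prior the posterior simply mirrors the similarity; a stronger prior shifts it, which is the point of combining evidence this way.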
Information Fusion | Pub Date: 2024-12-31 | DOI: 10.1016/j.inffus.2024.102903
Jiangang Ding, Yuanlin Zhao, Lili Pei, Yihui Shan, Yiquan Du, Wei Li
{"title":"Modal-invariant progressive representation for multimodal image registration","authors":"Jiangang Ding, Yuanlin Zhao, Lili Pei, Yihui Shan, Yiquan Du, Wei Li","doi":"10.1016/j.inffus.2024.102903","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102903","url":null,"abstract":"Many applications, such as autonomous driving, rely heavily on multimodal data. However, differences in resolution, viewing angle, and optical path structure cause pixel misalignment between multimodal images, leading to distortions in the fusion result and edge artifacts. In addition to the widely used manual calibration, learning-based methods typically employ a two-stage registration process, referred to as “translating-then-registering”. However, the gap between modalities makes this approach less cohesive: it introduces more uncertainty during registration, misleads feature alignment at different locations, and limits the accuracy of the deformation field. To tackle these challenges, we introduce the Modality-Invariant Progressive Representation (MIPR) approach. The key idea behind MIPR is to decouple features from different modalities into a modality-invariant domain based on frequency bands, followed by progressive correction at multiple feature scales. Specifically, MIPR consists of two main components: the Field Adaptive Fusion (FAF) module and the Progressive Field Estimation (PFE) module. FAF integrates all previous multi-scale deformation subfields. PFE progressively estimates the remaining deformation subfields at different scales. Furthermore, we propose a two-stage pretraining strategy for end-to-end registration. 
Our approach is simple and robust, achieving impressive visual results in several benchmark tasks, even surpassing the ground truth from manual calibration, and advancing downstream tasks.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"2 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142929203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion | Pub Date: 2024-12-27 | DOI: 10.1016/j.inffus.2024.102900
Jun Hu, Shuting Fan, Raquel Caballero-Águila, Mingqing Zhu, Guangchen Zhang
{"title":"Distributed fusion filtering for multi-rate nonlinear systems with binary measurements under encryption and decryption scheme","authors":"Jun Hu, Shuting Fan, Raquel Caballero-Águila, Mingqing Zhu, Guangchen Zhang","doi":"10.1016/j.inffus.2024.102900","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102900","url":null,"abstract":"This paper discusses the distributed fusion filtering problem for multi-rate nonlinear systems with binary measurements (BMs) based on an encryption and decryption scheme (EDS), in which the measurement outputs are represented by vectors with elements taking the values of 0 or 1. The expectation of the BMs is described by the cumulative distribution function of the standard normal distribution, where a newly defined random variable is utilized for reconstructing the BMs model. In order to ensure information security, the EDS is introduced in the data transmission process among the sensor nodes. Based on the information obtained, the local distributed filtering algorithm is proposed to obtain an upper bound on the local filtering error covariance, and the local filter gain is designed to minimize the resulting upper bound. In addition, the fusion filter is obtained with the parallel covariance intersection fusion criterion and the filtering performance is analyzed in terms of boundedness with theoretical proof. 
Finally, a target tracking experiment is presented to show the effectiveness and applicability of the proposed fusion filtering scheme.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"2 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
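The abstract models the expectation of a binary measurement with the cumulative distribution function of the standard normal distribution. A minimal sketch of that mapping, where `threshold` and `noise_std` are illustrative parameters not taken from the paper:

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def binary_measurement_expectation(state, threshold=0.0, noise_std=1.0):
    """E[b] = Phi((state - threshold) / noise_std): the probability that a
    Gaussian-noise-corrupted reading of `state` exceeds `threshold`, i.e.
    that the binary sensor outputs 1. Parameters here are illustrative."""
    return std_normal_cdf((state - threshold) / noise_std)
```

As expected, a state sitting exactly on the threshold yields an expectation of 0.5, and the expectation increases monotonically with the state.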
{"title":"A survey on multi-view fusion for predicting links in biomedical bipartite networks: Methods and applications","authors":"Yuqing Qian, Yizheng Wang, Junkai Liu, Quan Zou, Yijie Ding, Xiaoyi Guo, Weiping Ding","doi":"10.1016/j.inffus.2024.102894","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102894","url":null,"abstract":"Biomedical research increasingly relies on the analysis of complex interactions between biological entities, such as genes, proteins, and drugs. Although advancements in biomedical technologies have led to a vast accumulation of relational data, the high cost and time demands of wet-lab experiments have limited the number of verified interactions. Thus, computational methods have become essential for predicting potential links by leveraging diverse datasets to efficiently and accurately identify promising interactions. Multi-view fusion, which combines complementary information from multiple sources, has shown significant promise for enhancing the prediction accuracy and robustness. We introduce the framework of multi-view fusion methods by elaborating on key components. This includes a comprehensive examination of multi-view data sources covering various omics and biological databases. We then describe the feature extraction techniques and explore how meaningful features can be derived from heterogeneous data formats. Next, we offer an in-depth review of the fusion strategies and categorize them as early fusion, late fusion, and fusion during the training phase. We discuss the advantages and limitations of each approach, emphasizing the need for sophisticated techniques that consider the unique attributes of biological link prediction. We also provide an overview of the commonly used datasets, evaluation metrics, and validation techniques. Commonly used datasets serve as reliable benchmarks for evaluating the computational models. 
Evaluation metrics and validation techniques are crucial for reliably assessing the performance of link prediction models. Subsequently, a comparative analysis of different fusion methods is conducted to empirically evaluate their performance on widely available biomedical datasets, yielding valuable insights into the strengths and limitations of each approach in real-world applications. Finally, we identify key obstacles such as data heterogeneity, model robustness, and missing data, and suggest potential directions for future research. Our findings offer valuable insights into the applications and future directions of multi-view fusion methods for biomedical link prediction, highlighting their potential to accelerate discovery and innovation in the field.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"9 1 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
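The early- and late-fusion categories reviewed in the survey can be sketched in a few lines; these are the textbook formulations, not any particular surveyed method:

```python
def early_fusion(views):
    """Early fusion: concatenate per-view feature vectors into a single
    joint representation before any model is trained on it."""
    return [x for view in views for x in view]

def late_fusion(scores, weights=None):
    """Late fusion: combine per-view prediction scores after separate
    models have run, here by a (weighted) average."""
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    return sum(w * s for w, s in zip(weights, scores))
```

Fusion during training (the survey's third category) would instead mix intermediate representations inside the model, which cannot be reduced to a one-liner like these.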
{"title":"OmniFuse: A general modality fusion framework for multi-modality learning on low-quality medical data","authors":"Yixuan Wu, Jintai Chen, Lianting Hu, Hongxia Xu, Huiying Liang, Jian Wu","doi":"10.1016/j.inffus.2024.102890","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102890","url":null,"abstract":"Mirroring the practice of human medical experts, the integration of diverse medical examination modalities enhances the performance of predictive models in clinical settings. However, traditional multi-modal learning systems face significant challenges when dealing with low-quality medical data, which is common due to factors such as inconsistent data collection across multiple sites and varying sensor resolutions, as well as information loss due to poor data management. To address these issues, in this paper, we identify and explore three core technical challenges surrounding multi-modal learning on low-quality medical data: (i) the absence of informative modalities, (ii) imbalanced clinically useful information across modalities, and (iii) the entanglement of valuable information with noise in the data. To fully harness the potential of multi-modal low-quality data for automated high-precision disease diagnosis, we propose a general medical multi-modality learning framework that addresses these three core challenges on varying medical scenarios involving multiple modalities. To compensate for the absence of informative modalities, we utilize existing modalities to selectively integrate valuable information and then perform imputation, which is effective even in extreme absence scenarios. For the issue of modality information imbalance, we explicitly quantify the relationships between different modalities for individual samples, ensuring that the effective information from advantageous modalities is fully utilized. 
Moreover, to mitigate the conflation of information with noise, our framework traceably identifies and activates lazy modality combinations to eliminate noise and enhance data quality. Extensive experiments demonstrate the superiority and broad applicability of our framework. In predicting in-hospital mortality using joint EHR, Chest X-ray, and Report data, our framework surpasses existing methods, improving the AUROC from 0.811 to 0.872. When applied to lung cancer pathological subtyping using PET, CT, and Report data, our approach achieves an impressive AUROC of 0.894.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"6 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
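The AUROC figures quoted above follow the standard definition: the probability that a randomly chosen positive sample is scored above a randomly chosen negative one (ties counting half). A minimal pairwise implementation of that metric:

```python
def auroc(labels, scores):
    """Area under the ROC curve via its rank interpretation: the fraction
    of (positive, negative) pairs in which the positive is scored higher,
    with ties counted as 0.5. Assumes binary labels 0/1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking gives 1.0 and an uninformative one gives 0.5, which is the scale on which the reported 0.811-to-0.872 improvement sits.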
Information Fusion | Pub Date: 2024-12-24 | DOI: 10.1016/j.inffus.2024.102891
Dipon Kumar Ghosh, Yong Ju Jung
{"title":"Depth cue fusion for event-based stereo depth estimation","authors":"Dipon Kumar Ghosh, Yong Ju Jung","doi":"10.1016/j.inffus.2024.102891","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102891","url":null,"abstract":"Inspired by the biological retina, event cameras utilize dynamic vision sensors to capture pixel intensity changes asynchronously. Event cameras offer numerous advantages, such as high dynamic range, high temporal resolution, less motion blur, and low power consumption. These features make event cameras particularly well-suited for depth estimation, especially in challenging scenarios involving rapid motion and high dynamic range imaging conditions. The human visual system perceives scene depth by combining multiple depth cues such as monocular pictorial depth, stereo depth, and motion parallax. However, most existing event-based depth estimation algorithms utilize only a single depth cue, either stereo depth or monocular depth. While it is feasible to estimate depth from a single cue, estimating dense disparity in challenging scenarios and lighting conditions remains difficult. We therefore conduct extensive experiments to explore various methods for depth cue fusion. Inspired by the experimental results, in this study we propose a fusion architecture that systematically incorporates multiple depth cues for event-based stereo depth estimation. To this end, we propose a depth cue fusion (DCF) network that fuses multiple depth cues by utilizing a novel fusion method called SpadeFormer. The proposed SpadeFormer is a fully context-aware fusion mechanism, which incorporates two modulation techniques (i.e., spatially adaptive denormalization (Spade) and cross-attention) for depth cue fusion in a transformer block. 
The adaptive denormalization modulates both input features by adjusting the global statistics of features in a cross manner, and the modulated features are further fused by the cross-attention technique. Experiments conducted on a real-world dataset show that our method reduces the one-pixel error rate by at least 47.63% (3.708 for the best existing method vs. 1.942 for ours) and the mean absolute error by 40.07% (0.302 for the best existing method vs. 0.181 for ours). The results reveal that the depth cue fusion method outperforms the state-of-the-art methods by significant margins and produces better disparity maps.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"44 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
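The reported error reductions can be checked directly from the quoted numbers, since each is a relative reduction against the best existing method:

```python
def relative_reduction(baseline, ours):
    """Percentage reduction of an error metric relative to a baseline:
    100 * (baseline - ours) / baseline."""
    return 100.0 * (baseline - ours) / baseline

# Figures quoted in the abstract:
# one-pixel error rate: 3.708 (best existing) vs 1.942 (ours) -> ~47.63 %
# mean absolute error:  0.302 (best existing) vs 0.181 (ours) -> ~40.07 %
```

Both computations reproduce the percentages stated in the abstract to two decimal places.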
Information Fusion | Pub Date: 2024-12-23 | DOI: 10.1016/j.inffus.2024.102883
Xi-Yu Wang, Ying-Ming Wang
{"title":"Minimum adjustment consensus model for multi-person multi-criteria large scale decision-making with trust consistency propagation and opinion dynamics","authors":"Xi-Yu Wang, Ying-Ming Wang","doi":"10.1016/j.inffus.2024.102883","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102883","url":null,"abstract":"The consensus reaching process (CRP) represents a multi-round dynamic method essential for harmonizing the interests of multiple parties. With the rise of instant messaging and social media, individual social trust networks and structures have become increasingly complex. Therefore, it is crucial to explore the inherent value of trust networks in the context of multi-person multi-criteria large-scale decision-making (MpMcLSDM) to facilitate consensus. This paper develops a minimum adjustment consensus model (MACM) for MpMcLSDM based on social trust network analysis (STNA). First, the consistency path rule and personal traits are defined through STNA, leading to a formulated strategy for completing trust relationships. Subsequently, a novel centrality measure, informed by the consistency path rule, is proposed, and a weight method is devised to determine decision-maker (DM) weights and sub-cluster weights after clustering. This paper further elucidates the implications of consensus level fluctuations for DM self-confidence and opinion inclination. Ultimately, a MACM is constructed within the MpMcLSDM framework, integrating opinion dynamics. 
A numerical example demonstrates the model’s effectiveness, and comparisons with other methods confirm its rationality and improved performance.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"33 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142902102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
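The opinion-dynamics ingredient of the model can be illustrated with one round of trust-weighted opinion averaging (a DeGroot-style update, shown here as generic background rather than the paper's minimum-adjustment model):

```python
def consensus_round(opinions, trust):
    """One round of trust-weighted opinion averaging: each decision-maker
    moves to the trust-weighted mean of all opinions. `trust[i][j]` is
    DM i's (row-normalized) trust in DM j. Generic opinion-dynamics step,
    not the minimum-adjustment consensus model from the paper."""
    n = len(opinions)
    return [sum(trust[i][j] * opinions[j] for j in range(n)) for i in range(n)]
```

Iterating such rounds pulls mutually trusting decision-makers toward a common opinion, which is the basic mechanism a CRP steers.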
{"title":"A comprehensive survey of large language models and multimodal large language models in medicine","authors":"Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang","doi":"10.1016/j.inffus.2024.102888","DOIUrl":"https://doi.org/10.1016/j.inffus.2024.102888","url":null,"abstract":"Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have attracted widespread attention for their exceptional capabilities in understanding, reasoning, and generation, introducing transformative paradigms for integrating artificial intelligence into medicine. This survey provides a comprehensive overview of the development, principles, application scenarios, challenges, and future directions of LLMs and MLLMs in medicine. Specifically, it begins by examining the paradigm shift, tracing the transition from traditional models to LLMs and MLLMs, and highlighting the unique advantages of these LLMs and MLLMs in medical applications. Next, the survey reviews existing medical LLMs and MLLMs, providing detailed guidance on their construction and evaluation in a clear and systematic manner. Subsequently, to underscore the substantial value of LLMs and MLLMs in healthcare, the survey explores five promising applications in the field. Finally, the survey addresses the challenges confronting medical LLMs and MLLMs and proposes practical strategies and future directions for their integration into medicine. 
In summary, this survey offers a comprehensive analysis of the technical methodologies and practical clinical applications of medical LLMs and MLLMs, with the goal of bridging the gap between these advanced technologies and clinical practice, thereby fostering the evolution of the next generation of intelligent healthcare systems.","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"32 1","pages":""},"PeriodicalIF":18.6,"publicationDate":"2024-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142901774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}