{"title":"Resolving multimodal ambiguity via knowledge-injection and ambiguity learning for multimodal sentiment analysis","authors":"Xianbing Zhao , Xuejiao Li , Ronghuan Jiang , Buzhou Tang","doi":"10.1016/j.inffus.2024.102745","DOIUrl":"10.1016/j.inffus.2024.102745","url":null,"abstract":"<div><div>Multimodal Sentiment Analysis (MSA) utilizes complementary multimodal features to predict sentiment polarity, which mainly involves language, vision, and audio modalities. Existing multimodal fusion methods primarily consider the complementarity of different modalities, while neglecting the ambiguity caused by conflicts between modalities (i.e. the text modality predicts positive sentiment while the visual modality predicts negative sentiment). To well diminish these conflicts, we develop a novel multimodal ambiguity learning framework, namely RMA, Resolving Multimodal Ambiguity via Knowledge-Injection and Ambiguity Learning for Multimodal Sentiment Analysis. Specifically, We introduce and filter external knowledge to enhance the consistency of cross-modal sentiment polarity prediction. Immediately, we explicitly measure ambiguity and dynamically adjust the impact between the subordinate modalities and the dominant modality to simultaneously consider the complementarity and conflicts of multiple modalities during multimodal fusion. Experiments demonstrate the dominantity of our proposed model across three public multimodal sentiment analysis datasets CMU-MOSI, CMU-MOSEI, and MELD.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"115 ","pages":"Article 102745"},"PeriodicalIF":14.7,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142658114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Volume 115, Article 102754. Pub Date: 2024-10-30. DOI: 10.1016/j.inffus.2024.102754
Qinli Zhang, Pengfei Zhang, Tianrui Li
Title: Information fusion for large-scale multi-source data based on the Dempster-Shafer evidence theory
Abstract: Large-scale multi-source data are ubiquitous, ranging from genetic information to medical records and military intelligence. The intricacies and uncertainties inherent in these data sources pose significant challenges to information fusion. Owing to its exceptional capacity to represent data uncertainty, Dempster-Shafer (D-S) evidence theory has become a widely used approach in information fusion. However, the theory faces three significant issues when applied to multi-source data fusion: (1) converting sample information into evidence and constructing the basic probability assignment (BPA) function; (2) resolving conflicting evidence; and (3) mitigating the exponential explosion in computation. To address these challenges, this paper investigates information fusion strategies for large-scale multi-source data based on D-S evidence theory. First, the concept of a support matrix is introduced and the data matrix is transformed into a support matrix to address the construction of the BPA. Next, a method for resolving evidence conflicts is introduced by incorporating an additional data source composed of average values. Furthermore, the high computational complexity is mitigated through a hierarchical fusion approach. Finally, experimental results show that, compared with five other advanced information fusion methods, our method improves classification accuracy by 4.66% on average and reduces running time by 66.35% on average. Hence, our method is both efficient and effective, demonstrating exceptional performance in information fusion.
Information Fusion, Volume 115, Article 102770. Pub Date: 2024-10-29. DOI: 10.1016/j.inffus.2024.102770
Jingchun Zhou, Tianyu Liang, Dehuan Zhang, Siyuan Liu, Junsheng Wang, Edmond Q. Wu
Title: WaterHE-NeRF: Water-ray matching neural radiance fields for underwater scene reconstruction
Abstract: Neural Radiance Field (NeRF) technology demonstrates immense potential in novel viewpoint synthesis tasks due to its physics-based volumetric rendering process, which is particularly promising in underwater scenes. However, existing underwater NeRF methods face challenges in handling light attenuation caused by the water medium and the lack of real Ground Truth (GT) supervision. To address these issues, we propose WaterHE-NeRF, a novel approach incorporating a water-ray matching field developed based on Retinex theory. This field precisely encodes color, density, and illuminance attenuation in three-dimensional space. WaterHE-NeRF employs an illuminance attenuation mechanism to generate degraded and clear multi-view images, optimizing image restoration by combining reconstruction loss with Wasserstein distance. Furthermore, using histogram equalization (HE) as pseudo-GT, WaterHE-NeRF enhances the network's accuracy in preserving original details and color distribution. Extensive experiments on real underwater and synthetic datasets validate the effectiveness of WaterHE-NeRF.
Information Fusion, Volume 115, Article 102778. Pub Date: 2024-10-29. DOI: 10.1016/j.inffus.2024.102778
Feifei Jin, Xiaoxuan Gao, Ligang Zhou
Title: Bounded rationality consensus reaching process with regret theory and weighted Moment estimation for multi-attribute group decision making
Abstract: Probabilistic linguistic term sets play a particularly active role in decision-making, especially for decision-makers (DMs) who prefer to convey evaluative information through natural linguistic variables. To address the current difficulties of multi-attribute group decision-making (MAGDM), this article puts forward a new probabilistic linguistic MAGDM method with weighted Moment estimation. First, taking into account the regret aversion of DMs, who usually exhibit limited rationality during MAGDM, we use regret theory to transform the original decision matrix into a utility matrix. Then, a combined weighting method and a weighted Moment estimation model are investigated to determine the attribute weights in a more scientific and reasonable manner. Subsequently, in the consensus reaching process, a new trust propagation mechanism that considers both the shortest and longest propagation paths among DMs is designed to derive the expert weights and the adjustment coefficients. Finally, the applicability of the MAGDM method is validated empirically on a raw coal quality assessment, and sensitivity and comparative analyses underscore its advantages and robustness.
Information Fusion, Volume 115, Article 102760. Pub Date: 2024-10-29. DOI: 10.1016/j.inffus.2024.102760
Iris Dominguez-Catena, Daniel Paternain, Mikel Galar
{"title":"DSAP: Analyzing bias through demographic comparison of datasets","authors":"Iris Dominguez-Catena, Daniel Paternain, Mikel Galar","doi":"10.1016/j.inffus.2024.102760","DOIUrl":"10.1016/j.inffus.2024.102760","url":null,"abstract":"<div><div>In the last few years, Artificial Intelligence (AI) systems have become increasingly widespread. Unfortunately, these systems can share many biases with human decision-making, including demographic biases. Often, these biases can be traced back to the data used for training, where large uncurated datasets have become the norm. Despite our awareness of these biases, we still lack general tools to detect, quantify, and compare them across different datasets. In this work, we propose DSAP (Demographic Similarity from Auxiliary Profiles), a two-step methodology for comparing the demographic composition of datasets. First, DSAP uses existing demographic estimation models to extract a dataset’s demographic profile. Second, it applies a similarity metric to compare the demographic profiles of different datasets. While these individual components are well-known, their joint use for demographic dataset comparison is novel and has not been previously addressed in the literature. This approach allows three key applications: the identification of demographic blind spots and bias issues across datasets, the measurement of demographic bias, and the assessment of demographic shifts over time. DSAP can be used on datasets with or without explicit demographic information, provided that demographic information can be derived from the samples using auxiliary models, such as those for image or voice datasets. To show the usefulness of the proposed methodology, we consider the Facial Expression Recognition task, where demographic bias has previously been found. The three applications are studied over a set of twenty datasets with varying properties. The code is available at <span><span>https://github.com/irisdominguez/DSAP</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"115 ","pages":"Article 102760"},"PeriodicalIF":14.7,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142554839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Information Fusion, Volume 115, Article 102753. Pub Date: 2024-10-29. DOI: 10.1016/j.inffus.2024.102753
Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni
Title: Generative technology for human emotion recognition: A scoping review
Abstract: Affective computing stands at the forefront of artificial intelligence (AI), seeking to imbue machines with the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which endeavors to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progress has been made in generative models, including Autoencoders, Generative Adversarial Networks, Diffusion Models, and Large Language Models. With their powerful data generation capabilities, these models are emerging as pivotal tools for advancing emotion recognition. However, systematic reviews of generative technology for emotion recognition remain scarce. This survey aims to bridge that gap through a comprehensive analysis of over 330 research papers published up to June 2024. Specifically, it first introduces the mathematical principles of the different generative models and the commonly used datasets. Subsequently, through a taxonomy, it provides an in-depth analysis of how generative techniques address emotion recognition across modalities in several respects, including data augmentation, feature extraction, semi-supervised learning, and cross-domain learning. Finally, the review outlines future research directions, emphasizing the potential of generative models to advance the field of emotion recognition and enhance the emotional intelligence of AI systems.
Information Fusion, Volume 115, Article 102722. Pub Date: 2024-10-28. DOI: 10.1016/j.inffus.2024.102722
Yong He, Hongshan Yu, Xiaoyan Liu, Zhengeng Yang, Wei Sun, Saeed Anwar, Ajmal Mian
Title: Deep learning based 3D segmentation in computer vision: A survey
Abstract: 3D segmentation is a fundamental and challenging problem in computer vision with applications in autonomous driving and robotics. It has received significant attention from the computer vision, graphics, and machine learning communities. Conventional methods for 3D segmentation, based on hand-crafted features and machine learning classifiers, lack generalization ability. Driven by their success in 2D computer vision, deep learning techniques have recently become the tool of choice for 3D segmentation tasks. This has led to an influx of many methods in the literature that have been evaluated on different benchmark datasets. Whereas survey papers on RGB-D and point cloud segmentation exist, there is a lack of a recent in-depth survey that covers all 3D data modalities and application domains. This paper fills the gap and comprehensively surveys the recent progress in deep learning-based 3D segmentation techniques. We cover over 230 works from the last six years, analyze their strengths and limitations, and discuss their competitive results on benchmark datasets. The survey provides a summary of the most commonly used pipelines and finally highlights promising research directions for the future.
Information Fusion, Volume 115, Article 102755. Pub Date: 2024-10-28. DOI: 10.1016/j.inffus.2024.102755
Kelvin Du, Yazhi Zhao, Rui Mao, Frank Xing, Erik Cambria
Title: Natural language processing in finance: A survey
Abstract: This survey presents an in-depth review of the transformative role of Natural Language Processing (NLP) in finance, highlighting its impact on ten major financial applications: (1) financial sentiment analysis, (2) financial narrative processing, (3) financial forecasting, (4) portfolio management, (5) question answering, virtual assistants and chatbots, (6) risk management, (7) regulatory compliance monitoring, (8) Environmental, Social, Governance (ESG) and sustainable finance, (9) explainable artificial intelligence (XAI) in finance, and (10) NLP for digital assets. With the integration of vast amounts of unstructured financial data and advanced NLP techniques, the study explores how NLP enables data-driven decision-making and innovation in the financial sector, alongside the limitations and challenges. By providing a comprehensive analysis of NLP applications combining both academic and industrial perspectives, this study postulates the future trends and evolution of financial services. It introduces a unique review framework to understand the interaction of financial data and NLP technologies systematically and outlines the key drivers, transformations, and emerging areas in this field. This survey targets researchers, practitioners, and professionals, aiming to close their knowledge gap by highlighting the significance and future direction of NLP in enhancing financial services.
Information Fusion, Volume 115, Article 102777. Pub Date: 2024-10-28. DOI: 10.1016/j.inffus.2024.102777
Dongxin Zhao, Jianhua Liu, Peng Geng, Jiaxin Yang, Ziqian Zhang, Yin Zhang
Title: Mid-Net: Rethinking efficient network architectures for small-sample vascular segmentation
Abstract: Deep learning-based medical image segmentation methods have demonstrated significant clinical value. However, training these methods on small-sample vascular datasets remains challenging due to the scarcity of labeled data and severe category imbalance. To address this issue, this paper proposes Mid-Net, which fully exploits the often-overlooked feature representation potential of the middle-layer network through cross-layer guidance, improving learning efficiency in data-constrained environments. Mid-Net consists of three core components: the encoding path, the guidance path, and the calibration path. In the encoding path, a feature pyramid structure with large-kernel convolutions extracts semantic information at different scales. The guidance path combines the sensitivity of the shallow-layer network to spatial details with the global perceptual abilities of the deep-layer network to provide more discriminative guidance to the middle-layer network in a feature-decoupled manner. The calibration path further calibrates the spatial location information of the middle-layer network through end-to-end supervised learning. Experiments on the publicly available retinal vessel datasets DRIVE, STARE, and CHASE_DB1, as well as the coronary angiography datasets DCA1 and CHUAC, demonstrate that Mid-Net achieves superior segmentation results with lower computational resource requirements than state-of-the-art methods.
Information Fusion, Volume 115, Article 102758. Pub Date: 2024-10-28. DOI: 10.1016/j.inffus.2024.102758
Ying Xie, Jixiang Wang, Zhiqiang Xu, Junnan Shen, Lijie Wen, Rongbin Xu, Hang Xu, Yun Yang
Title: Alignable kernel network
Abstract: To enhance the adaptability and performance of Convolutional Neural Networks (CNNs), we present an adaptable mechanism, the Alignable Kernel (AliK) unit, which dynamically adjusts the receptive field (RF) dimensions of a model in response to varying stimuli. The branches of the AliK unit are integrated through a novel align-transformation softmax attention that incorporates prior knowledge via rank-ordering constraints. The attention weights across the branches establish the effective RF scales leveraged by neurons in the fusion layer. This mechanism is inspired by neuroscientific observations that the RF dimensions of neurons in the visual cortex vary with the stimulus, a feature often overlooked in CNN architectures. By aggregating successive AliK units, we develop a deep network architecture named the Alignable Kernel Network (AliKNet). AliKNet's interdisciplinary design improves the network's performance and interpretability by drawing direct inspiration from the structure and function of human neural systems, especially the visual cortex. Empirical evaluations on image classification and semantic segmentation demonstrate that AliKNet outperforms numerous state-of-the-art architectures without increasing model complexity. Furthermore, we show that AliKNet can identify target objects across various scales, confirming its ability to dynamically adapt its RF sizes in response to the input data.