Information Fusion: Latest Articles

Physical prior-guided deep fusion network with shading cues for shape from polarization
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 117, Article 102805. Pub Date: 2024-11-23. DOI: 10.1016/j.inffus.2024.102805
Rui Liu, Zhiyuan Zhang, Yini Peng, Jiayi Ma, Xin Tian
Abstract: Shape from polarization (SfP) is a powerful passive three-dimensional imaging technique that enables the reconstruction of surface normals with dense textural details. However, existing deep learning-based SfP methods focus only on the polarization prior, which makes it difficult to accurately reconstruct targets with rich texture details in complicated scenes. To improve the reconstruction accuracy, we use the surface normal estimated from shading cues, together with a newly proposed specular confidence, as a shading prior that provides additional feature information. Furthermore, to efficiently combine the polarization and shading priors, a novel deep fusion network named SfPSNet is proposed for information extraction and surface normal reconstruction. SfPSNet is built on a dual-branch architecture that handles the two physical priors separately. A feature correction module is designed so that the two branches mutually rectify each other's defects along the channel and spatial dimensions. In addition, a feature fusion module fuses the feature maps of the polarization and shading priors using an efficient cross-attention mechanism. Our experimental results show that fusing polarization and shading priors can significantly improve the reconstruction quality of surface normals, especially for objects or scenes illuminated by complex lighting sources. As a result, SfPSNet shows state-of-the-art performance compared with existing deep learning-based SfP methods, benefiting from its efficiency in extracting and fusing information from different priors.
Citations: 0
Incomplete multi-view clustering based on hypergraph
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 117, Article 102804. Pub Date: 2024-11-23. DOI: 10.1016/j.inffus.2024.102804
Jin Chen, Huafu Xu, Jingjing Xue, Quanxue Gao, Cheng Deng, Ziyu Lv
Abstract: Graph-based incomplete multi-view clustering integrates information from multiple views and uses graph models to capture the global and local structure of the data in order to reconstruct missing data, which makes it suitable for processing complex data. However, ordinary graph learning methods usually consider only pairwise relationships between data points and cannot uncover higher-order relationships latent in the data. Moreover, existing graph clustering methods often split representation learning and clustering into two separate steps, which may lead to unsatisfactory clustering results. They also tend to consider only intra-view similarity structures and overlook inter-view ones. To this end, this paper introduces a one-step method, incomplete multi-view clustering based on hypergraph (IMVC_HG). Specifically, we use a hypergraph to reconstruct missing views, which can better explore the local structure and higher-order information between sample points. We then use non-negative matrix factorization with orthogonality constraints, which is equivalent to K-means; this eliminates post-processing operations and avoids the suboptimal results caused by the two-step approach. In addition, the tensor Schatten p-norm is used to better capture the complementary information and low-rank structure among the cluster label matrices of multiple views. Numerous experiments verify the superiority of IMVC_HG.
Citations: 0
Self-supervised learning-based multi-source spectral fusion for fruit quality evaluation: a case study in mango fruit ripeness prediction
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 117, Article 102814. Pub Date: 2024-11-23. DOI: 10.1016/j.inffus.2024.102814
Liu Zhang, Jincun Liu, Yaoguang Wei, Dong An, Xin Ning
Abstract: Rapid and non-destructive techniques for fruit quality evaluation are of wide interest in the modern agro-industry. Spectroscopy is one of the most commonly used techniques in this field. With the growing popularity of various spectroscopic instruments, it is worthwhile to explore modeling with multi-source spectral data to achieve more accurate predictions. Nonetheless, a major challenge is acquiring enough labeled samples, as measuring fruit chemical values is laborious, expensive, and time-consuming, which hinders the development of a reliable prediction model. Therefore, this study aims to develop a model for predicting the internal chemical composition of fruits by combining multi-source spectral fusion with self-supervised learning (SSL). A visible (Vis) and near-infrared (NIR) spectral dataset for dry matter content (DMC) prediction in mango fruit is used as an example to validate the effectiveness of the proposed method. To obtain multi-source spectral data, the Vis and NIR portions are processed as two separate spectral ranges. SSL pre-training is performed on a large amount of raw unlabeled spectral data to extract general knowledge, which is subsequently transferred to a downstream task for fine-tuning. The experimental results indicate that the multi-source spectral fusion model performs better than the single-source spectral model. Moreover, SSL mitigates the data scarcity problem and outperforms non-pre-trained models in downstream DMC prediction tasks with less computational overhead. Remarkably, using less than 10% of the total samples is sufficient to achieve a performance close to 99% of the best results. The presented method has great potential in spectral analysis of food and agro-products.
Citations: 0
Graph convolutional network for compositional data
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 117, Article 102798. Pub Date: 2024-11-22. DOI: 10.1016/j.inffus.2024.102798
Shan Lu, Huiwen Wang, Jichang Zhao
Abstract: The graph convolutional network (GCN) has garnered significant attention and become a powerful tool for learning graph representations. However, when dealing with compositional data, which are prevalent in various fields, the traditional GCN faces theoretical challenges due to the intrinsic constraints of such data. This paper generalizes spectral graph theory to simplex space, aiming to address the graph structures among observations for compositional data analysis and to extend GCN by assigning compositions, as mathematical objects, to each vertex of a graph. We propose the graph Fourier transformation in simplex space, based on which a compositional graph convolutional network (CGCN) layer is introduced. This novel layer enables a GCN to appropriately capture the sample space of compositional data, allowing it to handle compositional features as model inputs. We then propose a new GCN architecture called COMP-GCN, which incorporates the CGCN layer at the initial stage. We evaluate the effectiveness of COMP-GCN through simulation studies and two real-world applications: stock networks derived from co-investors in the Chinese stock market and student social networks based on co-locations in campus activities. The results demonstrate its superior performance over competitive methods, with modest additional computational cost compared to the traditional GCN. Our findings suggest that the proposed model could inspire a new class of powerful algorithms for graph inference on compositional data by virtue of the generalization of graph convolution to simplex space.
Citations: 0
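The abstract above does not give formulas for the CGCN layer, so the following is only a minimal sketch of the standard Aitchison-geometry operations on the simplex (closure, perturbation, powering, and the centered log-ratio transform) that compositional data analysis usually builds on. The function names, and the `neighbour_mean` aggregation step in particular, are hypothetical illustrations and are not taken from the paper.

```python
import math

def closure(x):
    """Rescale positive parts so they sum to 1 (a point on the simplex)."""
    total = sum(x)
    return [v / total for v in x]

def perturb(x, y):
    """Aitchison 'addition': component-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Aitchison 'scalar multiplication': component-wise power, then closure."""
    return closure([a ** alpha for a in x])

def clr(x):
    """Centered log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def neighbour_mean(compositions, weights):
    """Weighted Aitchison mean of the compositions on neighbouring vertices.

    Hypothetical aggregation step: `weights` are assumed to be non-negative
    edge weights that sum to 1.
    """
    result = [1.0] * len(compositions[0])
    for comp, w in zip(compositions, weights):
        result = perturb(result, power(comp, w))
    return result
```

Averaging in this geometry keeps the result on the simplex, whereas an ordinary arithmetic average of log-transformed or unconstrained features would not respect the sum-to-one constraint; this is the kind of constraint the abstract refers to.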
When multi-view meets multi-level: A novel spatio-temporal transformer for traffic prediction
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 117, Article 102801. Pub Date: 2024-11-22. DOI: 10.1016/j.inffus.2024.102801
Jiaqi Lin, Qianqian Ren, Xingfeng Lv, Hui Xu, Yong Liu
Abstract: Traffic prediction is a vital aspect of Intelligent Transportation Systems with widespread applications. The main challenge is accurately modeling the complex spatial and temporal relationships in traffic data. Spatial–temporal Graph Neural Networks (GNNs) have emerged as one of the most promising methods to solve this problem. However, several key issues have not been well addressed in existing studies. First, traffic patterns have significant periodic trends, yet existing methods often overlook the importance of periodicity. Second, most methods model spatial dependencies in a static manner, which limits their ability to learn dynamic traffic patterns. Last, achieving satisfactory results for both long-term and short-term forecasting remains a challenge. To tackle these problems, this paper proposes a Multi-level Multi-view Augmented Spatio-temporal Transformer (LVSTformer) for traffic prediction, which captures spatial dependencies at three different levels (local geographic, global semantic, and pivotal nodes) along with long- and short-term temporal dependencies. Specifically, we design three spatial augmented views to explore the spatial information at these three levels. By combining the three spatial augmented views with three parallel spatial self-attention mechanisms, the model can comprehensively capture spatial dependencies at different levels. We design a gated temporal self-attention mechanism to dynamically capture long- and short-term temporal dependencies. Furthermore, a spatio-temporal context broadcasting module is introduced between two spatio-temporal layers to ensure a well-distributed allocation of attention scores, alleviating overfitting and information loss and enhancing the generalization ability and robustness of the model. Comprehensive experiments are conducted on six well-known traffic benchmarks. The results demonstrate that LVSTformer achieves state-of-the-art performance compared to competing baselines, with a maximum improvement of 4.32%.
Citations: 0
Fusion of probabilistic linguistic term sets for enhanced group decision-making: Foundations, survey and challenges
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 116, Article 102802. Pub Date: 2024-11-20. DOI: 10.1016/j.inffus.2024.102802
Xueling Ma, Xinru Han, Zeshui Xu, Rosa M. Rodríguez, Jianming Zhan
Abstract: The probabilistic linguistic term set (PLTS) provides a flexible and comprehensive approach to reflecting the qualitative information of decision makers (DMs) by fusing linguistic terms and probability distributions. This fusion makes PLTS an important focus of fuzzy decision theory. Dealing with uncertainty and ambiguity has always been a major challenge in the group decision-making (GDM) process, and PLTS provides a versatile and effective approach to address these issues. PLTS is able to represent the preferences and opinions of the DMs more accurately, thereby improving the accuracy and consistency of decision-making. Therefore, the application of PLTSs in GDM (PLTS-GDM) has attracted more and more attention and shown great potential. In this paper, we provide a comprehensive overview of the underlying theories of PLTS-GDM, the existing approaches, and the challenges they face. Specifically, we explore how PLTS uses fuzzy information systems to manage imprecise and ambiguous data to enhance the effectiveness of decision-making. In addition, through an extensive review and analysis of the current literature, we summarize the major advances in the field and identify important gaps in the existing research. Finally, we point out future research directions aimed at addressing these challenges and further advancing the application and development of PLTS-GDM. In summary, this paper provides a valuable resource for scholars and practitioners to help them understand and promote the practical applications of PLTS-GDM.
Citations: 0
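To make the idea of "fusing linguistic terms and probability distributions" concrete, here is a small sketch under stated assumptions. A PLTS is represented as a list of `(term_index, probability)` pairs over a hypothetical ordered scale (for example, indices 0 to 6 from "very poor" to "very good"). The `score` function is a probability-weighted mean of term indices, a form commonly used in the PLTS literature, and `fuse` is a plain weighted mixture of the experts' distributions chosen for illustration; neither is claimed to be the specific operator discussed in the survey.

```python
def normalize(plts):
    """Rescale probabilities to sum to 1 (handles partially specified PLTSs)."""
    total = sum(p for _, p in plts)
    return [(idx, p / total) for idx, p in plts]

def score(plts):
    """Probability-weighted mean of the linguistic term indices."""
    return sum(idx * p for idx, p in normalize(plts))

def fuse(plts_list, weights):
    """Weighted mixture of several experts' PLTSs (illustrative aggregation).

    `weights` are assumed to be non-negative expert weights that sum to 1.
    """
    fused = {}
    for plts, w in zip(plts_list, weights):
        for idx, p in normalize(plts):
            fused[idx] = fused.get(idx, 0.0) + w * p
    return sorted(fused.items())

# Two experts rate the same alternative; the first leaves 0.2 probability unassigned.
expert_a = [(4, 0.6), (5, 0.2)]
expert_b = [(3, 0.5), (4, 0.5)]
group = fuse([expert_a, expert_b], [0.5, 0.5])
```

Alternatives can then be ranked by `score(group)`; a higher value means the group's probability mass sits on better linguistic terms.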
Flare-aware cross-modal enhancement network for multi-spectral vehicle Re-identification
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 116, Article 102800. Pub Date: 2024-11-20. DOI: 10.1016/j.inffus.2024.102800
Aihua Zheng, Zhiqi Ma, Yongqi Sun, Zi Wang, Chenglong Li, Jin Tang
Abstract: Multi-spectral vehicle Re-identification (Re-ID) aims to incorporate complementary visible and infrared information to tackle the challenge of re-identifying vehicles in complex lighting conditions. However, in harsh environments, the discriminative cues in the RGB (visible) and NI (near infrared) modalities are significantly lost due to strong flare from vehicle lamps or sunlight. To handle this problem, we propose a Flare-Aware Cross-modal Enhancement Network (FACENet) to adaptively restore the flare-corrupted RGB and NI features with guidance from the flare-immunized TI (thermal infrared) spectra. First, to reduce the influence of appearance that is locally degraded by intense flare, we propose a Mutual Flare Mask Prediction (MFMP) module to jointly obtain the flare-corrupted masks in the RGB and NI modalities in a self-supervised manner. Second, to use the flare-immunized TI information to enhance the masked RGB and NI, we propose a Flare-aware Cross-modal Enhancement (FCE) module to adaptively guide feature extraction of the masked RGB and NI spectra with the prior flare-immunized knowledge from the TI spectra. Third, to mine the common semantic information of RGB and NI, and to alleviate the severe semantic loss in the NI spectra using TI, we propose a Multi-modality Consistency (MC) loss to enhance the semantic consistency among the three modalities. Finally, to evaluate the proposed FACENet in handling the intense flare problem, we contribute a new multi-spectral vehicle Re-ID dataset, named WMVEID863, with additional challenges such as motion blur, huge background changes, and especially intense flare degradation. Comprehensive experiments on both the newly collected dataset and public benchmark multi-spectral vehicle Re-ID datasets verify the superior performance of the proposed FACENet compared to state-of-the-art methods, especially in handling strong flares. The authors state that the code and dataset will be released.
Citations: 0
A hybrid opinion dynamics model with leaders and followers fusing dynamic social networks in large-scale group decision-making
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 116, Article 102799. Pub Date: 2024-11-20. DOI: 10.1016/j.inffus.2024.102799
Yufeng Shen, Xueling Ma, Muhammet Deveci, Enrique Herrera-Viedma, Jianming Zhan
Abstract:
Objectives: In this study, our goal is to enhance consensus efficiency in complex decision-making scenarios by constructing a large-scale group decision-making (LSGDM) method that integrates a dynamic social network (DSN) and opinion dynamics. To this end, we design a model that can effectively cluster experts and dynamically adjust the network structure to more accurately reflect the diversity and complexity of the actual decision-making process.
Methods: Specifically, we first design an improved Louvain algorithm based on social influence to effectively cluster participants with similar opinions into the same community. Then, we use structural hole theory to distinguish opinion leaders and followers in the community, and construct a DSN updating mechanism based on opinion disagreement and trust relationships. Finally, we combine the advantages of the DeGroot and Hegselmann–Krause (HK) models and propose a hybrid opinion dynamics (HOD) model in the LSGDM framework, referred to as DSN-HOD-LSGDM.
Findings: Experimental results demonstrate that the DSN-HOD-LSGDM model significantly enhances consensus-building efficiency across diverse decision-making communities. The model effectively tracks opinion evolution in complex networks, outperforming conventional methods in both adaptability and scalability.
Novelty: In this study, we propose an improved Louvain algorithm and a dynamic weight allocation mechanism based on an influence index, and design a personalized opinion evolution mechanism combined with structural hole theory. By fusing opinion evolution and dynamic trust, we construct a new LSGDM consensus model that realizes the dynamic adjustment of the trust relationship between individuals.
Citations: 0
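The two classical models named in the Methods paragraph can be sketched briefly. In the DeGroot model each agent replaces its opinion with a trust-weighted average of all opinions; in the Hegselmann–Krause model each agent averages only the opinions that lie within a confidence bound `eps` of its own. The abstract does not say how the paper combines the two, so the split used in `hybrid_step` (leaders follow DeGroot, followers follow HK) is purely an assumption for illustration, and all function names are hypothetical.

```python
def degroot_step(opinions, trust, i):
    """DeGroot update for agent i: trust-weighted average of all opinions.

    Row i of `trust` is assumed to be non-negative and to sum to 1.
    """
    return sum(w * x for w, x in zip(trust[i], opinions))

def hk_step(opinions, i, eps):
    """Hegselmann-Krause update: mean of opinions within eps of agent i's own."""
    close = [x for x in opinions if abs(x - opinions[i]) <= eps]
    return sum(close) / len(close)

def hybrid_step(opinions, trust, leaders, eps):
    """One synchronous round; leaders use DeGroot, followers use HK.

    The leader/follower assignment is an illustrative assumption, not the
    paper's mechanism.
    """
    return [
        degroot_step(opinions, trust, i) if i in leaders else hk_step(opinions, i, eps)
        for i in range(len(opinions))
    ]

def simulate(opinions, trust, leaders, eps, steps):
    """Run the hybrid update for a fixed number of rounds."""
    for _ in range(steps):
        opinions = hybrid_step(opinions, trust, leaders, eps)
    return opinions
```

Because `hk_step` always includes the agent's own opinion in the average, the list `close` is never empty. A dynamic social network, as described in the abstract, would additionally update `trust` between rounds; that part is omitted here.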
Multimodal sentiment analysis with unimodal label generation and modality decomposition
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 116, Article 102787. Pub Date: 2024-11-20. DOI: 10.1016/j.inffus.2024.102787
Linan Zhu, Hongyan Zhao, Zhechao Zhu, Chenwei Zhang, Xiangjie Kong
Abstract: Multimodal sentiment analysis aims to combine information from different modalities to enhance the understanding of emotions and achieve accurate prediction. However, existing methods face issues of information redundancy and modality heterogeneity during the fusion process, and common multimodal sentiment analysis datasets lack unimodal labels. To address these issues, this paper proposes a multimodal sentiment analysis approach based on unimodal label generation and modality decomposition (ULMD). This method employs a multi-task learning framework, dividing the multimodal sentiment analysis task into one multimodal task and three unimodal tasks. Additionally, a modality representation separator is introduced to decompose modality representations into modality-invariant and modality-specific representations. This approach explores the fusion between modalities and generates unimodal labels to enhance the performance of the multimodal sentiment analysis task. Extensive experiments on two public benchmark datasets demonstrate the effectiveness of this method.
Citations: 0
Has multimodal learning delivered universal intelligence in healthcare? A comprehensive survey
IF 14.7, Zone 1, Computer Science
Information Fusion, vol. 116, Article 102795. Pub Date: 2024-11-19. DOI: 10.1016/j.inffus.2024.102795
Qika Lin, Yifan Zhu, Xin Mei, Ling Huang, Jingying Ma, Kai He, Zhen Peng, Erik Cambria, Mengling Feng
Abstract: The rapid development of artificial intelligence has constantly reshaped the field of intelligent healthcare and medicine. As a vital technology, multimodal learning has increasingly garnered interest because of data complementarity, comprehensive information fusion, and great application potential. Currently, numerous researchers are dedicating their attention to this field, conducting extensive studies and constructing abundant intelligent systems. Naturally, an open question arises: has multimodal learning delivered universal intelligence in healthcare? To answer this question, we adopt three unique viewpoints for a holistic analysis. First, we conduct a comprehensive survey of the current progress of medical multimodal learning from the perspectives of datasets, task-oriented methods, and universal foundation models. Based on this survey, we further discuss the proposed question in terms of five issues to explore the real impact of advanced techniques in healthcare, from data and technologies to performance and ethics. The answer is that current technologies have NOT achieved universal intelligence and there remains a significant journey to undertake. Finally, in light of the above reviews and discussions, we point out ten potential directions for exploration to promote multimodal fusion technologies in the domain, towards the goal of universal intelligence in healthcare.
Citations: 0