Information Fusion — Latest Articles

Towards facial micro-expression detection and classification using modified multimodal ensemble learning approach
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-10 DOI: 10.1016/j.inffus.2024.102735
Fuli Zhang, Yu Liu, Xiaoling Yu, Zhichen Wang, Qi Zhang, Jing Wang, Qionghua Zhang
A micro-expression is a fleeting, subtle, and localized facial gesture. It can expose the true feelings someone is trying to hide and is considered a crucial cue for detecting deception. Because of its potential applications in a variety of sectors, micro-expression research has attracted considerable attention. However, recognition accuracy still needs improvement, because micro-expressions consist of brief, weak motions. In recent years, deep convolutional neural networks have shown high effectiveness for the complex challenge of face detection. Although several attempts have been made at micro-expression recognition (MER), the problem remains far from solved, as reflected by the low accuracy rates of existing models. This study presents a Facial Micro-Expression Detection and Classification using Modified Multimodal Ensemble Learning (FMEDC-MMEL) approach, whose main aim is the proficient identification of micro-expressions in facial images. As a pre-processing step, the technique applies histogram equalization (HE) to improve image contrast. An improved densely connected network (DenseNet) model then learns feature patterns from the pre-processed images, with stochastic gradient descent (SGD) used for the hyperparameter selection process. For facial micro-expression detection, the technique employs an ensemble of three classifiers: a bi-directional gated recurrent unit (Bi-GRU), long short-term memory (LSTM), and an extreme learning machine (ELM). The tailored ensemble combines multiple machine learning models to improve classification performance and detection accuracy; sophisticated feature extraction captures the subtle aspects of micro-expressions, while optimizations that minimize computational cost preserve precision. Empirical findings show that this methodology notably surpasses conventional techniques, providing enhanced precision and resilience on a variety of complex and demanding datasets. Beyond pushing the boundaries of micro-expression analysis research, the proposed strategy has potential real-world uses in fields including security, psychological testing, and human-computer interaction.
Information Fusion, Volume 115, Article 102735
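The pre-processing stage is standard histogram equalization. The paper's own implementation is not shown in the abstract; a minimal NumPy sketch of the classic CDF-remapping form (the function name and toy image are purely illustrative) could look like:

```python
import numpy as np

def histogram_equalization(img: np.ndarray, levels: int = 256) -> np.ndarray:
    """Classic histogram equalization: remap intensities through the image CDF."""
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()                          # cumulative pixel counts per intensity
    cdf_min = cdf[cdf > 0][0]                    # smallest non-zero CDF value
    # Rescale the CDF to span the full output range [0, levels - 1]
    lut = (cdf - cdf_min) / (cdf[-1] - cdf_min) * (levels - 1)
    lut = np.clip(np.round(lut), 0, levels - 1).astype(np.uint8)
    return lut[img]

# Toy low-contrast image: four intensities bunched in the middle of the range
img = np.array([[100, 110], [120, 130]], dtype=np.uint8)
print(histogram_equalization(img))  # spread to [[0, 85], [170, 255]]
```

In practice a library routine such as OpenCV's `cv2.equalizeHist` provides this directly; the mapping above is just the underlying idea.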
Citations: 0
A survey of evidential clustering: Definitions, methods, and applications
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-10 DOI: 10.1016/j.inffus.2024.102736
Zuowei Zhang, Yiru Zhang, Hongpeng Tian, Arnaud Martin, Zhunga Liu, Weiping Ding
In the realm of information fusion, clustering is a common subject and is extensively applied across various fields. Evidential clustering, an increasingly popular method in the soft clustering family, derives its strength from the theory of belief functions, which enables it to effectively characterize the uncertainty and imprecision of data distributions. This survey provides a comprehensive overview of evidential clustering, detailing its theoretical foundations, methodologies, and applications. Specifically, we start by briefly recalling the theory of belief functions and its transformations into other uncertainty reasoning theories. We then introduce the concepts of soft data, partitions, and methods, with an emphasis on data and partitioning within the theory of belief functions. Subsequently, we summarize the advancements and quantitative evaluations of existing evidential clustering methods and provide a roadmap for selecting an appropriate method based on specific application needs. Finally, we identify the major challenges faced in the development and application of evidential clustering and point out promising avenues for future research, including theoretical limitations, applicable datasets, and application domains. The survey offers a structured understanding of existing evidential clustering methods, highlighting their theoretical underpinnings, practical implementations, and future research directions, and serves as a valuable resource for researchers seeking to deepen their understanding of evidential clustering.
Information Fusion, Volume 115, Article 102736
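The belief-function machinery this survey builds on centers on combining mass functions. As a concrete illustration (not code from the survey itself), Dempster's classic rule of combination over frozenset focal elements can be written as:

```python
from itertools import product

def dempster_combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule: combine two mass functions keyed by frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:                                # mass flows to the intersection
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:                                    # empty intersection = conflicting evidence
            conflict += ma * mb
    # Normalize away the conflicting mass (assumes the sources are not totally conflicting)
    return {focal: mass / (1.0 - conflict) for focal, mass in combined.items()}

# Two sources over the frame {'a', 'b'}, both leaning toward 'a'
A, B, AB = frozenset('a'), frozenset('b'), frozenset('ab')
fused = dempster_combine({A: 0.6, AB: 0.4}, {A: 0.5, B: 0.2, AB: 0.3})
print(fused)  # mass on A ≈ 0.773, on B ≈ 0.091, on AB ≈ 0.136
```

Evidential clustering methods generalize this idea by assigning such mass functions over sets of clusters rather than single clusters, which is how they express imprecise cluster memberships.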
Citations: 0
Attention-guided hierarchical fusion U-Net for uncertainty-driven medical image segmentation
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-09 DOI: 10.1016/j.inffus.2024.102719
Afsana Ahmed Munia, Moloud Abdar, Mehedi Hasan, Mohammad S. Jalali, Biplab Banerjee, Abbas Khosravi, Ibrahim Hossain, Huazhu Fu, Alejandro F. Frangi
Small inaccuracies in system components or artificial intelligence (AI) models for medical imaging can have significant consequences, leading to life hazards. To mitigate those risks, one must consider the precision of image analysis outcomes (e.g., image segmentation) along with the confidence in the underlying model predictions. U-shaped architectures, based on the convolutional encoder-decoder, have established themselves as a critical component of many AI-enabled diagnostic imaging systems. However, most existing methods focus on producing accurate predictions without assessing the uncertainty associated with them. Uncertainty maps highlight areas in the predicted segmentation where the model is uncertain or less confident; this can direct radiologists' attention, help ensure patient safety, and pave the way for trustworthy AI applications. This paper therefore proposes the Attention-guided Hierarchical Fusion U-Net (AHF-U-Net) for medical image segmentation, together with an uncertainty-aware version, UA-AHF-U-Net, which provides an uncertainty map alongside the predicted segmentation map. The network integrates an Encoder Attention Fusion (EAF) module and a Decoder Attention Fusion (DAF) module on the encoder and decoder sides of the U-Net architecture, respectively. The EAF and DAF modules use spatial and channel attention to capture relevant spatial information and indicate which channels are appropriate for a given image. Furthermore, an enhanced skip connection, named the Hierarchical Attention-Enhanced (HAE) skip connection, is introduced. The model was evaluated against eleven well-established methods on three popular medical image segmentation datasets consisting of coarse-grained images with unclear boundaries; based on quantitative and qualitative results, the proposed method ranks first on two datasets and second on the third. The code can be accessed at: https://github.com/AfsanaAhmedMunia/AHF-Fusion-U-Net.
Information Fusion, Volume 115, Article 102719
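The EAF and DAF modules combine spatial and channel attention; their exact design is not reproduced here, but the generic squeeze-and-excitation style channel gate they build on (all weight shapes below are illustrative assumptions) can be sketched as:

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """SE-style channel gate on a (C, H, W) feature map.

    w1: (C // r, C) squeeze weights, w2: (C, C // r) excite weights.
    """
    squeeze = feat.mean(axis=(1, 2))                    # global average pool -> (C,)
    gate = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # bottleneck MLP -> per-channel gate in (0, 1)
    return feat * gate[:, None, None]                   # rescale each channel by its gate

# With all-zero weights the gate is sigmoid(0) = 0.5 for every channel
feat = np.ones((4, 2, 2))
out = channel_attention(feat, np.zeros((2, 4)), np.zeros((4, 2)))
print(out[0, 0, 0])  # 0.5
```

The gate learns to emphasize informative channels and suppress uninformative ones; spatial attention applies the same idea per pixel rather than per channel.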
Citations: 0
Interpretability research of deep learning: A literature survey
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-09 DOI: 10.1016/j.inffus.2024.102721
Biao Xu, Guanci Yang
Deep learning (DL) has been widely used in various fields. However, its black-box nature limits people's understanding of, and trust in, its decision-making process, making research into DL interpretability — which can elucidate a model's decision-making processes and behaviors — crucial. This review provides an overview of the current state of interpretability research. First, typical DL models, principles, and applications are introduced. Then, the definition and significance of interpretability are clarified. Subsequently, typical interpretability algorithms are grouped into four categories: active, passive, supplementary, and integrated explanations. After that, several evaluation indicators for interpretability are briefly described, and the relationship between interpretability and model performance is explored. Next, specific applications of interpretability methods and models in real scenarios are introduced. Finally, the challenges and future directions of interpretability research are discussed.
Information Fusion, Volume 115, Article 102721
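Among the passive explanation methods such surveys cover, perturbation-based attribution is one of the simplest to state: occlude a region and measure how much the model's score drops. A toy sketch (the one-pixel "model" and patch size are illustrative, not from the survey):

```python
import numpy as np

def occlusion_map(model, img: np.ndarray, patch: int = 2, baseline: float = 0.0) -> np.ndarray:
    """Attribute the model's score to image regions by occluding one patch at a time."""
    base_score = model(img)
    heat = np.zeros(img.shape, dtype=float)
    for i in range(0, img.shape[0], patch):
        for j in range(0, img.shape[1], patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = baseline  # blank out one patch
            heat[i:i + patch, j:j + patch] = base_score - model(occluded)
    return heat                                            # large score drop = important region

# Toy "model" that only reads the top-left pixel: only that patch matters
model = lambda x: float(x[0, 0])
print(occlusion_map(model, np.ones((4, 4)), patch=2))
```

The resulting heat map is a model-agnostic explanation: it needs only forward passes, which is why such methods fall in the survey's "passive" category.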
Citations: 0
Segmentation of acute ischemic stroke lesions based on deep feature fusion
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-03 DOI: 10.1016/j.inffus.2024.102724
Linfeng Li, Jiayang Liu, Shanxiong Chen, Jingjie Wang, Yongmei Li, Qihua Liao, Lin Zhang, Xihua Peng, Xu Pu
Acute ischemic stroke (AIS) is a common brain disease worldwide, and diagnosing it requires effectively utilizing information from multiple Computed Tomography Perfusion (CTP) maps. To our knowledge, most methods either process each CTP map independently or fail to fully exploit medical prior information when integrating them. Considering the characteristics of AIS lesions, we propose a method for efficient information fusion of CTP maps that achieves accurate segmentation results. Our Window Multi-Head Cross-Attention Net (WMHCA-Net) employs a multi-path U-shaped architecture for encoding and decoding. After encoding, multiple independent windowed cross-attention modules deeply integrate information from the different maps; during decoding, a Channel Cross-Attention (CCA) module enhances information recovery during upsampling. A segmentation optimization module further refines low-resolution segmentation results, improving overall performance. Experimental results demonstrate that the proposed method is well balanced and excels across multiple metrics, providing more accurate AIS lesion segmentation to assist doctors in evaluating patient conditions. Our code is available at https://github.com/MTVLab/WMHCA-Net.
Information Fusion, Volume 114, Article 102724
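WMHCA-Net's windowed multi-head variant is not reproduced here, but its core building block — single-head scaled dot-product cross-attention, where queries come from one CTP map's features and keys/values from another's — can be sketched as:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Single-head cross-attention: queries from one modality, keys/values from another."""
    d = q.shape[-1]
    weights = softmax(q @ k.T / np.sqrt(d))  # (Nq, Nk) affinities across the two feature sets
    return weights @ v                       # each query becomes a weighted mix of the values

# Zero queries give uniform attention, so each output row is the mean of the value rows
q, k = np.zeros((2, 4)), np.ones((3, 4))
v = np.array([[1.0, 3.0], [3.0, 5.0], [5.0, 7.0]])
print(cross_attention(q, k, v))  # each row is [3. 5.]
```

Windowing, as the paper's name suggests, presumably restricts the (Nq, Nk) affinity matrix to local spatial windows, which keeps the cost of attending across maps manageable at image resolution.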
Citations: 0
AtCAF: Attention-based causality-aware fusion network for multimodal sentiment analysis
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-02 DOI: 10.1016/j.inffus.2024.102725
Changqin Huang, Jili Chen, Qionghao Huang, Shijin Wang, Yaxin Tu, Xiaodi Huang
Multimodal sentiment analysis (MSA) interprets sentiment using multiple sensory data modalities. Traditional MSA models often overlook causality between modalities, resulting in spurious correlations and ineffective cross-modal attention. To address these limitations, we propose the Attention-based Causality-Aware Fusion (AtCAF) network, designed from a causal perspective. To capture a causality-aware representation of text, we introduce the Causality-Aware Text Debiasing Module (CATDM), which applies the front-door adjustment. Furthermore, we employ a Counterfactual Cross-modal Attention (CCoAt) module to integrate causal information during modal fusion, improving aggregation quality by incorporating more causality-aware cues. AtCAF achieves state-of-the-art performance across three datasets, with significant improvements in both standard and out-of-distribution (OOD) settings: it outperforms existing models by 1.5% in ACC-2 on the CMU-MOSI dataset, by 0.95% in ACC-7 on the CMU-MOSEI dataset under normal conditions, and by 1.47% under OOD conditions. CATDM improves category cohesion in feature space, while CCoAt accurately classifies ambiguous samples through context filtering. Overall, AtCAF offers a robust solution for social media sentiment analysis, delivering reliable insights by effectively addressing data imbalance. The code is available at https://github.com/TheShy-Dream/AtCAF.
Information Fusion, Volume 114, Article 102725
Citations: 0
Prompt-guided image color aesthetics assessment: Models, datasets and benchmarks
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-10-01 DOI: 10.1016/j.inffus.2024.102706
Shuai He, Yi Xiao, Anlong Ming, Huadong Ma
Image color aesthetics assessment (ICAA) aims to assess color aesthetics based on human perception, which is crucial for applications such as imaging measurement and image analysis. Previous methods are limited to holistic evaluation, which hinders their ability to offer explainability from multiple perspectives. Moreover, existing ICAA datasets often lack multi-attribute annotations beyond holistic scores, which are needed to supervise training or to validate models' multi-perspective assessment capabilities, limiting their ability to generalize effectively. To advance ICAA research: (1) We propose an "all-in-one" model, the Prompt-Guided Delegate Transformer (Prompt-DeT), which uses dedicated prompt strategies and an Aesthetic Adapter (Aes-Adapter) to exploit the rich visual-language prior embedded in large pre-trained vision-language models. It enhances the model's perception of multiple attributes, enables impressive zero-shot and fine-tuned performance on sub-attribute tasks, and even supports user-customized scenarios. (2) We construct a color-oriented dataset, ICAA20K, containing 20K images annotated along 6 dimensions to support both holistic and sub-attribute ICAA tasks. (3) We develop a comprehensive benchmark of 17 methods — the most extensive to date — across four datasets (ICAA20K, ICAA17K, SPAQ, and PARA) for evaluating the holistic and sub-attribute performance of ICAA methods. Our work not only achieves state-of-the-art (SOTA) performance but also offers the community a roadmap for exploring ICAA solutions. The code and dataset are available at https://github.com/woshidandan/Image-Color-Aesthetics-Assessment/blob/main/Refine-for-ICAA.md.
Information Fusion, Volume 114, Article 102706
Citations: 0
Depression recognition using high-order generalized multilayer brain functional network fused with EEG multi-domain information
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-09-30 DOI: 10.1016/j.inffus.2024.102723
Shanshan Qu, Dixin Wang, Chang Yan, Na Chu, Zhigang Li, Gang Luo, Huayu Chen, Xuesong Liu, Xuan Zhang, Qunxi Dong, Xiaowei Li, Shuting Sun, Bin Hu
Major Depressive Disorder (MDD) is a serious and highly heterogeneous psychological disorder. According to the network hypothesis, depression originates from abnormal neural network information processing, typically producing aberrant changes in the topological structure of the brain's functional network. Recent evidence further reveals that depression involves dynamic changes related to both within- and cross-frequency coupling. We therefore use second-order tensor expansion to integrate frequency- and time-varying multilayer brain functional networks based on node sharing, and propose a generalized multilayer brain functional network (GMBFN) that incorporates multi-domain information. Concurrently, we derive global and local topological properties from both the frequency and temporal domains to characterize the novel network structure. To uncover more reliable biomarkers and explore coupling features that assess signal interactions from different perspectives, we conduct experiments on two datasets using four sets of within- and cross-frequency coupling. Leveraging the novel multi-domain high-order GMBFNs, we observe abnormal information-integration ability in patients with MDD, particularly in the theta band and in the overall temporal domain. By fusing topological properties across both domains with multiple classifiers, the alpha band emerges as a potential biomarker for depression identification. More importantly, combining global topological properties from both domains improves classification performance for identifying patients with MDD by 5.18% on average compared with using a single domain. This study presents a systematic framework for understanding the aberrant mechanisms of MDD from multiple perspectives, offering significant value for clinical applications that assist depression diagnosis and intervention.
Information Fusion, Volume 114, Article 102723
Citations: 0
Image colorization: A survey and dataset
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-09-30 DOI: 10.1016/j.inffus.2024.102720
Saeed Anwar, Muhammad Tahir, Chongyi Li, Ajmal Mian, Fahad Shahbaz Khan, Abdul Wahab Muzaffar
Image colorization estimates RGB colors for grayscale images or video frames to improve their aesthetic and perceptual quality. Over the last decade, deep learning techniques for image colorization have progressed significantly, necessitating a systematic survey and benchmarking of these techniques. This article presents a comprehensive survey of recent state-of-the-art deep learning-based image colorization techniques, describing their fundamental block architectures, inputs, optimizers, loss functions, training protocols, training data, and more. It categorizes existing colorization techniques into seven classes and discusses important factors governing their performance, such as benchmark datasets and evaluation metrics. We highlight the limitations of existing datasets and introduce a new dataset specific to colorization, then perform an extensive experimental evaluation of existing image colorization methods using both existing datasets and our proposed one. Finally, we discuss the limitations of existing methods and recommend possible solutions and future research directions for this rapidly evolving topic of deep image colorization. The dataset and evaluation code are publicly available at https://github.com/saeed-anwar/ColorSurvey.
Information Fusion, Volume 114, Article 102720
Citations: 0
Deep learning techniques for hand vein biometrics: A comprehensive review
IF 14.7 · Q1 (Computer Science)
Information Fusion Pub Date: 2024-09-27 DOI: 10.1016/j.inffus.2024.102716
Mustapha Hemis, Hamza Kheddar, Sami Bourouis, Nasir Saleem
Biometric authentication has garnered significant attention as a secure and efficient method of identity verification. Among the various modalities, hand vein biometrics — including finger vein, palm vein, and dorsal hand vein recognition — offer unique advantages: high accuracy, low susceptibility to forgery, and non-intrusiveness. Vein patterns within the hand are highly complex and distinct for each individual, making them an ideal biometric identifier. Hand vein recognition is also contactless, enhancing user convenience and hygiene compared with modalities such as fingerprint or iris recognition. Furthermore, because the veins are internally located, they are less susceptible to damage or alteration, further enhancing the security and reliability of the biometric system. The combination of these factors makes hand vein biometrics a highly effective and secure method for identity verification.

This review delves into the latest advancements in deep learning techniques applied to finger vein, palm vein, and dorsal hand vein recognition. It covers the essential fundamentals of hand vein biometrics, summarizes publicly available datasets, and discusses the state-of-the-art metrics used to evaluate the three modes. Moreover, it provides a comprehensive overview of proposed approaches for finger, palm, dorsal, and multimodal vein techniques, offering insights into the best performance achieved, data augmentation techniques, and effective transfer learning methods, along with associated pretrained deep learning models. Finally, the review addresses open research challenges and outlines future directions and perspectives, encouraging researchers to enhance existing methods and propose innovative techniques.
Information Fusion, Volume 114, Article 102716
Citations: 0