Information Fusion: Latest Articles

Depth cue fusion for event-based stereo depth estimation
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-24 DOI: 10.1016/j.inffus.2024.102891
Dipon Kumar Ghosh, Yong Ju Jung
Abstract: Inspired by the biological retina, event cameras utilize dynamic vision sensors to capture pixel intensity changes asynchronously. Event cameras offer numerous advantages, such as high dynamic range, high temporal resolution, less motion blur, and low power consumption. These features make event cameras particularly well-suited for depth estimation, especially in challenging scenarios involving rapid motion and high dynamic range imaging conditions. The human visual system perceives scene depth by combining multiple depth cues such as monocular pictorial depth, stereo depth, and motion parallax. However, most existing algorithms for event-based depth estimation utilize only a single depth cue, either stereo depth or monocular depth. While it is feasible to estimate depth from a single cue, estimating dense disparity in challenging scenarios and lighting conditions remains a difficult problem. Following this, we conduct extensive experiments to explore various methods for depth cue fusion. Inspired by the experimental results, we propose a fusion architecture that systematically incorporates multiple depth cues for event-based stereo depth estimation. To this end, we propose a depth cue fusion (DCF) network that fuses multiple depth cues by utilizing a novel fusion method called SpadeFormer. The proposed SpadeFormer is a fully context-aware fusion mechanism, which incorporates two modulation techniques (i.e., spatially adaptive denormalization (Spade) and cross-attention) for depth cue fusion in a transformer block. The adaptive denormalization modulates both input features by adjusting the global statistics of features in a cross manner, and the modulated features are further fused by the cross-attention technique. Experiments conducted on a real-world dataset show that our method reduces the one-pixel error rate by at least 47.63% (3.708 for the best existing method vs. 1.942 for ours) and the mean absolute error by 40.07% (0.302 for the best existing method vs. 0.181 for ours). The results reveal that the depth cue fusion method outperforms the state-of-the-art methods by significant margins and produces better disparity maps.
Citations: 0
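The relative error reductions quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below is only a sanity check on the four reported numbers; the helper name `relative_reduction` is an illustrative assumption and does not come from the paper.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` improves on `baseline` (lower is better)."""
    return (baseline - ours) / baseline * 100.0


# Figures as reported in the abstract: best existing method vs. proposed method.
one_pixel = relative_reduction(3.708, 1.942)  # one-pixel error rate
mae = relative_reduction(0.302, 0.181)        # mean absolute error

print(f"one-pixel error reduction: {one_pixel:.2f}%")
print(f"mean absolute error reduction: {mae:.2f}%")
```

Both values round to the percentages stated in the abstract (47.63% and 40.07%), so the reported reductions are consistent with the raw error figures.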
Minimum adjustment consensus model for multi-person multi-criteria large scale decision-making with trust consistency propagation and opinion dynamics
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-23 DOI: 10.1016/j.inffus.2024.102883
Xi-Yu Wang, Ying-Ming Wang
Abstract: The consensus reaching process (CRP) is a multi-round dynamic method essential for harmonizing the interests of multiple parties. With the rise of instant messaging and social media, the complexity of individual social trust networks and structures has increased. Therefore, it is crucial to explore the inherent value of trust networks in the context of multi-person multi-criteria large-scale decision-making (MpMcLSDM) to facilitate consensus. This paper develops a minimum adjustment consensus model (MACM) for MpMcLSDM based on social trust network analysis (STNA). First, the consistency path rule and personal traits are defined through STNA, leading to a formulated strategy for the completion of the trust relationship. Subsequently, a novel centrality measure, informed by the consistency path rule, is proposed, and a weight method is devised to determine decision-maker (DM) weights and sub-cluster weights after clustering. This paper further elucidates the implications of consensus level fluctuations on DM self-confidence and opinion inclination. Ultimately, a MACM is constructed within the MpMcLSDM framework, integrating opinion dynamics. A numerical example demonstrates the model’s effectiveness, and comparisons with other methods show its rationale and improvement in performance.
Citations: 0
A comprehensive survey of large language models and multimodal large language models in medicine
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-23 DOI: 10.1016/j.inffus.2024.102888
Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang
Abstract: Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have attracted widespread attention for their exceptional capabilities in understanding, reasoning, and generation, introducing transformative paradigms for integrating artificial intelligence into medicine. This survey provides a comprehensive overview of the development, principles, application scenarios, challenges, and future directions of LLMs and MLLMs in medicine. Specifically, it begins by examining the paradigm shift, tracing the transition from traditional models to LLMs and MLLMs, and highlighting the unique advantages of these LLMs and MLLMs in medical applications. Next, the survey reviews existing medical LLMs and MLLMs, providing detailed guidance on their construction and evaluation in a clear and systematic manner. Subsequently, to underscore the substantial value of LLMs and MLLMs in healthcare, the survey explores five promising applications in the field. Finally, the survey addresses the challenges confronting medical LLMs and MLLMs and proposes practical strategies and future directions for their integration into medicine. In summary, this survey offers a comprehensive analysis of the technical methodologies and practical clinical applications of medical LLMs and MLLMs, with the goal of bridging the gap between these advanced technologies and clinical practice, thereby fostering the evolution of the next generation of intelligent healthcare systems.
Citations: 0
Text-guided multimodal depression detection via cross-modal feature reconstruction and decomposition
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-22 DOI: 10.1016/j.inffus.2024.102861
Ziqiang Chen, Dandan Wang, Liangliang Lou, Shiqing Zhang, Xiaoming Zhao, Shuqiang Jiang, Jun Yu, Jun Xiao
Abstract: Depression, a widespread and debilitating mental health disorder, requires early detection to facilitate effective intervention. Automated depression detection integrating audio with text modalities is a challenging yet significant issue due to the information redundancy and inter-modal heterogeneity across modalities. Prior works usually fail to fully learn the interaction of audio–text modalities for depression detection in an explicit manner. To address these issues, this work proposes a novel text-guided multimodal depression detection method based on a cross-modal feature reconstruction and decomposition framework. The proposed method takes the text modality as the core modality to guide the model to reconstruct comprehensive audio features for cross-modal feature decomposition tasks. Moreover, the designed cross-modal feature reconstruction and decomposition framework aims to disentangle the shared and private features from the text-guided reconstructed comprehensive audio features for subsequent multimodal fusion. In addition, a bi-directional cross-attention module is designed to interactively learn simultaneous and mutual correlations across modalities for feature enhancement. Extensive experiments are performed on the DAIC-WoZ and E-DAIC datasets, and the results show the superiority of the proposed method on multimodal depression detection tasks, outperforming the state of the art.
Citations: 0
ORC-GNN: A novel open set recognition based on graph neural network for multi-class classification of psychiatric disorders
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-21 DOI: 10.1016/j.inffus.2024.102887
Yaqin Li, Yihong Dong, Shoubo Peng, Linlin Gao, Yu Xin
Abstract: Open-set recognition (OSR) refers to the challenge of introducing classes not seen during model training into the test set. This issue is particularly critical in the medical field due to incomplete data collection and the continuous emergence of new and rare diseases. Medical OSR techniques necessitate not only the accurate classification of known cases but also the ability to detect unknown cases and send the corresponding information to experts for further diagnosis. However, there is a significant research gap in the current medical OSR field, which lacks not only research methods for OSR in psychiatric disorders but also detailed procedures for OSR evaluation based on neuroimaging. To address the challenges associated with the OSR of psychiatric disorders, we propose a method named the open-set risk collaborative consistency graph neural network (ORC-GNN). First, functional connectivity (FC) is used to extract measurable representations in the deep feature space by coordinating hemispheric and whole-brain networks, thereby achieving multi-level brain network feature fusion and regional communication. Subsequently, these representations are used to guide the model to adaptively learn the decision boundaries for known classes using instance-level density awareness and to identify samples outside these boundaries as unknown. We introduce a novel open-risk margin loss (ORML) to balance empirical risk and open-space risk; this approach makes open-space risk quantifiable through the introduction of an open-risk term. We evaluate our method using an integrated multi-class dataset and a tailored experimental protocol suited for psychiatric disorder-related OSR challenges. Compared to state-of-the-art techniques, ORC-GNN demonstrates significant performance improvements and yields important clinically interpretative information regarding the shared and distinct characteristics of multiple psychiatric disorders.
Citations: 0
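The general open-set idea in the abstract above (classify known cases, flag anything outside the known-class boundaries as unknown) can be illustrated with a very small sketch. This is not ORC-GNN or its ORML loss: it assumes fixed class centroids and hand-set per-class radii in a 2-D feature space, whereas the paper learns its boundaries adaptively. The names `classify_open_set`, `centroids`, and `radii`, and the class labels, are hypothetical.

```python
import math


def classify_open_set(x, centroids, radii):
    """Return the nearest known class, or "unknown" if x lies outside its radius.

    centroids: dict mapping label -> feature vector (assumed given, not learned)
    radii:     dict mapping label -> acceptance radius (assumed hand-set)
    """
    best_label, best_dist = None, math.inf
    for label, centre in centroids.items():
        d = math.dist(x, centre)  # Euclidean distance
        if d < best_dist:
            best_label, best_dist = label, d
    if best_label is None or best_dist > radii[best_label]:
        return "unknown"  # rejected: outside every known-class boundary
    return best_label


# Toy 2-D feature space with two known classes (illustrative values only).
centroids = {"class_a": (0.0, 0.0), "class_b": (10.0, 0.0)}
radii = {"class_a": 2.0, "class_b": 2.0}

near_a = classify_open_set((1.0, 0.0), centroids, radii)
near_b = classify_open_set((9.0, 1.0), centroids, radii)
far = classify_open_set((5.0, 0.0), centroids, radii)
```

The point of the sketch is the rejection branch: a closed-set classifier would be forced to assign the midpoint sample to one of the two known classes, while the open-set rule returns "unknown" so the case can be referred for further diagnosis.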
Enhanced detection of early Parkinson’s disease through multi-sensor fusion on smartphone-based IoMT platforms
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-21 DOI: 10.1016/j.inffus.2024.102889
Tongyue He, Junxin Chen, M. Shamim Hossain, Zhihan Lyu
Abstract: To date, Parkinson’s disease (PD) is an incurable neurological disorder, and the period of good-quality life can only be extended through early detection and timely intervention. However, the symptoms of early PD are both heterogeneous and subtle. To cope with these challenges, we develop a two-level fusion framework for smart healthcare, leveraging smartphones interconnected with the Internet of Medical Things and exploring the contribution of multi-sensor and multi-activity data. Rotation rate and acceleration during walking activity are recorded with the gyroscope and accelerometer, while location coordinates and acceleration during tapping activity are collected via the touch screen and accelerometer, and voice signals are captured by the microphone. The main scientific contribution is the enhanced fusion of multi-sensor information to cope with the heterogeneous and subtle nature of early PD symptoms, achieved by a first-level component that fuses features within a single activity using an attention mechanism and a second-level component that dynamically allocates weights across activities. Compared with related works, the proposed framework explores the potential of fusing multi-sensor data within a single activity, and mines the importance of different activities that correspond to early PD symptoms. The proposed two-level fusion framework achieves an AUC of 0.891 (95% CI, 0.860–0.921) and a sensitivity of 0.950 (95% CI, 0.888–1.000) in early PD detection, demonstrating that it efficiently fuses information from different sensor data for various activities and has a strong fault tolerance for data.
Citations: 0
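The two-level structure described in the abstract above (fuse sensor features within each activity, then weight the activities against each other) can be sketched as two nested weighted sums. This is a toy illustration under strong assumptions, not the authors' network: the relevance scores are hand-supplied rather than learned, a plain softmax stands in for the attention mechanism, and the names `fuse_two_level`, `sensor_scores`, and `activity_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(vectors, weights):
    """Element-wise weighted sum of equally sized feature vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]


def fuse_two_level(activities, sensor_scores, activity_scores):
    """Level 1: fuse sensor features inside each activity.
    Level 2: fuse the per-activity results across activities."""
    names = list(activities)
    per_activity = [
        weighted_sum(activities[n], softmax(sensor_scores[n])) for n in names
    ]
    return weighted_sum(per_activity, softmax([activity_scores[n] for n in names]))


# Illustrative 2-D features per sensor; all scores equal, so weights are uniform.
activities = {
    "walking": [[1.0, 0.0], [0.0, 1.0]],  # gyroscope, accelerometer
    "tapping": [[2.0, 2.0], [0.0, 0.0]],  # touch screen, accelerometer
    "voice": [[3.0, 1.0]],                # microphone
}
sensor_scores = {"walking": [0.0, 0.0], "tapping": [0.0, 0.0], "voice": [0.0]}
activity_scores = {"walking": 0.0, "tapping": 0.0, "voice": 0.0}

fused = fuse_two_level(activities, sensor_scores, activity_scores)
```

With uniform scores the result is simply the mean of the per-activity means; raising one activity's score shifts the fused vector toward that activity, which is the behaviour the second-level weighting is meant to provide.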
Hierarchical disturbance and Group Inference for video-based visible-infrared person re-identification
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-21 DOI: 10.1016/j.inffus.2024.102882
Chuhao Zhou, Yuzhe Zhou, Tingting Ren, Huafeng Li, Jinxing Li, Guangming Lu
Abstract: Video-based Visible-Infrared person Re-identification (VVI-ReID) is challenging due to the large inter-view and inter-modal discrepancies. To alleviate these discrepancies, most existing works focus only on whole images, while more id-related partial information is ignored. Furthermore, the inference decision is commonly based on the similarity of two samples. However, the semantic gap between the query and gallery samples inevitably exists due to their inter-view misalignment, regardless of whether the modality gap is removed. In this paper, we propose a Hierarchical Disturbance (HD) and Group Inference (GI) method to handle the aforementioned issues. Specifically, the HD module models the inter-view and inter-modal discrepancies as multiple image styles, and conducts feature disturbances by partially transferring body styles. By hierarchically taking the partial and global features into account, our model is capable of adaptively achieving invariant but identity-related features. Additionally, instead of establishing similarity between the query sample and each gallery sample independently, the GI module is further introduced to extract complementary information from all potential intra-class gallery samples of the given query sample, which boosts the performance on matching hard samples. Extensive experiments substantiate the superiority of our method compared with the state of the art.
Citations: 0
FLEX: Flexible Federated Learning Framework
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-20 DOI: 10.1016/j.inffus.2024.102792
F. Herrera, D. Jiménez-López, A. Argente-Garrido, N. Rodríguez-Barroso, C. Zuheros, I. Aguilera-Martos, B. Bello, M. García-Márquez, M.V. Luzón
Abstract: In the realm of Artificial Intelligence (AI), the need for privacy and security in data processing has become paramount. As AI applications continue to expand, the collection and handling of sensitive data raise concerns about individual privacy protection. Federated Learning (FL) emerges as a promising solution to address these challenges by enabling decentralized model training on local devices, thus preserving data privacy. This paper introduces FLEX: a FLEXible Federated Learning Framework designed to provide maximum flexibility in FL research experiments and the possibility to deploy federated solutions. By offering customizable features for data distribution, privacy parameters, and communication strategies, FLEX empowers researchers to innovate and develop novel FL techniques. It also provides a distributed version that allows experiments to be deployed on different devices. The framework also includes libraries for specific FL implementations including: (1) anomalies, (2) blockchain, (3) adversarial attacks and defenses, (4) natural language processing and (5) decision trees, enhancing its versatility and applicability in various domains. Overall, FLEX represents a significant advancement in FL research and deployment, facilitating the development of robust and efficient FL applications.
Citations: 0
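For readers unfamiliar with federated learning, the aggregation step that makes decentralized training possible can be sketched in a few lines. The example below shows federated averaging (FedAvg), a standard FL aggregation rule, in plain Python; it does not use or reproduce the FLEX API, and the function name `federated_average` and the toy numbers are assumptions made for illustration.

```python
def federated_average(client_weights, client_sizes):
    """Average client model parameters, weighted by local dataset size.

    client_weights: list of parameter vectors, one per client
    client_sizes:   number of local training samples held by each client
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two clients train locally and send only their parameters, never their data.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]  # the second client holds three times as much data

global_weights = federated_average(client_weights, client_sizes)
```

Only the parameter vectors leave each device, which is the privacy property the abstract refers to; the server never sees the raw local data.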
Analysis of Expressed and Private Opinions (EPOs) models: Improving self-cognitive dissonance and releasing cumulative pressure in group decision-making systems
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-20 DOI: 10.1016/j.inffus.2024.102881
Jianglin Dong, Yiyi Zhao, Haixia Mao, Ya Yin, Jiangping Hu
Abstract: For group decision-making problems, the existing expressed and private opinions (EPOs) models focus on analyzing the limiting discrepancy between agents’ EPOs and the disagreement among agents’ private opinions under social pressure. However, they fail to consider the self-cognitive dissonance phenomenon arising from the discrepancy between agents’ EPOs or agents’ mismatched opinions and behaviors, as well as the impact of the cumulative pressure. This study proposes a novel EPOs model that updates private opinions by inferring the private opinions of social neighbors from their explicit behaviors, whereas expressed opinions are updated by minimizing current social pressure. The proposed prevention and remedy mechanisms effectively address agents’ self-cognitive dissonance from different psychological perspectives. Additionally, to realize the release of the cumulative pressure, two threshold models grounded in the concepts of the self-persuasion and liberating effects in psychology are presented. The simulation results indicate that the proposed EPOs model effectively avoids self-cognitive dissonance in a real social network. Finally, after the release of the cumulative pressure, the group EPOs will achieve a consensus under the self-persuasion effect or polarization under the liberating effect, demonstrating the feasibility and applicability of the proposal.
Citations: 0
InferTrans: Hierarchical structural fusion transformer for crowded human pose estimation
IF 18.6, Zone 1, Computer Science
Information Fusion Pub Date : 2024-12-20 DOI: 10.1016/j.inffus.2024.102878
Muyu Li, Yingfeng Wang, Henan Hu, Xudong Zhao
Abstract: Human pose estimation in crowded scenes presents unique challenges due to frequent occlusions and complex interactions between individuals. To address these issues, we introduce InferTrans, a hierarchical structural fusion Transformer designed to improve crowded human pose estimation. InferTrans integrates semantic features into structural information using a hierarchical joint-limb-semantic fusion module. By reorganizing joints and limbs into a tree structure, the fusion module facilitates effective information exchange across different structural levels, and leverages both global structural information and local contextual details. Furthermore, we explicitly model limb structural patterns separately from joints, treating limbs as vectors with defined lengths and orientations. This allows our model to infer complete human poses from minimal input, significantly enhancing pose refinement tasks. Extensive experiments on multiple datasets demonstrate that InferTrans outperforms existing pose estimation techniques in crowded and occluded scenarios. The proposed InferTrans serves as a robust post-processing technique, and is capable of improving the accuracy and robustness of pose estimation in challenging environments.
Citations: 0
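The abstract above describes limbs as vectors with a length and an orientation. A minimal 2-D sketch of that representation follows, showing why it helps with occlusion: once a limb's length and angle are known, a hidden joint can be recovered from its visible parent. This is plain geometry rather than the InferTrans model, and the function names and joint coordinates are hypothetical.

```python
import math


def limb_from_joints(parent, child):
    """Describe the limb between two 2-D joints as (length, orientation in radians)."""
    dx, dy = child[0] - parent[0], child[1] - parent[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)


def joint_from_limb(parent, length, angle):
    """Recover the child joint from its parent joint and the limb vector."""
    return (parent[0] + length * math.cos(angle),
            parent[1] + length * math.sin(angle))


# Illustrative coordinates: a visible shoulder and the elbow attached to it.
shoulder, elbow = (0.0, 0.0), (3.0, 4.0)
length, angle = limb_from_joints(shoulder, elbow)

# If the elbow were occluded, the limb vector alone would place it again.
recovered = joint_from_limb(shoulder, length, angle)
```

Walking such limb vectors down a joint tree from a root joint reconstructs the whole skeleton, which is one way to read the abstract's claim about inferring complete poses from minimal input.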