Pattern Recognition — Latest Articles

Do it yourself dynamic single image super resolution network via ODE
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.111987
Xiao Zhang, Zhen Zhang, Wei Wei, Lei Zhang, Yanning Zhang
Abstract: Single Image Super Resolution (SISR) aims at characterizing fine-grained information given a low-resolution image. Recent progress shows that SISR can be viewed as a dynamic process that can be modeled using Ordinary Differential Equations (ODEs). As a result, ODE-inspired neural networks show superior performance with a limited number of parameters, as well as an interpretable network structure. However, current ODE-based approaches restrict the neural network to a static single-branch residual structure, whereas dynamic structures can adaptively adjust their parameters (or even structures) to best suit each test image and lead to better SISR performance. To take advantage of both ODEs and dynamic network structures, we introduce the implicit Runge–Kutta scheme to construct an ODE-inspired multi-branch residual module that serves as a basic module and helps capture information at different scales. An attention module is then applied to the weights of the implicit Runge–Kutta scheme to obtain a new dynamic network module, which is equivalent to encouraging the different branches to jointly attend to different positions for the best performance. Experiments demonstrate that our approach outperforms state-of-the-art ODE-inspired methods with a smaller or comparable number of parameters.
Citations: 0
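For illustration, a minimal PyTorch-style sketch of a multi-branch residual block whose branch weights are produced by an attention module, in the spirit of the attention-modulated Runge–Kutta weights described in the abstract above; the class name, branch count, and attention design are assumptions for the example, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class RKAttentionResidualBlock(nn.Module):
    """Multi-branch residual block: x_out = x + sum_i w_i(x) * f_i(x).

    The per-branch weights w_i play the role of attention-modulated
    Runge-Kutta coefficients; this is an illustrative sketch only.
    """
    def __init__(self, channels: int, num_branches: int = 3):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            for _ in range(num_branches)
        ])
        # Global-pooling attention that predicts one weight per branch.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(channels, num_branches),
            nn.Softmax(dim=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = self.attn(x)                      # (B, num_branches)
        out = x
        for i, branch in enumerate(self.branches):
            w = weights[:, i].view(-1, 1, 1, 1)     # broadcast per sample
            out = out + w * branch(x)
        return out

if __name__ == "__main__":
    block = RKAttentionResidualBlock(channels=64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```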
SceneLLM: Implicit language reasoning in LLM for dynamic scene graph generation
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.111992
Hang Zhang, Zhuoling Li, Jun Liu
Abstract: Dynamic scenes contain intricate spatio-temporal information, crucial for mobile robots, UAVs, and autonomous driving systems to make informed decisions. Parsing these scenes into semantic triplets ⟨Subject-Predicate-Object⟩ for accurate Scene Graph Generation (SGG) is highly challenging due to the fluctuating spatio-temporal complexity. Inspired by the reasoning capabilities of Large Language Models (LLMs), we propose SceneLLM, a novel framework that leverages LLMs as powerful scene analyzers for dynamic SGG. Our framework introduces a Video-to-Language (V2L) mapping module that transforms video frames into linguistic signals (scene tokens), making the input more comprehensible for LLMs. To better encode spatial information, we devise a Spatial Information Aggregation (SIA) scheme, inspired by the structure of Chinese characters, which encodes spatial data into tokens. Using Optimal Transport (OT), we generate an implicit language signal from the frame-level token sequence that captures the video's spatio-temporal information. To further improve the LLM's ability to process this implicit linguistic input, we apply Low-Rank Adaptation (LoRA) to fine-tune the model. Finally, we use a transformer-based SGG predictor to decode the LLM's reasoning and predict semantic triplets. Our method achieves state-of-the-art results on the Action Genome (AG) benchmark, and extensive experiments show the effectiveness of SceneLLM in understanding and generating accurate dynamic scene graphs.
Citations: 0
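As an illustration of the Low-Rank Adaptation step mentioned in the abstract above, a minimal PyTorch sketch of a LoRA-wrapped linear layer (frozen base weight plus a trainable low-rank update); the rank, scaling, and wrapping strategy are generic defaults assumed for the example, not SceneLLM's actual configuration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B(A(x)).  Only A and B receive gradients."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained weight
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.normal_(self.lora_a.weight, std=0.02)
        nn.init.zeros_(self.lora_b.weight)   # the low-rank update starts at zero
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * self.lora_b(self.lora_a(x))

if __name__ == "__main__":
    layer = LoRALinear(nn.Linear(768, 768))
    out = layer(torch.randn(4, 768))
    print(out.shape)  # torch.Size([4, 768])
```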
Beyond conformal predictors: Adaptive Conformal Inference with confidence predictors
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.111999
Johan Hallberg Szabadváry, Tuwe Löfström
Abstract: Adaptive Conformal Inference (ACI) provides finite-sample coverage guarantees, enhancing prediction reliability under non-exchangeability. This study demonstrates that these desirable properties of ACI do not require the use of Conformal Predictors (CP). We show that the guarantees hold for the broader class of confidence predictors, defined by the requirement of producing nested prediction sets, a property we argue is essential for meaningful confidence statements. We empirically investigate the performance of Non-Conformal Confidence Predictors (NCCP) against CP when used with ACI on non-exchangeable data. In online settings, NCCP offers significant computational advantages while maintaining comparable predictive efficiency. In batch settings, inductive NCCP (INCCP) can outperform inductive CP (ICP) by utilising the full training dataset without requiring a separate calibration set, leading to improved efficiency, particularly when data are limited. Although these initial results highlight NCCP as a theoretically sound and practically effective alternative to CP for uncertainty quantification with ACI in non-exchangeable scenarios, further empirical studies are warranted across diverse datasets and predictors.
Citations: 0
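To illustrate the Adaptive Conformal Inference mechanism that the paper pairs with general confidence predictors, a minimal sketch of the standard online ACI update of the working miscoverage level, a_{t+1} = a_t + gamma * (alpha - err_t); the placeholder confidence predictor here (a symmetric interval whose width shrinks as the allowed miscoverage grows, so the sets are nested) is purely an assumption for the example.

```python
def aci_online(y_stream, predict_interval, alpha=0.1, gamma=0.01):
    """Adaptive Conformal Inference: adjust the working miscoverage level
    a_t so that long-run coverage tracks 1 - alpha, for any underlying
    confidence predictor that produces nested prediction sets.

    predict_interval(t, a) must return a prediction interval for step t at
    miscoverage level a, with intervals nested in a (smaller a -> wider set).
    """
    a_t = alpha
    errors = []
    for t, y in enumerate(y_stream):
        lower, upper = predict_interval(t, a_t)
        err = 0.0 if lower <= y <= upper else 1.0
        errors.append(err)
        a_t = a_t + gamma * (alpha - err)   # the ACI update rule
    coverage = 1.0 - sum(errors) / len(errors)
    return coverage

if __name__ == "__main__":
    import random
    random.seed(0)
    data = [random.gauss(0.0, 1.0) for _ in range(2000)]

    # Placeholder (non-conformal) confidence predictor: interval width grows
    # as the allowed miscoverage a shrinks, which keeps the sets nested.
    def interval(t, a):
        half_width = 2.5 * max(1.0 - a, 0.05)
        return -half_width, half_width

    print(f"empirical coverage: {aci_online(data, interval):.3f}")
```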
Graph pre-trained framework with spatio-temporal importance masking and fine-grained optimizing for neural decoding
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.112006
Ziyu Li, Zhiyuan Zhu, Qing Li, Xia Wu
Abstract: Neural decoding has long been a cutting-edge neuroscience issue, and significant progress has been made with the support of deep learning technology. However, these breakthroughs rely on large-scale, fully annotated functional magnetic resonance imaging (fMRI) data, which greatly hinders their further applicability. Recently, foundation models have garnered considerable attention in natural language processing, computer vision, and multimodal data processing due to their ability to circumvent the need for extensive annotated datasets while achieving notable accuracy gains. Nevertheless, the formulation of effective foundation-model approaches tailored to connectivity-based, complex spatio-temporal brain networks remains an unresolved challenge. To address these issues, in this paper we propose a general Temporal-Aware Graph Self-supervised Contrastive learning framework (TAGSC) for fMRI-based neural decoding. Concretely, it includes three innovative improvements to enhance fMRI-based graph foundation models: (i) a spatio-temporal augmentation strategy considers spatial brain-region synergy and temporal information continuity to generate spatio-temporal contrastive views of the brain; (ii) a temporal-aware feature extractor learns brain spatio-temporal representations, fully taking into account the continuous consistency of brain state transitions and fetching brain spatio-temporal interaction information from local to global; (iii) a fine-grained consistency loss assists contrastive optimization from both temporal and spatial perspectives. Extensive evaluation on publicly available fMRI datasets demonstrated the superior performance of the proposed TAGSC and revealed biomarkers related to different brain states. To the best of our knowledge, this is among the earliest attempts to employ a spatio-temporal pre-trained model for neural decoding.
Citations: 0
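As an illustration of the contrastive optimization underlying frameworks like TAGSC, a minimal PyTorch sketch of an NT-Xent (normalized temperature-scaled cross-entropy) loss between two augmented views of the same batch; this is the generic SimCLR-style loss used as a stand-in, not the paper's exact fine-grained consistency loss.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """NT-Xent loss over two views: z1[i] and z2[i] are embeddings of the
    same sample under two different (e.g. spatio-temporal) augmentations."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2n, d)
    sim = z @ z.t() / temperature                              # cosine logits
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                 # drop self-similarity
    # The positive of sample i is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

if __name__ == "__main__":
    z_a, z_b = torch.randn(16, 128), torch.randn(16, 128)
    print(nt_xent_loss(z_a, z_b).item())
```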
GasSeg: A lightweight real-time infrared gas segmentation network for edge devices
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.111931
Huan Yu, Jin Wang, Jingru Yang, Kaixiang Huang, Yang Zhou, Fengtao Deng, Guodong Lu, Shengfeng He
Abstract: Infrared gas segmentation (IGS) focuses on identifying gas regions within infrared images, playing a crucial role in gas leakage prevention, detection, and response. However, deploying IGS on edge devices introduces strict efficiency requirements, and the intricate shapes and weak visual features of gases pose significant challenges for accurate segmentation. To address these challenges, we propose GasSeg, a dual-branch network that leverages boundary and contextual cues to achieve real-time and precise IGS. First, a Boundary-Aware Stem is introduced to enhance boundary sensitivity in shallow layers by leveraging fixed gradient operators, facilitating efficient feature extraction for gases with diverse shapes. Subsequently, a dual-branch architecture comprising a context branch and a boundary guidance branch is employed, where boundary features refine contextual representations to alleviate errors caused by blurred contours. Finally, a Contextual Attention Pyramid Pooling Module captures key information through context-aware multi-scale feature aggregation, ensuring robust gas recognition under subtle visual conditions. To advance IGS research and applications, we introduce a high-quality real-world IGS dataset comprising 6,426 images. Experimental results demonstrate that GasSeg outperforms state-of-the-art models in both accuracy and efficiency, achieving 90.68% mIoU and 95.02% mF1, with real-time inference speeds of 215 FPS on a GPU platform and 62 FPS on an edge platform. The dataset and code are publicly available at: https://github.com/FisherYuuri/GasSeg.
Citations: 0
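To illustrate the idea of a boundary-aware stem built on fixed gradient operators, a minimal PyTorch sketch that applies non-trainable Sobel filters and appends a gradient-magnitude map to the input; the choice of Sobel kernels and the concatenation-based fusion are assumptions for the example, not GasSeg's exact stem.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SobelBoundaryStem(nn.Module):
    """Fixed (non-trainable) Sobel gradients, concatenated to the input so
    that shallow layers see an explicit boundary cue."""
    def __init__(self, in_channels: int = 1):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        gy = gx.t()                                             # vertical Sobel
        kernel = torch.stack([gx, gy]).unsqueeze(1)             # (2, 1, 3, 3)
        # One gradient pair per input channel via a grouped convolution.
        self.register_buffer("kernel", kernel.repeat(in_channels, 1, 1, 1))
        self.in_channels = in_channels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        grads = F.conv2d(x, self.kernel, padding=1, groups=self.in_channels)
        gx, gy = grads[:, 0::2], grads[:, 1::2]
        magnitude = torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)
        return torch.cat([x, magnitude], dim=1)                 # boundary cue appended

if __name__ == "__main__":
    stem = SobelBoundaryStem(in_channels=1)
    out = stem(torch.randn(2, 1, 64, 64))
    print(out.shape)  # torch.Size([2, 2, 64, 64])
```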
An end-to-end shadow removal framework with an intuitive interaction scheme
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-23 DOI: 10.1016/j.patcog.2025.112001
Ding Yuan, Yuqian Meng, Hanyang Liu, Yachun Feng, Hong Zhang, Yifan Yang
Abstract: Shadow removal plays a crucial role in enhancing image quality by restoring the color and texture details of shadow regions, thereby improving the performance of downstream visual tasks. Although recent shadow removal algorithms have achieved impressive results on benchmark datasets, shadows in such datasets are typically centralized and captured in relatively straightforward scenes. In contrast, real-world shadows tend to exhibit complex and irregular patterns due to the random distribution of objects, causing global processing methods to produce false positives and missed corrections. To address these challenges, this paper presents an end-to-end shadow removal framework leveraging Human-Computer Interaction (HCI), allowing simple bounding boxes to annotate targeted shadows. Our approach employs a novel chunked-processing training strategy, which decomposes global shadow removal into iterative local refinements. Additionally, a Split-Channel module and an Edge-Weighted loss are incorporated to maintain consistent color and smooth edge transitions during restoration. Furthermore, an HSI-based shadow detection algorithm is proposed to generate shadow masks, facilitating end-to-end shadow removal. Experimental results demonstrate that our approach outperforms state-of-the-art methods on the ISTD and SRD datasets and exhibits robust performance on real-world images, effectively reducing restoration errors.
Citations: 0
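As a rough illustration of what an HSI-style shadow mask can look like, a numpy/OpenCV sketch that converts an image to HSV (used here as a stand-in for HSI) and flags pixels with low value and relatively high saturation as shadow candidates; the quantile thresholds, the HSV substitution, and the morphological clean-up are all assumptions for the example, not the paper's detection algorithm.

```python
import cv2
import numpy as np

def naive_shadow_mask(bgr: np.ndarray, value_quantile: float = 0.3,
                      saturation_quantile: float = 0.5) -> np.ndarray:
    """Flag pixels whose intensity is low and saturation is relatively high
    as shadow candidates, then clean the binary mask morphologically."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    _, s, v = cv2.split(hsv)
    v_thr = np.quantile(v, value_quantile)
    s_thr = np.quantile(s, saturation_quantile)
    mask = ((v <= v_thr) & (s >= s_thr)).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    return cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

if __name__ == "__main__":
    img = cv2.imread("scene.jpg")            # hypothetical input image
    if img is not None:
        cv2.imwrite("shadow_mask.png", naive_shadow_mask(img))
```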
SAR image change detection via generalized extreme value (GEV) modeling
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-22 DOI: 10.1016/j.patcog.2025.112040
Fan Zhang, Sijin Zheng, Fei Ma, Qiang Yin, Yongsheng Zhou
Abstract: The rapid growth of high-resolution synthetic aperture radar (SAR) images has created new challenges for change detection methods. High-resolution SAR images often exhibit extremely heterogeneous terrain, resulting in severe long-tailed distributions in the image histogram. Traditional change detection methods based on hypothesis-testing theory rely on Gamma distributions, which struggle to accurately model the complex scenes in high-resolution images. Recently, the Generalized Extreme Value (GEV) distribution has proven effective in describing this long-tail phenomenon. In this paper, we introduce the GEV model into hypothesis-testing theory and propose a GEV-based SAR change detection method. First, because two or more heterogeneous components may exist in a given scene in high-resolution SAR images, we oversegment the image into homogeneous local regions using a superpixel algorithm and model each region with the GEV distribution. Based on this distribution, we then derive GEV-based likelihood-ratio test (LRT) statistics to measure the similarity of two superpixels for unsupervised change detection. Finally, by analyzing the asymptotic behavior of the GEV-based LRT, we apply a threshold to obtain the change maps (CMs). To evaluate the performance of our approach, we conduct Monte Carlo experiments using empirical data to investigate the goodness-of-fit performance and asymptotic behavior of the LRT. Our method demonstrates superior performance compared to state-of-the-art approaches, achieving the highest overall accuracy in both studied areas.
Citations: 0
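To illustrate the kind of GEV-based likelihood-ratio comparison the abstract describes, a SciPy sketch that fits generalized extreme value models to two superpixels' intensity samples, separately and pooled, and forms a generic LRT statistic; this is a textbook likelihood-ratio test, not the specific statistic, asymptotics, or threshold derived in the paper.

```python
import numpy as np
from scipy.stats import genextreme

def gev_loglik(sample: np.ndarray) -> float:
    """Maximum log-likelihood of a GEV fit to the sample."""
    c, loc, scale = genextreme.fit(sample)
    return float(np.sum(genextreme.logpdf(sample, c, loc=loc, scale=scale)))

def gev_lrt(region_t1: np.ndarray, region_t2: np.ndarray) -> float:
    """Generic likelihood-ratio statistic: large values suggest the two
    superpixels are unlikely to share a single GEV model (i.e. a change)."""
    ll_separate = gev_loglik(region_t1) + gev_loglik(region_t2)
    ll_pooled = gev_loglik(np.concatenate([region_t1, region_t2]))
    return 2.0 * (ll_separate - ll_pooled)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Gumbel samples are GEV with zero shape; shift/scale the second pair to
    # mimic a changed region.
    unchanged = gev_lrt(rng.gumbel(1.0, 0.5, 400), rng.gumbel(1.0, 0.5, 400))
    changed = gev_lrt(rng.gumbel(1.0, 0.5, 400), rng.gumbel(3.0, 1.0, 400))
    print(f"LRT unchanged pair: {unchanged:.2f}, changed pair: {changed:.2f}")
```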
Online multi-label streaming feature selection by affinity significance, affinity relevance and affinity redundancy
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-21 DOI: 10.1016/j.patcog.2025.111990
Jianhua Dai, Duo Xu, Chucai Zhang
Abstract: Multi-label streaming feature selection has been applied in various fields to handle applications where features arrive dynamically. However, most existing multi-label streaming feature selection methods ignore the fact that a feature tends to provide more classification information for some labels rather than equal information for all labels. As a result, some labels receive more information from the selected features, while other labels lack information. To address this issue, we propose a novel multi-label streaming feature selection method. First, we introduce the concept of affinity between features and labels. Second, we propose the concepts of affinity significance, affinity relevance and affinity redundancy to evaluate streaming features along three dimensions. Third, we propose a novel multi-label streaming feature selection method named OMFS-FA. OMFS-FA has three phases that retain affinity-significant features, remove affinity-irrelevant features and remove affinity-redundant features, respectively. Finally, experiments on performance, statistical analysis, the number of selected features and running time are conducted, verifying that OMFS-FA significantly outperforms eleven other methods in terms of effectiveness and efficiency.
Citations: 0
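A schematic Python sketch of the three-phase online filtering idea (keep relevant features, drop irrelevant ones, drop redundant ones) using mutual information from scikit-learn as a stand-in "affinity" score between a feature and each label; the thresholds, the MI score, and the redundancy test are illustrative assumptions, not the OMFS-FA criteria.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def affinity(feature: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Stand-in affinity: MI between one feature and each label column."""
    return np.array([
        mutual_info_classif(feature.reshape(-1, 1), labels[:, j],
                            random_state=0)[0]
        for j in range(labels.shape[1])
    ])

def streaming_select(feature_stream, labels, rel_thr=0.01, red_thr=0.9):
    """Online selection: a newly arrived feature is kept if it is relevant to
    at least one label and not too correlated with a selected feature."""
    selected = []                                      # (feature, affinities)
    for f in feature_stream:
        aff = affinity(f, labels)
        if aff.max() < rel_thr:
            continue                                   # phase: irrelevant, drop
        redundant = any(abs(np.corrcoef(f, g)[0, 1]) > red_thr
                        for g, _ in selected)
        if redundant:
            continue                                   # phase: redundant, drop
        selected.append((f, aff))                      # phase: keep the feature
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Y = rng.integers(0, 2, size=(200, 3))              # three binary labels
    stream = (Y[:, t % 3] + rng.normal(0, 1.0, 200) for t in range(10))
    kept = streaming_select(stream, Y)
    print(f"selected {len(kept)} of 10 streaming features")
```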
MemoryFusion: A novel architecture for infrared and visible image fusion based on memory unit
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-21 DOI: 10.1016/j.patcog.2025.112004
Jiachen He, Xiaoqing Luo, Zhancheng Zhang, Xiao-jun Wu
Abstract: Existing image fusion methods utilize elaborate encoders to sequentially extract shallow and deep features from the source images. However, most methods lack long-term dependence, i.e., shallow details are inevitably lost when the network encodes deep features. To this end, some methods employ skip connections or dense connections to pass shallow features directly into deeper layers, potentially introducing redundant information and increasing the computational load. To overcome these drawbacks and enhance generalization to low-quality scenarios, a novel fusion architecture based on the Gated Recurrent Unit (GRU), termed MemoryFusion, is proposed. First, the Input Extension Encoder (IEE) transforms the source image into a feature sequence. Then, a Recurrent Fusion Encoder (RFE) containing a Recurrent Memory Fusion Unit (RMFU) is designed to learn the intrinsic correlation between the multi-modality feature sequences and generate the fusion feature sequence. This memory fusion unit uses a special gating mechanism to incorporate historical information and the current input, adaptively selecting valuable content and forgetting redundant information. More importantly, it effectively relieves the computational pressure. Finally, since modality information is distributed across different sequence depths and varying illumination intensities, the Multi-hierarchical Aggregation Module (MHAM) is designed to obtain the corresponding weight sequence. The aggregated fusion feature is obtained by integrating the fusion feature sequence with the weight sequence. Extensive experiments demonstrate that MemoryFusion is superior to state-of-the-art fusion methods on multiple datasets. Even on low-quality images, such as low-light or foggy conditions, our method demonstrates exceptional fusion performance and scene fidelity.
Citations: 0
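To illustrate the gated recurrent fusion idea, a minimal PyTorch sketch that walks a GRU cell over paired infrared/visible feature vectors so that the update and reset gates decide what to carry forward and what to forget; the pairing, feature dimensions, and linear readout are assumptions for the example, not the RMFU design.

```python
import torch
import torch.nn as nn

class GatedRecurrentFusion(nn.Module):
    """Fuse two modality feature sequences with a GRU cell: at each depth the
    cell sees the concatenated infrared/visible features, and its hidden state
    accumulates (or forgets) information via the GRU gates."""
    def __init__(self, feat_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.cell = nn.GRUCell(2 * feat_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, feat_dim)

    def forward(self, ir_seq: torch.Tensor, vis_seq: torch.Tensor) -> torch.Tensor:
        # ir_seq, vis_seq: (T, B, feat_dim) feature sequences of equal length.
        h = torch.zeros(ir_seq.size(1), self.cell.hidden_size,
                        device=ir_seq.device)
        for ir_t, vis_t in zip(ir_seq, vis_seq):
            h = self.cell(torch.cat([ir_t, vis_t], dim=-1), h)
        return self.readout(h)              # fused feature per sample

if __name__ == "__main__":
    fusion = GatedRecurrentFusion(feat_dim=128)
    ir = torch.randn(5, 4, 128)             # 5 sequence steps, batch of 4
    vis = torch.randn(5, 4, 128)
    print(fusion(ir, vis).shape)            # torch.Size([4, 128])
```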
Advancing federated domain generalization in ophthalmology: Vision enhancement and consistency assurance for multicenter fundus image segmentation
IF 7.5 | CAS Tier 1 | Computer Science
Pattern Recognition Pub Date: 2025-06-21 DOI: 10.1016/j.patcog.2025.111993
Yuxin Ye, Nian Liu, Yang Zhao, Xianxun Zhu, Jun Wang, Yan Liu
Abstract: Federated learning has transformed privacy-preserving medical image analysis, but the diversity of imaging equipment and conditions poses significant challenges in creating models that generalize effectively across domains. Current federated domain generalization (FedDG) methods often require partial information sharing, which may compromise privacy standards. To address this, we introduce the Federated Domain-Generalization Vision Enhancement and Consistency Assurance (FedDG-VECA) approach. This method enhances the generalization ability of federated learning by independently strengthening local nodes, integrating a Federated Vision Feature Extractor (FVFE) for global data capture and local fine-tuning, a Federated Vision Augmentation Strategy (FVAS) to simulate diverse image distributions, and a Federated Bootstrapped Consistency Assurance (FBCA) mechanism using a dual MLP network for stable, consistent model performance across varied data sources. Initial experiments confirm that FedDG-VECA significantly improves model generalization without compromising privacy, ensuring robust and consistent diagnostic capabilities across multiple institutions.
Citations: 0
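As background for the federated setup the abstract builds on, a minimal PyTorch sketch of FedAvg-style server aggregation that averages client state dicts weighted by local sample counts; this illustrates only the generic federated round, not the VECA-specific enhancement, augmentation, or consistency components, and the toy model is a placeholder.

```python
import copy
import torch
import torch.nn as nn

def fedavg_aggregate(client_states, client_sizes):
    """Weighted average of client model state dicts (FedAvg)."""
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state

if __name__ == "__main__":
    def make_model():
        # Placeholder segmentation-style model for the example.
        return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(8, 1, 1))

    clients = [make_model() for _ in range(3)]
    new_state = fedavg_aggregate([m.state_dict() for m in clients],
                                 [120, 80, 200])
    server = make_model()
    server.load_state_dict(new_state)       # broadcast the aggregated weights
    print("aggregated", len(new_state), "parameter tensors")
```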