{"title":"UN-SAM: Domain-adaptive self-prompt segmentation for universal nuclei images","authors":"Zhen Chen , Qing Xu , Xinyu Liu , Yixuan Yuan","doi":"10.1016/j.media.2025.103607","DOIUrl":"10.1016/j.media.2025.103607","url":null,"abstract":"<div><div>In digital pathology, precise nuclei segmentation is pivotal yet challenged by the diversity of tissue types, staining protocols, and imaging conditions. Recently, the segment anything model (SAM) revealed overwhelming performance in natural scenarios and impressive adaptation to medical imaging. Despite these advantages, the reliance on labor-intensive manual annotation as segmentation prompts severely hinders their clinical applicability, especially for nuclei image analysis containing massive cells where dense manual prompts are impractical. To overcome the limitations of current SAM methods while retaining the advantages, we propose the domain-adaptive self-prompt SAM framework for Universal Nuclei segmentation (UN-SAM), by providing a fully automated solution with superior performance across different domains. Specifically, to eliminate the labor-intensive requirement of per-nuclei annotations for prompt, we devise a multi-scale Self-Prompt Generation (SPGen) module to revolutionize clinical workflow by automatically generating high-quality mask hints to guide the segmentation tasks. Moreover, to unleash the capability of SAM across a variety of nuclei images, we devise a Domain-adaptive Tuning Encoder (DT-Encoder) to seamlessly harmonize visual features with domain-common and domain-specific knowledge, and further devise a Domain Query-enhanced Decoder (DQ-Decoder) by leveraging learnable domain queries for segmentation decoding in different nuclei domains. Extensive experiments prove that our UN-SAM surpasses state-of-the-arts in nuclei instance and semantic segmentation, especially the generalization capability on unseen nuclei domains. The source code is available at <span><span>https://github.com/CUHK-AIM-Group/UN-SAM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103607"},"PeriodicalIF":10.7,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143927480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel spatial-temporal image fusion method for augmented reality-based endoscopic surgery","authors":"Haochen Shi , Jiangchang Xu , Haitao Li , Shuanglin Jiang , Chaoyu Lei , Huifang Zhou , Yinwei Li , Xiaojun Chen","doi":"10.1016/j.media.2025.103609","DOIUrl":"10.1016/j.media.2025.103609","url":null,"abstract":"<div><div>Augmented reality (AR) has significant potential to enhance the identification of critical locations during endoscopic surgeries, where accurate endoscope calibration is essential for ensuring the quality of augmented images. In optical-based surgical navigation systems, asynchrony between the optical tracker and the endoscope can cause the augmented scene to diverge from reality during rapid movements, potentially misleading the surgeon—a challenge that remains unresolved. In this paper, we propose a novel spatial–temporal endoscope calibration method that simultaneously determines the spatial transformation from the image to the optical marker and the temporal latency between the tracking and image acquisition systems. To estimate temporal latency, we utilize a Monte Carlo method to estimate the intrinsic parameters of the endoscope’s imaging system, leveraging a dataset of thousands of calibration samples. This dataset is larger than those typically employed in conventional camera calibration routines, rendering traditional algorithms computationally infeasible within a reasonable timeframe. By introducing latency as an independent variable into the principal equation of hand-eye calibration, we developed a weighted algorithm to iteratively solve the equation. This approach eliminates the need for a fixture to stabilize the endoscope during calibration, allowing for quicker calibration through handheld flexible movement. Experimental results demonstrate that our method achieves an average 2D error of <span><math><mrow><mn>7</mn><mo>±</mo><mn>3</mn></mrow></math></span> pixels and a pseudo-3D error of <span><math><mrow><mn>1</mn><mo>.</mo><mn>2</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>4</mn><mspace></mspace><mi>mm</mi></mrow></math></span> for stable scenes within <span><math><mrow><mn>82</mn><mo>.</mo><mn>4</mn><mo>±</mo><mn>16</mn><mo>.</mo><mn>6</mn></mrow></math></span> seconds—approximately 68% faster in operation time than conventional methods. In dynamic scenes, our method compensates for the virtual-to-reality latency of <span><math><mrow><mn>11</mn><mo>±</mo><mn>2</mn><mspace></mspace><mi>ms</mi></mrow></math></span>, which is shorter than a single frame interval and 5.7 times shorter than the uncompensated conventional method. Finally, we successfully integrated the proposed method into our surgical navigation system and validated its feasibility in clinical trials for transnasal optic canal decompression surgery. Our method has the potential to improve the safety and efficacy of endoscopic surgeries, leading to better patient outcomes.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103609"},"PeriodicalIF":10.7,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143911737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MambaMIM: Pre-training Mamba with state space token interpolation and its application to medical image segmentation","authors":"Fenghe Tang , Bingkun Nian , Yingtai Li , Zihang Jiang , Jie Yang , Wei Liu , S. Kevin Zhou","doi":"10.1016/j.media.2025.103606","DOIUrl":"10.1016/j.media.2025.103606","url":null,"abstract":"<div><div>Recently, the state space model Mamba has demonstrated efficient long-sequence modeling capabilities, particularly for addressing long-sequence visual tasks in 3D medical imaging. However, existing generative self-supervised learning methods have not yet fully unleashed Mamba’s potential for handling long-range dependencies because they overlook the inherent causal properties of state space sequences in masked modeling. To address this challenge, we propose a general-purpose pre-training framework called MambaMIM, a masked image modeling method based on a novel <strong>TOKen-Interpolation</strong> strategy (TOKI) for the selective structure state space sequence, which learns causal relationships of state space within the masked sequence. Further, MambaMIM introduces a bottom-up 3D hybrid masking strategy to maintain a <strong>masking consistency</strong> across different architectures and can be used on any single or hybrid Mamba architecture to enhance its multi-scale and long-range representation capability. We pre-train MambaMIM on a large-scale dataset of 6.8K CT scans and evaluate its performance across eight public medical segmentation benchmarks. Extensive downstream experiments reveal the feasibility and advancement of using Mamba for medical image pre-training. In particular, when we apply the MambaMIM to a customized architecture that hybridizes MedNeXt and Vision Mamba, we consistently obtain the state-of-the-art segmentation performance. The code is available at: <span><span>https://github.com/FengheTan9/MambaMIM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103606"},"PeriodicalIF":10.7,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143927478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structural uncertainty estimation for medical image segmentation","authors":"Bing Yang , Xiaoqing Zhang , Huihong Zhang , Sanqian Li , Risa Higashita , Jiang Liu","doi":"10.1016/j.media.2025.103602","DOIUrl":"10.1016/j.media.2025.103602","url":null,"abstract":"<div><div>Precise segmentation and uncertainty estimation are crucial for error identification and correction in medical diagnostic assistance. Existing methods mainly rely on pixel-wise uncertainty estimations. They (1) neglect the global context, leading to erroneous uncertainty indications, and (2) bring attention interference, resulting in the waste of extensive details and potential understanding confusion. In this paper, we propose a novel structural uncertainty estimation method, based on Convolutional Neural Networks (CNN) and Active Shape Models (ASM), named SU-ASM, which incorporates global shape information for providing precise segmentation and uncertainty estimation. The SU-ASM consists of three components. Firstly, multi-task generation provides multiple outcomes to assist ASM initialization and shape optimization via a multi-task learning module. Secondly, information fusion involves the creation of a Combined Boundary Probability (CBP) and along with a rapid shape initialization algorithm, Key Landmark Template Matching (KLTM), to enhance boundary reliability and select appropriate shape templates. Finally, shape model fitting where multiple shape templates are matched to the CBP while maintaining their intrinsic shape characteristics. Fitted shapes generate segmentation results and structural uncertainty estimations. The SU-ASM has been validated on cardiac ultrasound dataset, ciliary muscle dataset of the anterior eye segment, and the chest X-ray dataset. It outperforms state-of-the-art methods in terms of segmentation and uncertainty estimation.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103602"},"PeriodicalIF":10.7,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143887922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MED-NCA: Bio-inspired medical image segmentation","authors":"John Kalkhof, Niklas Ihm, Tim Köhler, Bjarne Gregori, Anirban Mukhopadhyay","doi":"10.1016/j.media.2025.103601","DOIUrl":"10.1016/j.media.2025.103601","url":null,"abstract":"<div><div>The reliance on computationally intensive U-Net and Transformer architectures significantly limits their accessibility in low-resource environments, creating a technological divide that hinders global healthcare equity, especially in medical diagnostics and treatment planning. This divide is most pronounced in low- and middle-income countries, primary care facilities, and conflict zones. We introduced MED-NCA, Neural Cellular Automata (NCA) based segmentation models characterized by their low parameter count, robust performance, and inherent quality control mechanisms. These features drastically lower the barriers to high-quality medical image analysis in resource-constrained settings, allowing the models to run efficiently on hardware as minimal as a Raspberry Pi or a smartphone. Building upon the foundation laid by MED-NCA, this paper extends its validation across eight distinct anatomies, including the hippocampus and prostate (MRI, 3D), liver and spleen (CT, 3D), heart and lung (X-ray, 2D), breast tumor (Ultrasound, 2D), and skin lesion (Image, 2D). Our comprehensive evaluation demonstrates the broad applicability and effectiveness of MED-NCA in various medical imaging contexts, matching the performance of two magnitudes larger UNet models. Additionally, we introduce NCA-VIS, a visualization tool that gives insight into the inference process of MED-NCA and allows users to test its robustness by applying various artifacts. This combination of efficiency, broad applicability, and enhanced interpretability makes MED-NCA a transformative solution for medical image analysis, fostering greater global healthcare equity by making advanced diagnostics accessible in even the most resource-limited environments.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103601"},"PeriodicalIF":10.7,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Medical image translation with deep learning: Advances, datasets and perspectives","authors":"Junxin Chen , Zhiheng Ye , Renlong Zhang , Hao Li , Bo Fang , Li-bo Zhang , Wei Wang","doi":"10.1016/j.media.2025.103605","DOIUrl":"10.1016/j.media.2025.103605","url":null,"abstract":"<div><div>Traditional medical image generation often lacks patient-specific clinical information, limiting its clinical utility despite enhancing downstream task performance. In contrast, medical image translation precisely converts images from one modality to another, preserving both anatomical structures and cross-modal features, thus enabling efficient and accurate modality transfer and offering unique advantages for model development and clinical practice. This paper reviews the latest advancements in deep learning(DL)-based medical image translation. Initially, it elaborates on the diverse tasks and practical applications of medical image translation. Subsequently, it provides an overview of fundamental models, including convolutional neural networks (CNNs), transformers, and state space models (SSMs). Additionally, it delves into generative models such as Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), Autoregressive Models (ARs), diffusion Models, and flow Models. Evaluation metrics for assessing translation quality are discussed, emphasizing their importance. Commonly used datasets in this field are also analyzed, highlighting their unique characteristics and applications. Looking ahead, the paper identifies future trends, challenges, and proposes research directions and solutions in medical image translation. It aims to serve as a valuable reference and inspiration for researchers, driving continued progress and innovation in this area.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103605"},"PeriodicalIF":10.7,"publicationDate":"2025-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143890509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"General retinal image enhancement via reconstruction: Bridging distribution shifts using latent diffusion adaptors","authors":"Bingyu Yang, Haonan Han, Weihang Zhang, Huiqi Li","doi":"10.1016/j.media.2025.103603","DOIUrl":"10.1016/j.media.2025.103603","url":null,"abstract":"<div><div>Deep learning-based fundus image enhancement has attracted extensive research attention recently, which has shown remarkable effectiveness in improving the visibility of low-quality images. However, these methods are often constrained to specific datasets and degradations, leading to poor generalization capabilities and having challenges in the fine-tuning process. Therefore, a general method for fundus image enhancement is proposed for improved generalizability and flexibility, which decomposes the enhancement task into reconstruction and adaptation phases. In the reconstruction phase, self-supervised training with unpaired data is employed, allowing the utilization of extensive public datasets to improve the generalizability of the model. During the adaptation phase, the model is fine-tuned according to the target datasets and their degradations, utilizing the pre-trained weights from the reconstruction. The proposed method improves the feasibility of latent diffusion models for retinal image enhancement. Adaptation loss and enhancement adaptor are proposed in autoencoders and diffusion networks for fewer paired training data, fewer trainable parameters, and faster convergence compared with training from scratch. The proposed method can be easily fine-tuned and experiments demonstrate the adaptability for different datasets and degradations. Additionally, the reconstruction-adaptation framework can be utilized in different backbones and other modalities, which shows its generality.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103603"},"PeriodicalIF":10.7,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143878865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lung structure and function information-guided residual diffusion model for predicting idiopathic pulmonary fibrosis progression","authors":"Caiwen Jiang , Xiaodan Xing , Yang Nan , Yingying Fang , Sheng Zhang , Simon Walsh , Guang Yang , Dinggang Shen","doi":"10.1016/j.media.2025.103604","DOIUrl":"10.1016/j.media.2025.103604","url":null,"abstract":"<div><div>Idiopathic Pulmonary Fibrosis (IPF) is a progressive lung disease that continuously scars and thickens lung tissue, leading to respiratory difficulties. Timely assessment of IPF progression is essential for developing treatment plans and improving patient survival rates. However, current clinical standards require multiple (usually two) CT scans at certain intervals to assess disease progression. This presents a dilemma: <em>the disease progression is identified only after the disease has already progressed</em>. To address this issue, a feasible solution is to generate the follow-up CT image from the patient’s initial CT image to achieve early prediction of IPF. To this end, we propose a lung structure and function information-guided residual diffusion model. The key components of our model include (1) using a 2.5D generation strategy to reduce computational cost of generating 3D images with the diffusion model; (2) designing structural attention to mitigate negative impact of spatial misalignment between the two CT images on generation performance; (3) employing residual diffusion to accelerate model training and inference while focusing more on differences between the two CT images (i.e., the lesion areas); and (4) developing a CLIP-based text extraction module to extract lung function test information and further using such extracted information to guide the generation. Extensive experiments demonstrate that our method can effectively predict IPF progression and achieve superior generation performance compared to state-of-the-art methods.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103604"},"PeriodicalIF":10.7,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143890511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"“Recon-all-clinical”: Cortical surface reconstruction and analysis of heterogeneous clinical brain MRI","authors":"Karthik Gopinath , Douglas N. Greve , Colin Magdamo , Steve Arnold , Sudeshna Das , Oula Puonti , Juan Eugenio Iglesias , Alzheimer’s Disease Neuroimaging Initiative","doi":"10.1016/j.media.2025.103608","DOIUrl":"10.1016/j.media.2025.103608","url":null,"abstract":"<div><div>Surface-based analysis of the cerebral cortex is ubiquitous in human neuroimaging with MRI. It is crucial for tasks like cortical registration, parcellation, and thickness estimation. Traditionally, such analyses require high-resolution, isotropic scans with good gray–white matter contrast, typically a T1-weighted scan with 1 mm resolution. This requirement precludes application of these techniques to most MRI scans acquired for clinical purposes, since they are often anisotropic and lack the required T1-weighted contrast. To overcome this limitation and enable large-scale neuroimaging studies using vast amounts of existing clinical data, we introduce <em>recon-all-clinical</em>, a novel methodology for cortical reconstruction, registration, parcellation, and thickness estimation for clinical brain MRI scans of any resolution and contrast. Our approach employs a hybrid analysis method that combines a convolutional neural network (CNN) trained with domain randomization to predict signed distance functions (SDFs), and classical geometry processing for accurate surface placement while maintaining topological and geometric constraints. The method does not require retraining for different acquisitions, thus simplifying the analysis of heterogeneous clinical datasets. We evaluated <em>recon-all-clinical</em> on multiple public datasets like ADNI, HCP, AIBL, OASIS and including a large clinical dataset of over 9,500 scans. The results indicate that our method produces geometrically precise cortical reconstructions across different MRI contrasts and resolutions, consistently achieving high accuracy in parcellation. Cortical thickness estimates are precise enough to capture aging effects, independently of MRI contrast, even though accuracy varies with slice thickness. Our method is publicly available at <span><span>https://surfer.nmr.mgh.harvard.edu/fswiki/recon-all-clinical</span><svg><path></path></svg></span>, enabling researchers to perform detailed cortical analysis on the huge amounts of already existing clinical MRI scans. This advancement may be particularly valuable for studying rare diseases and underrepresented populations where research-grade MRI data is scarce.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103608"},"PeriodicalIF":10.7,"publicationDate":"2025-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143881698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ProtoASNet: Comprehensive evaluation and enhanced performance with uncertainty estimation for aortic stenosis classification in echocardiography","authors":"Ang Nan Gu , Hooman Vaseli , Michael Y. Tsang , Victoria Wu , S. Neda Ahmadi Amiri , Nima Kondori , Andrea Fung , Teresa S.M. Tsang , Purang Abolmaesumi","doi":"10.1016/j.media.2025.103600","DOIUrl":"10.1016/j.media.2025.103600","url":null,"abstract":"<div><div>Aortic stenosis (AS) is a prevalent heart valve disease that requires accurate and timely diagnosis for effective treatment. Current methods for automated AS severity classification rely on black-box deep learning techniques, which suffer from a low level of trustworthiness and hinder clinical adoption. To tackle this challenge, we propose ProtoASNet, a prototype-based neural network designed to classify the severity of AS from B-mode echocardiography videos. ProtoASNet bases its predictions exclusively on the similarity scores between the input and a set of learned spatio-temporal prototypes, ensuring inherent interpretability. Users can directly visualize the similarity between the input and each prototype, as well as the weighted sum of similarities. This approach provides clinically relevant evidence for each prediction, as the prototypes typically highlight markers such as calcification and restricted movement of aortic valve leaflets. Moreover, ProtoASNet utilizes abstention loss to estimate aleatoric uncertainty by defining a set of prototypes that capture ambiguity and insufficient information in the observed data. This feature augments prototype-based models with the ability to explain when they may fail. We evaluate ProtoASNet on a private dataset and the publicly available TMED-2 dataset. It surpasses existing state-of-the-art methods, achieving a balanced accuracy of 80.0% on our private dataset and 79.7% on the TMED-2 dataset, respectively. By discarding cases flagged as uncertain, ProtoASNet achieves an improved balanced accuracy of 82.4% on our private dataset. Furthermore, by offering interpretability and an uncertainty measure for each prediction, ProtoASNet improves transparency and facilitates the interactive usage of deep networks in aiding clinical decision-making. Our source code is available at: <span><span>https://github.com/hooman007/ProtoASNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103600"},"PeriodicalIF":10.7,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143903454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}