Medical image analysis最新文献_第9页

Multi-Faceted Consistency learning with active cross-labeling for barely-supervised 3D medical image segmentation 基于主动交叉标记的多面一致性学习在无监督三维医学图像分割中的应用

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-29 DOI: 10.1016/j.media.2025.103744

Xinyao Wu , Zhe Xu , Raymond Kai-yu Tong

{"title":"Multi-Faceted Consistency learning with active cross-labeling for barely-supervised 3D medical image segmentation","authors":"Xinyao Wu , Zhe Xu , Raymond Kai-yu Tong","doi":"10.1016/j.media.2025.103744","DOIUrl":"10.1016/j.media.2025.103744","url":null,"abstract":"<div><div>Deep learning-driven 3D medical image segmentation generally necessitates dense voxel-wise annotations, which are expensive and labor-intensive to acquire. Cross-annotation, which labels only a few orthogonal slices per scan, has recently emerged as a cost-effective alternative that better preserves the shape and precise boundaries of the 3D object than traditional weak labeling methods such as bounding boxes and scribbles. However, learning from such sparse labels, referred to as barely-supervised learning (BSL), remains challenging due to less fine-grained object perception, less compact class features and inferior generalizability. To tackle these challenges and foster collaboration between model training and human expertise, we propose a Multi-Faceted ConSistency learning (MF-ConS) framework with a Diversity and Uncertainty Sampling-based Active Learning (DUS-AL) strategy, specifically designed for the active BSL scenario. This framework combines a cross-annotation BSL strategy, where only three orthogonal slices are labeled per scan, with an AL paradigm guided by DUS to direct human-in-the-loop annotation toward the most informative volumes under a fixed budget. Built upon a teacher–student architecture, MF-ConS integrates three complementary consistency regularization modules: (i) neighbor-informed object prediction consistency for advancing fine-grained object perception by encouraging the student model to infer complete segmentation from masked inputs; (ii) prototype-driven consistency, which enhances intra-class compactness and discriminativeness by aligning latent feature and decision spaces using fused prototypes; and (iii) stability constraint that promotes model robustness against input perturbations. Extensive experiments on three benchmark datasets demonstrate that MF-ConS (DUS-AL) consistently outperforms state-of-the-art methods under extremely limited annotation.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103744"},"PeriodicalIF":11.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects 医学图像分析中的持续学习：对最近进展和未来前景的全面回顾。

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-28 DOI: 10.1016/j.media.2025.103730

Pratibha Kumari , Joohi Chauhan , Afshin Bozorgpour , Boqiang Huang , Reza Azad , Dorit Merhof

{"title":"Continual learning in medical image analysis: A comprehensive review of recent advancements and future prospects","authors":"Pratibha Kumari , Joohi Chauhan , Afshin Bozorgpour , Boqiang Huang , Reza Azad , Dorit Merhof","doi":"10.1016/j.media.2025.103730","DOIUrl":"10.1016/j.media.2025.103730","url":null,"abstract":"<div><div>Medical image analysis has witnessed remarkable advancements, even surpassing human-level performance in recent years, driven by the rapid development of advanced deep-learning algorithms. However, when the inference dataset slightly differs from what the model has seen during one-time training, the model performance is greatly compromised. The situation requires restarting the training process using both the old and the new data, which is computationally costly, does not align with the human learning process, and imposes storage constraints and privacy concerns. Alternatively, continual learning has emerged as a crucial approach for developing unified and sustainable deep models to deal with new classes, tasks, and the drifting nature of data in non-stationary environments for various application areas. Continual learning techniques enable models to adapt and accumulate knowledge over time, which is essential for maintaining performance on evolving datasets and novel tasks. Owing to its popularity and promising performance, it is an active and emerging research topic in the medical field and hence demands a survey and taxonomy to clarify the current research landscape of continual learning in medical image analysis. This systematic review paper provides a comprehensive overview of the state-of-the-art in continual learning techniques applied to medical image analysis. We present an extensive survey of existing research, covering topics including catastrophic forgetting, data drifts, stability, and plasticity requirements. Further, an in-depth discussion of key components of a continual learning framework, such as continual learning scenarios, techniques, evaluation schemes, and metrics, is provided. Continual learning techniques encompass various categories, including rehearsal, regularization, architectural, and hybrid strategies. We assess the popularity and applicability of continual learning categories in various medical sub-fields like radiology and histopathology. Our exploration considers unique challenges in the medical domain, including costly data annotation, temporal drift, and the crucial need for benchmarking datasets to ensure consistent model evaluation. The paper also addresses current challenges and looks ahead to potential future research directions.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103730"},"PeriodicalIF":11.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144768767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Pixel-wise recognition for holistic surgical scene understanding 用于整体手术场景理解的逐像素识别。

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-28 DOI: 10.1016/j.media.2025.103726

Nicolás Ayobi , Santiago Rodríguez , Alejandra Pérez , Isabela Hernández , Nicolás Aparicio , Eugénie Dessevres , Sebastián Peña , Jessica Santander , Juan Ignacio Caicedo , Nicolás Fernández , Pablo Arbeláez

{"title":"Pixel-wise recognition for holistic surgical scene understanding","authors":"Nicolás Ayobi , Santiago Rodríguez , Alejandra Pérez , Isabela Hernández , Nicolás Aparicio , Eugénie Dessevres , Sebastián Peña , Jessica Santander , Juan Ignacio Caicedo , Nicolás Fernández , Pablo Arbeláez","doi":"10.1016/j.media.2025.103726","DOIUrl":"10.1016/j.media.2025.103726","url":null,"abstract":"<div><div>This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset, a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity. Our approach encompasses long-term tasks, such as surgical phase and step recognition, and short-term tasks, including surgical instrument segmentation and atomic visual actions detection. To exploit our proposed benchmark, we introduce the Transformers for Actions, Phases, Steps, and Instrument Segmentation (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals from an instrument segmentation model to tackle the multi-granularity of our benchmark. We demonstrate TAPIS’s versatility and state-of-the-art performance across different tasks through extensive experimentation on GraSP and alternative benchmarks. This work represents a foundational step forward in Endoscopic Vision, offering a novel framework for future research towards holistic surgical scene understanding.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103726"},"PeriodicalIF":11.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144799560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Synomaly noise and multi-stage diffusion: A novel approach for unsupervised anomaly detection in medical images 异常噪声和多阶段扩散：医学图像中无监督异常检测的新方法

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-26 DOI: 10.1016/j.media.2025.103737

Yuan Bi , Lucie Huang , Ricarda Clarenbach , Reza Ghotbi , Angelos Karlas , Nassir Navab , Zhongliang Jiang

{"title":"Synomaly noise and multi-stage diffusion: A novel approach for unsupervised anomaly detection in medical images","authors":"Yuan Bi , Lucie Huang , Ricarda Clarenbach , Reza Ghotbi , Angelos Karlas , Nassir Navab , Zhongliang Jiang","doi":"10.1016/j.media.2025.103737","DOIUrl":"10.1016/j.media.2025.103737","url":null,"abstract":"<div><div>Anomaly detection in medical imaging plays a crucial role in identifying pathological regions across various imaging modalities, such as brain MRI, liver CT, and carotid ultrasound (US). However, training fully supervised segmentation models is often hindered by the scarcity of expert annotations and the complexity of diverse anatomical structures. To address these issues, we propose a novel unsupervised anomaly detection framework based on a diffusion model that incorporates a synthetic anomaly (Synomaly) noise function and a multi-stage diffusion process. Synomaly noise introduces synthetic anomalies into healthy images during training, allowing the model to effectively learn anomaly removal. The multi-stage diffusion process is introduced to progressively denoise images, preserving fine details while improving the quality of anomaly-free reconstructions. The generated high-fidelity counterfactual healthy images can further enhance the interpretability of the segmentation models, as well as provide a reliable baseline for evaluating the extent of anomalies and supporting clinical decision-making. Notably, the unsupervised anomaly detection model is trained purely on healthy images, eliminating the need for anomalous training samples and pixel-level annotations. We validate the proposed approach on brain MRI, liver CT datasets, and carotid US. The experimental results demonstrate that the proposed framework outperforms existing state-of-the-art unsupervised anomaly detection methods, achieving performance comparable to fully supervised segmentation models in the US dataset. Ablation studies further highlight the contributions of Synomaly noise and the multi-stage diffusion process in improving anomaly segmentation. These findings underscore the potential of our approach as a robust and annotation-efficient alternative for medical anomaly detection. <strong>Code:</strong> <span><span>https://github.com/yuan-12138/Synomaly</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103737"},"PeriodicalIF":11.8,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Accelerating cardiac radial-MRI: Fully polar based technique using compressed sensing and deep learning 加速心脏放射mri：使用压缩感知和深度学习的全极性技术

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-26 DOI: 10.1016/j.media.2025.103732

Vahid Ghodrati , Jinming Duan , Fadil Ali , Arash Bedayat , Ashley Prosper , Mark Bydder

{"title":"Accelerating cardiac radial-MRI: Fully polar based technique using compressed sensing and deep learning","authors":"Vahid Ghodrati , Jinming Duan , Fadil Ali , Arash Bedayat , Ashley Prosper , Mark Bydder","doi":"10.1016/j.media.2025.103732","DOIUrl":"10.1016/j.media.2025.103732","url":null,"abstract":"<div><div>Fast radial-MRI approaches based on compressed sensing (CS) and deep learning (DL) often use non-uniform fast Fourier transform (NUFFT) as the forward imaging operator, which might introduce interpolation errors and reduce image quality. Using the polar Fourier transform (PFT), we developed fully polar CS and DL algorithms for fast 2D cardiac radial-MRI. Our methods directly reconstruct images in polar spatial space from polar k-space data, eliminating frequency interpolation and ensuring an easy-to-compute data consistency term for the DL framework via the variable splitting (VS) scheme. Furthermore, PFT reconstruction produces initial images with fewer artifacts in a reduced field of view, making it a better starting point for CS and DL algorithms, especially for dynamic imaging, where information from a small region of interest is critical, as opposed to NUFFT, which often results in global streaking artifacts. In the cardiac region, PFT-based CS technique outperformed NUFFT-based CS at acceleration rates of 5x (mean SSIM: 0.8831 vs. 0.8526), 10x (0.8195 vs. 0.7981), and 15x (0.7720 vs. 0.7503). Our PFT(VS)-DL technique outperformed the NUFFT(GD)-based DL method, which used unrolled gradient descent with the NUFFT as the forward imaging operator, with mean SSIM scores of 0.8914 versus 0.8617 at 10x and 0.8470 versus 0.8301 at 15x. Radiological assessments revealed that PFT(VS)-based DL scored <span><math><mrow><mn>2</mn><mo>.</mo><mn>9</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>30</mn></mrow></math></span> and <span><math><mrow><mn>2</mn><mo>.</mo><mn>73</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>45</mn></mrow></math></span> at 5x and 10x, whereas NUFFT(GD)-based DL scored <span><math><mrow><mn>2</mn><mo>.</mo><mn>7</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>47</mn></mrow></math></span> and <span><math><mrow><mn>2</mn><mo>.</mo><mn>40</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>50</mn></mrow></math></span>, respectively. Our methods suggest a promising alternative to NUFFT-based fast radial-MRI for dynamic imaging, prioritizing reconstruction quality in a small region of interest over whole image quality.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103732"},"PeriodicalIF":11.8,"publicationDate":"2025-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

SemiSAM+: Rethinking semi-supervised medical image segmentation in the era of foundation models SemiSAM+：重新思考基础模型时代的半监督医学图像分割

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-25 DOI: 10.1016/j.media.2025.103733

Yichi Zhang , Bohao Lv , Le Xue , Wenbo Zhang , Yuchen Liu , Yu Fu , Yuan Cheng , Yuan Qi

{"title":"SemiSAM+: Rethinking semi-supervised medical image segmentation in the era of foundation models","authors":"Yichi Zhang , Bohao Lv , Le Xue , Wenbo Zhang , Yuchen Liu , Yu Fu , Yuan Cheng , Yuan Qi","doi":"10.1016/j.media.2025.103733","DOIUrl":"10.1016/j.media.2025.103733","url":null,"abstract":"<div><div>Deep learning-based medical image segmentation typically requires large amount of labeled data for training, making it less applicable in clinical settings due to high annotation cost. Semi-supervised learning (SSL) has emerged as an appealing strategy due to its less dependence on acquiring abundant annotations from experts compared to fully supervised methods. Beyond existing model-centric advancements of SSL by designing novel regularization strategies, we anticipate a paradigmatic shift due to the emergence of promptable segmentation foundation models with universal segmentation capabilities using positional prompts represented by Segment Anything Model (SAM). In this paper, we present <strong>SemiSAM+</strong>, a foundation model-driven SSL framework to efficiently learn from limited labeled data for medical image segmentation. SemiSAM+ consists of one or multiple promptable foundation models as <strong>generalist models</strong>, and a trainable task-specific segmentation model as <strong>specialist model</strong>. For a given new segmentation task, the training is based on the specialist–generalist collaborative learning procedure, where the trainable specialist model delivers positional prompts to interact with the frozen generalist models to acquire pseudo-labels, and then the generalist model output provides the specialist model with informative and efficient supervision which benefits the automatic segmentation and prompt generation in turn. Extensive experiments on three public datasets and one in-house clinical dataset demonstrate that SemiSAM+ achieves significant performance improvement, especially under extremely limited annotation scenarios, and shows strong efficiency as a plug-and-play strategy that can be easily adapted to different specialist and generalist models.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"107 ","pages":"Article 103733"},"PeriodicalIF":11.8,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144780368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Robust T-Loss for medical image segmentation 鲁棒T-Loss用于医学图像分割

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-25 DOI: 10.1016/j.media.2025.103735

Alvaro Gonzalez-Jimenez , Simone Lionetti , Philippe Gottfrois , Fabian Gröger , Alexander Navarini , Marc Pouly

{"title":"Robust T-Loss for medical image segmentation","authors":"Alvaro Gonzalez-Jimenez , Simone Lionetti , Philippe Gottfrois , Fabian Gröger , Alexander Navarini , Marc Pouly","doi":"10.1016/j.media.2025.103735","DOIUrl":"10.1016/j.media.2025.103735","url":null,"abstract":"<div><div>This work introduces T-Loss, a novel and robust loss function for medical image segmentation. T-Loss is derived from the negative log-likelihood of the Student-t distribution and excels at handling noisy masks by dynamically controlling its sensitivity through a single parameter. This parameter is optimized during the backpropagation process, obviating the need for additional computations or prior knowledge about the extent and distribution of noisy labels. We provide in-depth analysis of this parameter behavior during training and revealing its adaptive nature and its role in preventing noisy memorization. Our extensive experiments demonstrate that T-Loss significantly outperforms traditional loss functions in terms of dice scores on two public medical datasets, specifically for skin lesion and lung segmentation. Moreover, T-Loss exhibits remarkable resilience to various types of simulated label noise, which mimics human annotation errors. Our results provide strong evidence that T-Loss is a promising alternative for medical image segmentation where high levels of noise or outliers in the dataset are a typical phenomenon in practice. The project website, including code and additional resources, can be found at: <span><span>https://robust-tloss.github.io/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103735"},"PeriodicalIF":11.8,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144724975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Controllable illumination invariant GAN for diverse temporally-consistent surgical video synthesis 可控光照不变GAN用于多种时间一致手术视频合成

IF 10.7 1区医学

Medical image analysis Pub Date : 2025-07-25 DOI: 10.1016/j.media.2025.103731

Long Chen , Mobarak I. Hoque , Zhe Min , Matt Clarkson , Thomas Dowrick

{"title":"Controllable illumination invariant GAN for diverse temporally-consistent surgical video synthesis","authors":"Long Chen , Mobarak I. Hoque , Zhe Min , Matt Clarkson , Thomas Dowrick","doi":"10.1016/j.media.2025.103731","DOIUrl":"10.1016/j.media.2025.103731","url":null,"abstract":"<div><div>Surgical video synthesis offers a cost-effective way to expand training data and enhance the performance of machine learning models in computer-assisted surgery. However, existing video translation methods often produce video sequences with large illumination changes across different views, disrupting the temporal consistency of the videos. Additionally, these methods typically synthesize videos with a monotonous style, whereas diverse synthetic data is desired to improve the generalization ability of downstream machine learning models. To address these challenges, we propose a novel Controllable Illumination Invariant Generative Adversarial Network (CIIGAN) for generating diverse, illumination-consistent video sequences. CIIGAN fuses multi-scale illumination-invariant features from a novel controllable illumination-invariant (CII) image space with multi-scale texture-invariant features from self-constructed 3D scenes. The CII image space, along with the 3D scenes, allows CIIGAN to produce diverse and temporally-consistent video or image translations. Extensive experiments demonstrate that CIIGAN achieves more realistic and illumination-consistent translations compared to previous state-of-the-art baselines. Furthermore, the segmentation networks trained on our diverse synthetic data outperform those trained on monotonous synthetic data. Our source code, well-trained models, and 3D simulation scenes are public available at <span><span>https://github.com/LongChenCV/CIIGAN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103731"},"PeriodicalIF":10.7,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144713306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Spectrum intervention based invariant causal representation learning for single-domain generalizable medical image segmentation 基于谱介入的不变因果表示学习在单域广义医学图像分割中的应用

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-25 DOI: 10.1016/j.media.2025.103741

Wentao Liu , Zhiwei Ni , Xuhui Zhu , Qian Chen , Liping Ni , Pingfan Xia

{"title":"Spectrum intervention based invariant causal representation learning for single-domain generalizable medical image segmentation","authors":"Wentao Liu , Zhiwei Ni , Xuhui Zhu , Qian Chen , Liping Ni , Pingfan Xia","doi":"10.1016/j.media.2025.103741","DOIUrl":"10.1016/j.media.2025.103741","url":null,"abstract":"<div><div>The performance of a well-trained segmentation model is often trapped by domain shift caused by acquisition variance. Existing efforts are devoted to expanding the diversity of single-source samples, as well as learning domain-invariant representations. Essentially, they are still modeling the statistical dependence between sample-label pairs to achieve a superficial portrayal of reality. On the contrary, we propose a Spectrum Intervention based Invariant Causal Representation Learning (SI<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>CRL) framework, to unify the data generation and representation learning from causal view. Specifically, for the data generation, the unknown object elements can be reified in frequency domain as phase variables, then we propose an amplitude-based intervention module to generate low-frequency perturbations via random-weighted multilayer convolutional network. For the causal representations, a two-stage causal synergy modeling process is proposed to derive unobservable causal factors. In the first stage, the style-sensitive non-causal factors lying in the shallow layer of encoder are filtered out by contrastive-based causal decoupling mechanism. In the second stage, the hierarchical features in decoder are first factorized with cross-covariance regularization to ensure channel-wise independence; Subsequently, we introduce an adversarial-based causal purification module, which encourages the decoder to iteratively update causally sufficient information and make domain-robust predictions. We evaluate our SI<span><math><msup><mrow></mrow><mrow><mn>2</mn></mrow></msup></math></span>CRL against the state-of-the-art methods on cross-site prostate MRI segmentation, cross-modality (CT-MRI) abdominal multi-organ segmentation, and cross-sequence (MRI) cardiac segmentation. Our approach achieves consistent performance gains compared to these peer methods.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103741"},"PeriodicalIF":11.8,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144738203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

SET: Superpixel Embedded Transformer for skin lesion segmentation SET：用于皮肤病变分割的超像素嵌入式变压器

IF 11.8 1区医学

Medical image analysis Pub Date : 2025-07-24 DOI: 10.1016/j.media.2025.103738

Zhonghua Wang , Junyan Lyu , Xiaoying Tang

{"title":"SET: Superpixel Embedded Transformer for skin lesion segmentation","authors":"Zhonghua Wang , Junyan Lyu , Xiaoying Tang","doi":"10.1016/j.media.2025.103738","DOIUrl":"10.1016/j.media.2025.103738","url":null,"abstract":"<div><div>Accurate skin lesion segmentation is crucial for the early detection and treatment of skin cancer. Despite significant advances in deep learning, current segmentation methods often struggle to fully capture global contextual information and maintain the structural integrity of skin lesions. To address these challenges, this paper introduces Superpixel Embedded Transformer (SET), which integrates superpixels into the Transformer framework for skin lesion segmentation. Instead of embedding non-overlapping patches as tokens, SET employs an Association Embedded Merging & Dispatching (AEM&D) module to treat superpixels as the fundamental units during both the down-sampling and up-sampling phases. To better capture the multi-scale information of lesions, we propose a superpixel bank to store various superpixel maps with distinct compactness values. An Ensemble Fusion and Refinery (EFR) module is then designed to fuse and refine the results obtained from each map in the superpixel bank. This approach enables the model to selectively focus on different features by adopting various superpixel maps, thereby enhancing the segmentation performance. Extensive experiments are conducted on multiple skin lesion segmentation datasets, including ISIC 2016, ISIC 2017, and ISIC 2018. Comparative analyses with state-of-the-art methods showcase SET’s superior performance, and ablation studies confirm the effectiveness of our proposed modules incorporating superpixels into Vision Transformer. The source code of our SET will be available at <span><span>https://github.com/Wzhjerry/SET</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"105 ","pages":"Article 103738"},"PeriodicalIF":11.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144725051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0