Joint semi-supervised and contrastive learning enables domain generalization and multi-domain segmentation
Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun Goksel
Medical Image Analysis 103 (2025), Article 103575. Published 2025-04-12. DOI: 10.1016/j.media.2025.103575

Abstract: Despite their effectiveness, current deep learning models face challenges with images coming from different domains with varying appearance and content. We introduce SegCLR, a versatile framework designed to segment images across different domains, employing supervised and contrastive learning simultaneously to learn effectively from both labeled and unlabeled data. We demonstrate the superior performance of SegCLR through a comprehensive evaluation involving three diverse clinical datasets of 3D retinal Optical Coherence Tomography (OCT) images, for the slice-wise segmentation of fluids with various network configurations and verification across 10 different network initializations. In an unsupervised domain adaptation context, SegCLR achieves results on par with a supervised upper-bound model trained on the intended target domain. Notably, we find that the segmentation performance of the SegCLR framework is only marginally affected by the abundance of unlabeled data from the target domain; we therefore also propose an effective domain generalization extension of SegCLR, also known as zero-shot domain adaptation, which eliminates the need for any target domain information. This shows that our proposed addition of a contrastive loss to standard supervised training for segmentation leads to superior models that are inherently more generalizable to both in- and out-of-domain test data. We additionally propose a pragmatic solution for SegCLR deployment in realistic scenarios with multiple domains containing labeled data. Accordingly, our framework pushes the boundaries of deep-learning based segmentation in multi-domain applications, regardless of data availability: labeled, unlabeled, or nonexistent.
Bi-variational physics-informed operator network for fractional flow reserve curve assessment from coronary angiography
Baihong Xie, Heye Zhang, Anbang Wang, Xiujian Liu, Zhifan Gao
Medical Image Analysis 103 (2025), Article 103564. Published 2025-04-12. DOI: 10.1016/j.media.2025.103564

Abstract: The coronary angiography-derived fractional flow reserve (FFR) curve, referred to as the Angio-FFR curve, is crucial for guiding percutaneous coronary intervention (PCI). Invasive FFR is the diagnostic gold standard for determining functional significance and is recommended to complement coronary angiography. The invasive FFR curve can quantitatively define disease patterns. The Angio-FFR curve further overcomes the limitations of invasive FFR measurement and thus emerges as a promising approach. However, Angio-FFR curve computation suffers from an unsatisfactory trade-off between accuracy and efficiency. In this paper, we propose a bi-variational physics-informed neural operator (BVPINO) for FFR curve assessment from coronary angiography. BVPINO incorporates a variational mechanism to guide basis-function learning and residual evaluation. Extensive experiments involving coronary angiographies of 215 vessels from 184 subjects demonstrate that BVPINO achieves an optimal balance between effectiveness and efficiency compared with computation-based models and other machine/deep learning-based models. The results also show high agreement and correlation between the distal FFR predictions of BVPINO and invasive FFR measurements. In addition, we discuss Angio-FFR curve assessment for a novel gradient-based index. A series of case studies demonstrate the effectiveness and superiority of BVPINO for predicting the FFR curve along the coronary artery centerline.
{"title":"From tissue to sound: A new paradigm for medical sonic interaction design","authors":"Sasan Matinfar , Shervin Dehghani , Mehrdad Salehi , Michael Sommersperger , Navid Navab , Koorosh Faridpooya , Merle Fairhurst , Nassir Navab","doi":"10.1016/j.media.2025.103571","DOIUrl":"10.1016/j.media.2025.103571","url":null,"abstract":"<div><div>Medical imaging maps tissue characteristics into image intensity values, enhancing human perception. However, comprehending this data, especially in high-stakes scenarios such as surgery, is prone to errors. Additionally, current multimodal methods do not fully leverage this valuable data in their design. We introduce “From Tissue to Sound,” a new paradigm for medical sonic interaction design. This paradigm establishes a comprehensive framework for mapping tissue characteristics to auditory displays, providing dynamic and intuitive access to medical images that complement visual data, thereby enhancing multimodal perception. “From Tissue to Sound” provides an advanced and adaptable framework for the interactive sonification of multimodal medical imaging data. This framework employs a physics-based sound model composed of a network of multiple oscillators, whose mechanical properties—such as friction and stiffness—are defined by tissue characteristics extracted from imaging data. This approach enables the representation of anatomical structures and the creation of unique acoustic profiles in response to excitations of the sound model. This method allows users to explore data at a fundamental level, identifying tissue characteristics ranging from rigid to soft, dense to sparse, and structured to scattered. It facilitates intuitive discovery of both general and detailed patterns with minimal preprocessing. Unlike conventional methods that transform low-dimensional data into global sound features through a parametric approach, this method utilizes model-based unsupervised mapping between data and an anatomical sound model, enabling high-dimensional data processing. The versatility of this method is demonstrated through feasibility experiments confirming the generation of perceptually discernible acoustic signals. Furthermore, we present a novel application developed based on this framework for retinal surgery. This new paradigm opens up possibilities for designing multisensory applications for multimodal imaging data. It also facilitates the creation of interactive sonification models with various auditory causality approaches, enhancing both directness and richness.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103571"},"PeriodicalIF":10.7,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143820432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating medical AI systems in dermatology under uncertain ground truth
David Stutz, Ali Taylan Cemgil, Abhijit Guha Roy, Tatiana Matejovicova, Melih Barsbey, Patricia Strachan, Mike Schaekermann, Jan Freyberg, Rajeev Rikhye, Beverly Freeman, Javier Perez Matos, Umesh Telang, Dale R. Webster, Yuan Liu, Greg S. Corrado, Yossi Matias, Pushmeet Kohli, Yun Liu, Arnaud Doucet, Alan Karthikesalingam
Medical Image Analysis 103 (2025), Article 103556. Published 2025-04-09. DOI: 10.1016/j.media.2025.103556

Abstract: For safety, medical AI systems undergo thorough evaluations before deployment, validating their predictions against a ground truth which is assumed to be fixed and certain. However, in medical applications, this ground truth is often curated in the form of differential diagnoses provided by multiple experts. While a single differential diagnosis reflects the uncertainty in one expert assessment, multiple experts introduce another layer of uncertainty through potential disagreement. Both forms of uncertainty are ignored in standard evaluation, which aggregates these differential diagnoses to a single label. In this paper, we show that ignoring uncertainty leads to overly optimistic estimates of model performance, therefore underestimating risk associated with particular diagnostic decisions. Moreover, point estimates largely ignore dramatic differences in uncertainty of individual cases. To this end, we propose a statistical aggregation approach, where we infer a distribution on probabilities of underlying medical condition candidates themselves, based on observed annotations. This formulation naturally accounts for the potential disagreements between different experts, as well as uncertainty stemming from individual differential diagnoses, capturing the entire ground truth uncertainty. Practically, our approach boils down to generating multiple samples of medical condition probabilities, then evaluating and averaging performance metrics based on these sampled probabilities, instead of relying on a single point estimate. This allows us to provide uncertainty-adjusted estimates of common metrics of interest such as top-k accuracy and average overlap. In the skin condition classification problem of Liu et al. (2020), our methodology reveals significant ground truth uncertainty for most data points and demonstrates that standard evaluation techniques can overestimate performance by several percentage points. We conclude that, while assuming a crisp ground truth may be acceptable for many AI applications, a more nuanced evaluation protocol acknowledging the inherent complexity and variability of differential diagnoses should be utilized in medical diagnosis.
A topology-preserving three-stage framework for fully-connected coronary artery extraction
Yuehui Qiu, Dandan Shan, Yining Wang, Pei Dong, Dijia Wu, Xinnian Yang, Qingqi Hong, Dinggang Shen
Medical Image Analysis 103 (2025), Article 103578. Published 2025-04-09. DOI: 10.1016/j.media.2025.103578

Abstract: Coronary artery extraction is a crucial prerequisite for computer-aided diagnosis of coronary artery disease. Accurately extracting the complete coronary tree remains challenging due to several factors, including the presence of thin distal vessels, tortuous topological structures, and insufficient contrast. These issues often result in over-segmentation and under-segmentation in current segmentation methods. To address these challenges, we propose a topology-preserving three-stage framework for fully-connected coronary artery extraction, comprising vessel segmentation, centerline reconnection, and missing vessel reconstruction. First, we introduce a new centerline-enhanced loss in the segmentation process. Second, for broken vessel segments, we propose a regularized walk algorithm that integrates distance, probabilities predicted by a centerline classifier, and directional cosine similarity to reconnect the centerlines. Third, we apply implicit neural representation and implicit modeling to reconstruct the geometric model of the missing vessels. Experimental results show that our proposed framework outperforms existing methods, achieving Dice scores of 88.53% and 85.07%, with Hausdorff Distances (HD) of 1.07 mm and 1.63 mm, on the ASOCA and PDSCA datasets, respectively. Code will be available at https://github.com/YH-Qiu/CorSegRec.
DIPathMamba: A domain-incremental weakly supervised state space model for pathology image segmentation
Jiansong Fan, Qi Sun, Yicheng Di, Jiayu Bao, Tianxu Lv, Yuan Liu, Xiaoyun Hu, Lihua Li, Xiaobin Cui, Xiang Pan
Medical Image Analysis 103 (2025), Article 103563. Published 2025-04-09. DOI: 10.1016/j.media.2025.103563

Abstract: Accurate segmentation of pathology images plays a crucial role in the digital pathology workflow. However, two significant issues exist with present pathology image segmentation methods: (i) most fully supervised models rely on dense pixel-level annotations to achieve superior results; and (ii) traditional static models struggle to handle the massive amount of pathology data spanning multiple domains. To address these issues, we propose a Domain-Incremental Weakly Supervised State-space Model (DIPathMamba) that not only segments pathology images using image-level labels but also dynamically learns new domain knowledge while preserving the discriminability of previous domains. We first design a shared feature extractor based on the state space model, which employs an efficient hardware-aware design. Specifically, we extract pixel-level feature maps based on Multi-Instance Multi-Label Learning by treating pixels as instances, and inject these into our Contrastive Mamba Block (CMB). The CMB adopts a state space model and integrates the concept of contrastive learning to extract non-causal dual-granularity features in pathology images. Subsequently, to mitigate performance degradation on prior domains during incremental learning, we design a Domain Parameter Constraint Model (DPCM). Finally, we propose a Collaborative Incremental Deep Supervision Loss (CIDSL), which aims to fully utilize the limited annotated information in weakly supervised methods and to guide parameter learning during domain increments. Our approach integrates complex details and broader global contextual semantics in pathology images and can generate regionally more consistent segmentation results. Experiments on three public pathology image datasets show that the proposed method performs better than state-of-the-art methods.
Unsupervised brain MRI tumour segmentation via two-stage image synthesis
Xinru Zhang, Ni Ou, Chenghao Liu, Zhizheng Zhuo, Paul M. Matthews, Yaou Liu, Chuyang Ye, Wenjia Bai
Medical Image Analysis 102 (2025), Article 103568. Published 2025-04-05. DOI: 10.1016/j.media.2025.103568

Abstract: Deep learning shows promise in automated brain tumour segmentation, but it depends on costly expert annotations. Recent advances in unsupervised learning offer an alternative by using synthetic data for training. However, the discrepancy between real and synthetic data limits the accuracy of unsupervised approaches. In this paper, we propose an approach for unsupervised brain tumour segmentation on magnetic resonance (MR) images via a two-stage image synthesis strategy. This approach accounts for the domain gap between real and synthetic data and aims to generate realistic synthetic data for model training. In the first stage, we train a junior segmentation model using synthetic brain tumour images generated by hand-crafted tumour shape and intensity models, and employ a validation set with distribution shift for model selection. The trained junior model is applied to segment unlabelled real tumour images, generating pseudo labels that capture realistic tumour shape, intensity, and texture. In the second stage, realistic synthetic tumour images are generated by mixing brain images with tumour pseudo labels, closing the domain gap between real and synthetic images. The generated synthetic data is then used to train a senior model for final segmentation. In experiments on five brain imaging datasets, the proposed approach, named SynthTumour, surpasses existing unsupervised methods and demonstrates high performance on both brain tumour segmentation and ischemic stroke lesion segmentation tasks.
Virtual Lung Screening Trial (VLST): An In Silico Study Inspired by the National Lung Screening Trial for Lung Cancer Detection
Fakrul Islam Tushar, Liesbeth Vancoillie, Cindy McCabe, Amareswararao Kavuri, Lavsen Dahal, Brian Harrawood, Milo Fryling, Mojtaba Zarei, Saman Sotoudeh-Paima, Fong Chi Ho, Dhrubajyoti Ghosh, Michael R. Harowicz, Tina D. Tailor, Sheng Luo, W. Paul Segars, Ehsan Abadi, Kyle J. Lafata, Joseph Y. Lo, Ehsan Samei
Medical Image Analysis 103 (2025), Article 103576. Published 2025-04-05. DOI: 10.1016/j.media.2025.103576

Abstract: Clinical imaging trials play a crucial role in advancing medical innovation but are often costly, inefficient, and ethically constrained. Virtual Imaging Trials (VITs) present a solution by simulating clinical trial components in a controlled, risk-free environment. The Virtual Lung Screening Trial (VLST), an in silico study inspired by the National Lung Screening Trial (NLST), illustrates the potential of VITs to expedite clinical trials, minimize risks to participants, and promote optimal use of imaging technologies in healthcare. This study aimed to show that a virtual imaging trial platform could investigate key elements of a major clinical trial, specifically the NLST, which compared computed tomography (CT) and chest radiography (CXR) for lung cancer screening. A virtual patient cohort of 294 subjects with simulated cancerous lung nodules was created using XCAT human models. Each virtual patient underwent both CT and CXR imaging, with deep learning models, the AI CT-Reader and AI CXR-Reader, acting as virtual readers to recall patients with suspected lung cancer. The primary outcome was the difference in diagnostic performance between CT and CXR, measured by the area under the curve (AUC). The AI CT-Reader showed superior diagnostic accuracy, achieving an AUC of 0.92 (95% CI: 0.90-0.95) compared to the AI CXR-Reader's AUC of 0.72 (95% CI: 0.67-0.77). Furthermore, at the same 94% CT sensitivity reported by the NLST, the VLST specificity of 73% was similar to the NLST specificity of 73.4%. This CT performance highlights the potential of VITs to replicate certain aspects of clinical trials effectively, paving the way toward a safe and efficient method for advancing imaging-based diagnostics.
ArtiDiffuser: A unified framework for artifact restoration and synthesis for histology images via counterfactual diffusion model
Chong Wang, Kaili Qu, Shuxin Li, Yi Yu, Junjun He, Chen Zhang, Yiqing Shen
Medical Image Analysis 102 (2025), Article 103567. Published 2025-04-05. DOI: 10.1016/j.media.2025.103567

Abstract: Artifacts in histology images pose challenges for accurate diagnosis with deep learning models, often leading to misinterpretations. Existing artifact restoration methods primarily rely on Generative Adversarial Networks (GANs), which approach the problem as image-to-image translation. However, those approaches are prone to mode collapse and can unexpectedly alter morphological features or staining styles. To address this issue, we propose ArtiDiffuser, a counterfactual diffusion model tailored to restore only artifact-distorted regions while preserving the integrity of the rest of the image. Additionally, we offer a novel perspective on the misdiagnosis stemming from artifacts, addressing it through artifact synthesis as data augmentation, and thereby leverage ArtiDiffuser to unify artifact synthesis and restoration capabilities. This synergy significantly surpasses the performance of conventional methods that handle artifact restoration or synthesis separately. We propose a Swin-Transformer denoising network backbone to capture both local and global attention, further enhanced with a class-guided Mixture of Experts (MoE) to process features related to specific artifact categories. Moreover, the model utilizes adaptable class-specific tokens for enhanced feature discrimination and a mask-weighted loss function to specifically target and correct artifact-affected regions, thus addressing issues of data imbalance. In downstream applications, ArtiDiffuser employs a consistency regularization strategy that ensures the model's predictive accuracy is maintained across original and artifact-augmented images. We also contribute the first comprehensive histology dataset, comprising 723 annotated patches across various artifact categories, to facilitate further research. Evaluations on four distinct datasets for both restoration and synthesis demonstrate ArtiDiffuser's effectiveness compared to GAN-based approaches, whether used for pre-processing or augmentation. The code is available at https://github.com/wagnchogn/ArtiDiffuser.
{"title":"MedScale-Former: Self-guided multiscale transformer for medical image segmentation","authors":"Sanaz Karimijafarbigloo , Reza Azad , Amirhossein Kazerouni , Dorit Merhof","doi":"10.1016/j.media.2025.103554","DOIUrl":"10.1016/j.media.2025.103554","url":null,"abstract":"<div><div>Accurate medical image segmentation is crucial for enabling automated clinical decision procedures. However, existing supervised deep learning methods for medical image segmentation face significant challenges due to their reliance on extensive labeled training data. To address this limitation, our novel approach introduces a dual-branch transformer network operating on two scales, strategically encoding global contextual dependencies while preserving local information. To promote self-supervised learning, our method leverages semantic dependencies between different scales, generating a supervisory signal for inter-scale consistency. Additionally, it incorporates a spatial stability loss within each scale, fostering self-supervised content clustering. While intra-scale and inter-scale consistency losses enhance feature uniformity within clusters, we introduce a cross-entropy loss function atop the clustering score map to effectively model cluster distributions and refine decision boundaries. Furthermore, to account for pixel-level similarities between organ or lesion subpixels, we propose a selective kernel regional attention module as a plug and play component. This module adeptly captures and outlines organ or lesion regions, slightly enhancing the definition of object boundaries. Our experimental results on skin lesion, lung organ, and multiple myeloma plasma cell segmentation tasks demonstrate the superior performance of our method compared to state-of-the-art approaches.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103554"},"PeriodicalIF":10.7,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143808365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}