Diff-UNet: A diffusion embedded network for robust 3D medical image segmentation
Zhaohu Xing, Liang Wan, Huazhu Fu, Guang Yang, Yijun Yang, Lequan Yu, Baiying Lei, Lei Zhu
Medical Image Analysis 105 (2025), Article 103654. DOI: 10.1016/j.media.2025.103654. Published online 2025-06-16.

Abstract: Benefiting from the powerful generative capabilities of diffusion models, recent studies have utilized these models to address 2D medical image segmentation problems. However, directly extending these methods to 3D medical image segmentation slice-by-slice does not yield satisfactory results, because such approaches ignore the inter-slice relations of 3D medical data and incur significant computational costs. To overcome these challenges, we devise the first diffusion-based model (Diff-UNet) with two branches for general 3D medical image segmentation. Specifically, we devise an additional boundary-prediction branch to predict the auxiliary boundary information of the target segmentation region, which assists the diffusion-denoising branch in predicting 3D segmentation results. Furthermore, we design a Multi-granularity Boundary Aggregation (MBA) module to embed both low-level and high-level boundary features into the diffusion denoising process. Then, we propose a Monte Carlo Diffusion (MC-Diff) module to generate an uncertainty map and define an uncertainty-guided segmentation loss to improve the segmentation results of uncertain pixels. Moreover, during our diffusion inference stage, we develop a Progressive Uncertainty-driven REfinement (PURE) strategy to fuse intermediate segmentation results at each diffusion inference step. Experimental results on three recent large-scale datasets (BraTS2023, SegRap2023, and AIIB2023) with diverse organs and modalities show that our Diff-UNet quantitatively and qualitatively outperforms state-of-the-art 3D medical segmentation methods, especially on regions with small or complex structures. Code: https://github.com/ge-xing/DiffUNet
TCFNet: Bidirectional face-bone transformation via a Transformer-based coarse-to-fine point movement network
Runshi Zhang, Bimeng Jie, Yang He, Junchen Wang
Medical Image Analysis 105 (2025), Article 103653. DOI: 10.1016/j.media.2025.103653. Published online 2025-06-16.

Abstract: Computer-aided surgical simulation is a critical component of orthognathic surgical planning, where accurately simulating face-bone shape transformations is significant. Traditional biomechanical simulation methods are limited by long computation times, labor-intensive data processing, and low accuracy. Recently, deep learning-based simulation methods have been proposed that view this problem as a point-to-point transformation between skeletal and facial point clouds. However, these approaches cannot process large-scale point clouds, have limited receptive fields that lead to noisy points, and rely on complex registration-based preprocessing and postprocessing operations. These shortcomings limit the performance and widespread applicability of such methods. We therefore propose a Transformer-based coarse-to-fine point movement network (TCFNet) to learn unique, complicated correspondences at the patch and point levels for dense face-bone point cloud transformations. This end-to-end framework adopts a Transformer-based network in the first stage and a local information aggregation network (LIA-Net) in the second, which reinforce each other to generate precise point movement paths. LIA-Net compensates for the neighborhood precision loss of the Transformer-based network by modeling local geometric structures (edges, orientations, and relative position features), and the global features from the first stage guide the local displacement through a gated recurrent unit. Inspired by deformable medical image registration, we propose an auxiliary loss that can utilize expert knowledge for reconstructing critical organs; our framework is an unsupervised algorithm, and this loss is optional. Compared with existing state-of-the-art (SOTA) methods on the gathered datasets, TCFNet achieves outstanding evaluation metrics and visualization results. Code: https://github.com/Runshi-Zhang/TCFNet
Synthesizing individualized aging brains in health and disease with generative models and parallel transport
Jingru Fu, Yuqi Zheng, Neel Dey, Daniel Ferreira, Rodrigo Moreno
Medical Image Analysis 105 (2025), Article 103669. DOI: 10.1016/j.media.2025.103669. Published online 2025-06-16.

Abstract: Simulating prospective magnetic resonance imaging (MRI) scans from a given individual brain image is challenging, as it requires accounting for canonical changes in aging and/or disease progression while also considering the individual brain's current status and unique characteristics. While current deep generative models can produce high-resolution, anatomically accurate templates for population-wide studies, their ability to predict future aging trajectories for individuals remains limited, particularly in capturing subject-specific neuroanatomical variations over time. In this study, we introduce Individualized Brain Synthesis (InBrainSyn), a framework for synthesizing high-resolution subject-specific longitudinal MRI scans that simulate neurodegeneration in both Alzheimer's disease (AD) and normal aging. InBrainSyn uses a parallel transport algorithm to adapt the population-level aging trajectories learned by a generative deep template network, enabling individualized aging synthesis. As InBrainSyn uses diffeomorphic transformations to simulate aging, the synthesized images are topologically consistent with the original anatomy by design. We evaluated InBrainSyn both quantitatively and qualitatively on AD and healthy control cohorts from the Open Access Series of Imaging Studies - version 3 dataset. Experimentally, InBrainSyn can also model neuroanatomical transitions between normal aging and AD. An evaluation on an external dataset supports its generalizability. Overall, with only a single baseline scan, InBrainSyn synthesizes realistic 3D spatiotemporal T1w MRI scans, producing personalized longitudinal aging trajectories. Code: https://github.com/Fjr9516/InBrainSyn
LMS-Net: A learned Mumford-Shah network for binary few-shot medical image segmentation
Shengdong Zhang, Fan Jia, Xiang Li, Hao Zhang, Jun Shi, Liyan Ma, Shihui Ying
Medical Image Analysis 105 (2025), Article 103676. DOI: 10.1016/j.media.2025.103676. Published online 2025-06-15.

Abstract: Few-shot semantic segmentation (FSS) methods have shown great promise in handling data-scarce scenarios, particularly in medical image segmentation tasks. However, most existing FSS architectures lack sufficient interpretability and fail to fully incorporate the underlying physical structures of semantic regions. To address these issues, we propose a novel deep unfolding network, called the Learned Mumford-Shah Network (LMS-Net), for the FSS task. Specifically, motivated by the effectiveness of pixel-to-prototype comparison in prototypical FSS methods and the capability of deep priors to model complex spatial structures, we leverage our learned Mumford-Shah model (LMS model) as a mathematical foundation to integrate these insights into a unified framework. By reformulating the LMS model into prototype update and mask update tasks, we propose an alternating optimization algorithm to solve it efficiently. Further, the iterative steps of this algorithm are unfolded into corresponding network modules, resulting in LMS-Net with clear interpretability. Comprehensive experiments on three publicly available medical segmentation datasets verify the effectiveness of our method, demonstrating superior accuracy and robustness in handling complex structures and adapting to challenging segmentation scenarios. These results highlight the potential of LMS-Net to advance FSS in medical imaging applications. Code will be available at: https://github.com/SDZhang01/LMSNet
One-shot cell segmentation via learning memory query: Towards universal solution without active tuning
Jintu Zheng, Qizhe Liu, Yi Ding, Yi Cao, Ying Hu, Zenan Wang
Medical Image Analysis 105 (2025), Article 103675. DOI: 10.1016/j.media.2025.103675. Published online 2025-06-15.

Abstract: Cell segmentation, which involves separating individual cells in biomedical images, is essential for disease analysis and drug development research. However, many existing methods are restricted to specific types of images or require constant adjustment, making them time-consuming and labor-intensive. We introduce a new framework called Mimic, which employs a "Query-and-Answer" (Q&A) mechanism to segment cells in a single step. This approach eliminates the need for constant adjustments across different images, significantly reducing labor-intensive tasks. Mimic learns to recognize and segment cells using a few examples as "prompts", allowing the model to adapt to new cell types without additional training. Mimic was tested on 12 public datasets featuring various imaging techniques, cell shapes, sizes, and staining methods. It achieved state-of-the-art performance, surpassing existing generalist cell segmentation models such as Cellpose and StarDist, as well as foundational vision models. Mimic's capability to segment cells without extensive tuning or additional training could greatly enhance the speed and accuracy of quantitative analysis in biological and medical research.
SurgRIPE challenge: Benchmark of surgical robot instrument pose estimation
Haozheng Xu, Alistair Weld, Chi Xu, Alfie Roddan, João Cartucho, Mert Asim Karaoglu, Alexander Ladikos, Yangke Li, Yiping Li, Daiyun Shen, Geonhee Lee, Seyeon Park, Jongho Shin, Lucy Fothergill, Dominic Jones, Pietro Valdastri, Duygu Sarikaya, Stamatia Giannarou
Medical Image Analysis 105 (2025), Article 103674. DOI: 10.1016/j.media.2025.103674. Published online 2025-06-14.

Abstract: Accurate instrument pose estimation is a crucial step towards the future of robotic surgery, enabling applications such as autonomous surgical task execution. Vision-based methods for surgical instrument pose estimation provide a practical approach to tool tracking, but they often require markers to be attached to the instruments. Recently, more research has focused on the development of markerless methods based on deep learning. However, acquiring realistic surgical data with ground truth (GT) instrument poses, as required for deep learning training, is challenging. To address these issues, we introduce the Surgical Robot Instrument Pose Estimation (SurgRIPE) challenge, hosted at the 26th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI) in 2023. The objectives of this challenge are: (1) to provide the surgical vision community with realistic surgical video data paired with ground truth instrument poses, and (2) to establish a benchmark for evaluating markerless pose estimation methods. The challenge led to the development of several novel algorithms that showcased improved accuracy and robustness over existing methods. The performance evaluation study on the SurgRIPE dataset highlights the potential of these advanced algorithms to be integrated into robotic surgery systems, paving the way for more precise and autonomous surgical procedures. The SurgRIPE challenge has successfully established a new benchmark for the field, encouraging further research and development in surgical robot instrument pose estimation.
ACOUSLIC-AI challenge report: Fetal abdominal circumference measurement on blind-sweep ultrasound data from low-income countries
M. Sofia Sappia, Chris L. de Korte, Bram van Ginneken, Dean Ninalga, Satoshi Kondo, Satoshi Kasai, Kousuke Hirasawa, Tanya Akumu, Carlos Martín-Isla, Karim Lekadir, Victor M. Campello, Jorge Fabila, Anette Beverdam, Jeroen van Dillen, Chase Neff, Keelin Murphy
Medical Image Analysis 105 (2025), Article 103640. DOI: 10.1016/j.media.2025.103640. Published online 2025-06-10.

Abstract: Fetal growth restriction, affecting up to 10% of pregnancies, is a critical factor contributing to perinatal mortality and morbidity. Ultrasound measurements of the fetal abdominal circumference (AC) are a key aspect of monitoring fetal growth. However, the routine practice of biometric obstetric ultrasounds is limited in low-resource settings due to the high cost of sonography equipment and the scarcity of trained sonographers. To address this issue, we organized the ACOUSLIC-AI (Abdominal Circumference Operator-agnostic UltraSound measurement in Low-Income Countries) challenge to investigate the feasibility of automatically estimating fetal AC from blind-sweep ultrasound scans acquired by novice operators using low-cost devices. Training data, collected from three Public Health Units (PHUs) in Sierra Leone, are made publicly available. Private validation and test sets, containing data from two PHUs in Tanzania and a European hospital, are provided through the Grand-Challenge platform. All sets were annotated by experienced readers. Sixteen international teams participated in this challenge, with six teams submitting to the Final Test Phase. In this article, we present the results of the three top-performing AI models from the ACOUSLIC-AI challenge, which are publicly accessible. We evaluate their performance in fetal abdomen frame selection, segmentation, and abdominal circumference measurement, and compare it against clinical standards for fetal AC measurement. Clinical comparisons demonstrated that the limits of agreement (LoA) for A2 in fetal AC measurements are comparable to the interobserver LoA reported in the literature. The algorithms developed as part of the ACOUSLIC-AI challenge provide a benchmark for future algorithms on the selection and segmentation of fetal abdomen frames to further minimize fetal abdominal circumference measurement variability.
DeepSPV: A deep learning pipeline for 3D spleen volume estimation from 2D ultrasound images
Zhen Yuan, David Stojanovski, Lei Li, Alberto Gomez, Haran Jogeesvaran, Esther Puyol-Antón, Baba Inusa, Andrew P. King
Medical Image Analysis 105 (2025), Article 103671. DOI: 10.1016/j.media.2025.103671. Published online 2025-06-10.

Abstract: Splenomegaly, the enlargement of the spleen, is an important clinical indicator for various associated medical conditions, such as sickle cell disease (SCD). Spleen length measured from 2D ultrasound is the most widely used metric for characterising spleen size. However, it is still considered a surrogate measure, and spleen volume remains the gold standard for assessing spleen size. Accurate spleen volume measurement typically requires 3D imaging modalities, such as computed tomography or magnetic resonance imaging, but these are not widely available, especially in the Global South, which has a high prevalence of SCD. In this work, we introduce a deep learning pipeline, DeepSPV, for precise spleen volume estimation from single or dual 2D ultrasound images. The pipeline involves a segmentation network and a variational autoencoder for learning low-dimensional representations from the estimated segmentations. We investigate three approaches for spleen volume estimation, and our best model achieves 86.62%/92.5% mean relative volume accuracy (MRVA) under single-view/dual-view settings, surpassing the performance of human experts. In addition, the pipeline can provide confidence intervals for the volume estimates as well as offering benefits in terms of interpretability, which further support clinicians in decision-making when identifying splenomegaly. We evaluate the full pipeline using a highly realistic synthetic dataset generated by a diffusion model, achieving an overall MRVA of 83.0% from a single 2D ultrasound image. Our proposed DeepSPV is the first work to use deep learning to estimate 3D spleen volume from 2D ultrasound images and can be seamlessly integrated into the current clinical workflow for spleen assessment. We also make our synthetic spleen ultrasound dataset publicly available.
Robust image representations with counterfactual contrastive learning
Mélanie Roschewitz, Fabio De Sousa Ribeiro, Tian Xia, Galvin Khara, Ben Glocker
Medical Image Analysis 105 (2025), Article 103668. DOI: 10.1016/j.media.2025.103668. Published online 2025-06-10.

Abstract: Contrastive pretraining can substantially increase model generalisation and downstream performance. However, the quality of the learned representations is highly dependent on the data augmentation strategy applied to generate positive pairs. Positive contrastive pairs should preserve semantic meaning while discarding unwanted variations related to the data acquisition domain. Traditional contrastive pipelines attempt to simulate domain shifts through pre-defined generic image transformations. However, these do not always mimic realistic and relevant domain variations for medical imaging, such as scanner differences. To tackle this issue, we herein introduce counterfactual contrastive learning, a novel framework leveraging recent advances in causal image synthesis to create contrastive positive pairs that faithfully capture relevant domain variations. Our method, evaluated across five datasets encompassing both chest radiography and mammography data, for two established contrastive objectives (SimCLR and DINO-v2), outperforms standard contrastive learning in terms of robustness to acquisition shift. Notably, counterfactual contrastive learning achieves superior downstream performance on both in-distribution and external datasets, especially for images acquired with scanners under-represented in the training set. Further experiments show that the proposed framework extends beyond acquisition shifts, with models trained with counterfactual contrastive learning reducing subgroup disparities across biological sex.