World of Forms: Deformable geometric templates for one-shot surface meshing in coronary CT angiography
Rudolf L.M. van Herten, Ioannis Lagogiannis, Jelmer M. Wolterink, Steffen Bruns, Eva R. Meulendijks, Damini Dey, Joris R. de Groot, José P. Henriques, R. Nils Planken, Simone Saitta, Ivana Išgum
Medical Image Analysis, vol. 103, Article 103582. DOI: 10.1016/j.media.2025.103582. Published online 2025-04-18.

Abstract: Deep learning-based medical image segmentation and surface mesh generation typically involve a sequential pipeline from image to segmentation to meshes, often requiring large training datasets while making limited use of prior geometric knowledge. This may lead to topological inconsistencies and suboptimal performance in low-data regimes. To address these challenges, we propose a data-efficient deep learning method for direct 3D anatomical object surface meshing using geometric priors. Our approach employs a multi-resolution graph neural network that operates on a prior geometric template, which is deformed to fit the object boundaries of interest. We show how different templates may be used for different surface meshing targets, and introduce a novel masked autoencoder pretraining strategy for 3D spherical data. The proposed method outperforms nnUNet in a one-shot setting for segmentation of the pericardium, left ventricle (LV) cavity, and LV myocardium. Similarly, the method outperforms other lumen segmentation methods operating on multi-planar reformatted images. Results further indicate that mesh quality is on par with or improves upon marching cubes post-processing of voxel mask predictions, while remaining flexible in the choice of mesh triangulation prior, thus paving the way for more accurate and topologically consistent 3D medical object surface meshing.

Lightweight Multi-Stage Aggregation Transformer for robust medical image segmentation
Xiaoyan Wang, Yating Zhu, Ying Cui, Xiaojie Huang, Dongyan Guo, Pan Mu, Ming Xia, Cong Bai, Zhongzhao Teng, Shengyong Chen
Medical Image Analysis, vol. 103, Article 103569. DOI: 10.1016/j.media.2025.103569. Published online 2025-04-18.

Abstract: Capturing rich multi-scale features is essential to address the complex variations in medical image segmentation. Multiple hybrid networks have been developed to integrate the complementary benefits of convolutional neural networks (CNNs) and Transformers. However, existing methods may suffer either from the huge computational cost of complicated networks or from the unsatisfactory performance of lighter ones. How to fully exploit the advantages of both convolution and self-attention, and design networks that are both effective and efficient, remains an open problem. In this work, we propose a robust lightweight multi-stage hybrid architecture, named Multi-stage Aggregation Transformer version 2 (MA-TransformerV2), to extract multi-scale features with progressive aggregation for accurate segmentation of highly variable medical images at low computational cost. Specifically, lightweight Transformer blocks and lightweight CNN blocks are introduced in parallel into the dual-branch encoder module at each stage, and a vector quantization block is incorporated at the bottleneck to discretize the features and discard redundancy. This design not only enhances the representation capability and computational efficiency of the model, but also makes the model interpretable. Extensive experimental results on public datasets show that our method outperforms state-of-the-art methods, including CNN-based, Transformer-based, advanced hybrid CNN-Transformer-based, and several lightweight models, in terms of both segmentation accuracy and model capacity. Code will be made publicly available at https://github.com/zjmiaprojects/MATransformerV2.

Anatomy-inspired model for critical landmark localization in 3D spinal ultrasound volume data
Yi Huang, Jing Jiao, Jinhua Yu, Yongping Zheng, Yuanyuan Wang
Medical Image Analysis, vol. 103, Article 103610. DOI: 10.1016/j.media.2025.103610. Published online 2025-04-15.

Abstract: Three-dimensional (3D) spinal ultrasound imaging has shown promising potential for measuring spinal deformity in recent studies, and its radiation-free nature makes it better suited than X-ray imaging for large-scale early screening and longitudinal follow-up of adolescent idiopathic scoliosis (AIS). Moreover, some deformities that are difficult to observe in X-ray images, such as vertebral rotation, can be reflected by critical landmarks in 3D ultrasound data. In this paper, we propose a localization network (LLNet) to extract the lamina in 3D ultrasound data, an anatomy that clinical studies have identified as meaningful for measuring vertebral rotation. First, the LLNet establishes a parallel anatomical prior embedding branch that implicitly explores the anatomical correlation between the lamina and another anatomy with more stable observability (the spinous process) during the training phase, and then uses this correlation to highlight the potential region of the lamina during inference. Second, since the lamina is a tiny target, the information loss caused by successive convolution and pooling operations has a profound negative effect on its detection. We employ an optimization mechanism to mitigate this problem, refining feature maps with information from the original image and reusing them to polish the output. Furthermore, a modified global-local attention module is deployed on the skip connections to mine global dependencies and contextual information, constructing an effective image pattern. Extensive comparisons and ablation studies are performed on actual clinical data. Results indicate that our model outperforms other strong detection models and that the functional modules contribute effectively, yielding a 100.0% detection success rate and an 8.9% improvement in mean intersection over union. Our model is therefore a promising component of a computer-assisted diagnosis system based on 3D spinal ultrasound imaging.

Joint semi-supervised and contrastive learning enables domain generalization and multi-domain segmentation
Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun Goksel
Medical Image Analysis, vol. 103, Article 103575. DOI: 10.1016/j.media.2025.103575. Published online 2025-04-12.

Abstract: Despite their effectiveness, current deep learning models face challenges with images coming from different domains with varying appearance and content. We introduce SegCLR, a versatile framework designed to segment images across different domains, employing supervised and contrastive learning simultaneously to learn effectively from both labeled and unlabeled data. We demonstrate the superior performance of SegCLR through a comprehensive evaluation on three diverse clinical datasets of 3D retinal optical coherence tomography (OCT) images, for the slice-wise segmentation of fluids, with various network configurations and verification across 10 different network initializations. In an unsupervised domain adaptation context, SegCLR achieves results on par with a supervised upper-bound model trained on the intended target domain. Notably, we find that the segmentation performance of SegCLR is only marginally impacted by the abundance of unlabeled data from the target domain; we therefore also propose an effective domain generalization extension of SegCLR, also known as zero-shot domain adaptation, which eliminates the need for any target domain information. This shows that adding our proposed contrastive loss to standard supervised training for segmentation leads to superior models that are inherently more generalizable to both in- and out-of-domain test data. We additionally propose a pragmatic solution for deploying SegCLR in realistic scenarios with multiple domains containing labeled data. Accordingly, our framework pushes the boundaries of deep learning-based segmentation in multi-domain applications, regardless of data availability: labeled, unlabeled, or nonexistent.

Bi-variational physics-informed operator network for fractional flow reserve curve assessment from coronary angiography
Baihong Xie, Heye Zhang, Anbang Wang, Xiujian Liu, Zhifan Gao
Medical Image Analysis, vol. 103, Article 103564. DOI: 10.1016/j.media.2025.103564. Published online 2025-04-12.

Abstract: The coronary angiography-derived fractional flow reserve (FFR) curve, referred to as the Angio-FFR curve, is crucial for guiding percutaneous coronary intervention (PCI). Invasive FFR is the diagnostic gold standard for determining functional significance and is recommended to complement coronary angiography, and the invasive FFR curve can quantitatively define disease patterns. The Angio-FFR curve overcomes the limitations of invasive FFR measurement and thus emerges as a promising alternative. However, Angio-FFR curve computation suffers from an unsatisfactory trade-off between accuracy and efficiency. In this paper, we propose a bi-variational physics-informed neural operator (BVPINO) for FFR curve assessment from coronary angiography. BVPINO incorporates a variational mechanism to guide basis function learning and residual evaluation. Extensive experiments involving coronary angiographies of 215 vessels from 184 subjects demonstrate that BVPINO achieves an optimal balance between effectiveness and efficiency compared with computational models and other machine/deep learning-based models. The results also show high agreement and correlation between the distal FFR predictions of BVPINO and invasive FFR measurements. In addition, we discuss Angio-FFR curve assessment for a novel gradient-based index. A series of case studies demonstrate the effectiveness and superiority of BVPINO for predicting the FFR curve along the coronary artery centerline.
{"title":"From tissue to sound: A new paradigm for medical sonic interaction design","authors":"Sasan Matinfar , Shervin Dehghani , Mehrdad Salehi , Michael Sommersperger , Navid Navab , Koorosh Faridpooya , Merle Fairhurst , Nassir Navab","doi":"10.1016/j.media.2025.103571","DOIUrl":"10.1016/j.media.2025.103571","url":null,"abstract":"<div><div>Medical imaging maps tissue characteristics into image intensity values, enhancing human perception. However, comprehending this data, especially in high-stakes scenarios such as surgery, is prone to errors. Additionally, current multimodal methods do not fully leverage this valuable data in their design. We introduce “From Tissue to Sound,” a new paradigm for medical sonic interaction design. This paradigm establishes a comprehensive framework for mapping tissue characteristics to auditory displays, providing dynamic and intuitive access to medical images that complement visual data, thereby enhancing multimodal perception. “From Tissue to Sound” provides an advanced and adaptable framework for the interactive sonification of multimodal medical imaging data. This framework employs a physics-based sound model composed of a network of multiple oscillators, whose mechanical properties—such as friction and stiffness—are defined by tissue characteristics extracted from imaging data. This approach enables the representation of anatomical structures and the creation of unique acoustic profiles in response to excitations of the sound model. This method allows users to explore data at a fundamental level, identifying tissue characteristics ranging from rigid to soft, dense to sparse, and structured to scattered. It facilitates intuitive discovery of both general and detailed patterns with minimal preprocessing. Unlike conventional methods that transform low-dimensional data into global sound features through a parametric approach, this method utilizes model-based unsupervised mapping between data and an anatomical sound model, enabling high-dimensional data processing. The versatility of this method is demonstrated through feasibility experiments confirming the generation of perceptually discernible acoustic signals. Furthermore, we present a novel application developed based on this framework for retinal surgery. This new paradigm opens up possibilities for designing multisensory applications for multimodal imaging data. It also facilitates the creation of interactive sonification models with various auditory causality approaches, enhancing both directness and richness.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103571"},"PeriodicalIF":10.7,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143820432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Evaluating medical AI systems in dermatology under uncertain ground truth
David Stutz, Ali Taylan Cemgil, Abhijit Guha Roy, Tatiana Matejovicova, Melih Barsbey, Patricia Strachan, Mike Schaekermann, Jan Freyberg, Rajeev Rikhye, Beverly Freeman, Javier Perez Matos, Umesh Telang, Dale R. Webster, Yuan Liu, Greg S. Corrado, Yossi Matias, Pushmeet Kohli, Yun Liu, Arnaud Doucet, Alan Karthikesalingam
Medical Image Analysis, vol. 103, Article 103556. DOI: 10.1016/j.media.2025.103556. Published online 2025-04-09.

Abstract: For safety, medical AI systems undergo thorough evaluations before deployment, validating their predictions against a ground truth which is assumed to be fixed and certain. However, in medical applications, this ground truth is often curated in the form of differential diagnoses provided by multiple experts. While a single differential diagnosis reflects the uncertainty in one expert assessment, multiple experts introduce another layer of uncertainty through potential disagreement. Both forms of uncertainty are ignored in standard evaluation, which aggregates these differential diagnoses to a single label. In this paper, we show that ignoring uncertainty leads to overly optimistic estimates of model performance, therefore underestimating the risk associated with particular diagnostic decisions. Moreover, point estimates largely ignore dramatic differences in the uncertainty of individual cases. To this end, we propose a statistical aggregation approach, where we infer a distribution on probabilities of the underlying medical condition candidates themselves, based on observed annotations. This formulation naturally accounts for potential disagreements between different experts, as well as uncertainty stemming from individual differential diagnoses, capturing the entire ground truth uncertainty. Practically, our approach boils down to generating multiple samples of medical condition probabilities, then evaluating and averaging performance metrics based on these sampled probabilities, instead of relying on a single point estimate. This allows us to provide uncertainty-adjusted estimates of common metrics of interest such as top-k accuracy and average overlap. In the skin condition classification problem of Liu et al. (2020), our methodology reveals significant ground truth uncertainty for most data points and demonstrates that standard evaluation techniques can overestimate performance by several percentage points. We conclude that, while assuming a crisp ground truth may be acceptable for many AI applications, a more nuanced evaluation protocol acknowledging the inherent complexity and variability of differential diagnoses should be utilized in medical diagnosis.

Integrated brain connectivity analysis with fMRI, DTI, and sMRI powered by interpretable graph neural networks
Gang Qu, Ziyu Zhou, Vince D. Calhoun, Aiying Zhang, Yu-Ping Wang
Medical Image Analysis, vol. 103, Article 103570. DOI: 10.1016/j.media.2025.103570. Published online 2025-04-09.

Abstract: Multimodal neuroimaging data modeling has become a widely used approach but confronts considerable challenges due to data heterogeneity, which encompasses variability in data types, scales, and formats across modalities. This variability necessitates advanced computational methods to integrate and interpret diverse datasets within a cohesive analytical framework. In our research, we combine functional magnetic resonance imaging (fMRI), diffusion tensor imaging (DTI), and structural MRI (sMRI) for joint analysis. This integration capitalizes on the unique strengths of each modality and their inherent interconnections, aiming for a comprehensive understanding of the brain's connectivity and anatomical characteristics. Utilizing the Glasser atlas for parcellation, we integrate imaging-derived features from multiple modalities (functional connectivity from fMRI, structural connectivity from DTI, and anatomical features from sMRI) within consistent regions. Our approach incorporates a masking strategy to differentially weight neural connections, thereby facilitating an amalgamation of multimodal imaging data. This technique enhances interpretability at the connectivity level, transcending traditional analyses centered on singular regional attributes. The model is applied to the Human Connectome Project's Development study to elucidate the associations between multimodal imaging and cognitive functions throughout youth. The analysis demonstrates improved prediction accuracy and uncovers crucial anatomical features and neural connections, deepening our understanding of brain structure and function. This study advances multimodal neuroimaging analytics by offering a novel method for the integrative analysis of diverse imaging modalities, and improves the understanding of the intricate relationships between the brain's structural and functional networks and cognitive development.

A topology-preserving three-stage framework for fully-connected coronary artery extraction
Yuehui Qiu, Dandan Shan, Yining Wang, Pei Dong, Dijia Wu, Xinnian Yang, Qingqi Hong, Dinggang Shen
Medical Image Analysis, vol. 103, Article 103578. DOI: 10.1016/j.media.2025.103578. Published online 2025-04-09.

Abstract: Coronary artery extraction is a crucial prerequisite for computer-aided diagnosis of coronary artery disease. Accurately extracting the complete coronary tree remains challenging due to several factors, including the presence of thin distal vessels, tortuous topological structures, and insufficient contrast. These issues often result in over-segmentation and under-segmentation in current segmentation methods. To address these challenges, we propose a topology-preserving three-stage framework for fully-connected coronary artery extraction, comprising vessel segmentation, centerline reconnection, and missing vessel reconstruction. First, we introduce a new centerline-enhanced loss in the segmentation stage. Second, for broken vessel segments, we propose a regularized walk algorithm that reconnects the centerlines by integrating distance, probabilities predicted by a centerline classifier, and directional cosine similarity. Third, we apply implicit neural representation and implicit modeling to reconstruct the geometric model of the missing vessels. Experimental results show that our proposed framework outperforms existing methods, achieving Dice scores of 88.53% and 85.07%, with Hausdorff Distances (HD) of 1.07 mm and 1.63 mm, on the ASOCA and PDSCA datasets, respectively. Code will be available at https://github.com/YH-Qiu/CorSegRec.