Medical Image Analysis: Latest Articles

Rethinking boundary detection in deep learning-based medical image segmentation
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-06 | DOI: 10.1016/j.media.2025.103615
Yi Lin , Dong Zhang , Xiao Fang , Yufan Chen , Kwang-Ting Cheng , Hao Chen
{"title":"Rethinking boundary detection in deep learning-based medical image segmentation","authors":"Yi Lin ,&nbsp;Dong Zhang ,&nbsp;Xiao Fang ,&nbsp;Yufan Chen ,&nbsp;Kwang-Ting Cheng ,&nbsp;Hao Chen","doi":"10.1016/j.media.2025.103615","DOIUrl":"10.1016/j.media.2025.103615","url":null,"abstract":"<div><div>Medical image segmentation is a pivotal task within the realms of medical image analysis and computer vision. While current methods have shown promise in accurately segmenting major regions of interest, the precise segmentation of boundary areas remains challenging. In this study, we propose a novel network architecture named CTO, which combines Convolutional Neural Networks (CNNs), Vision Transformer (ViT) models, and explicit edge detection operators to tackle this challenge. CTO surpasses existing methods in terms of segmentation accuracy and strikes a better balance between accuracy and efficiency, without the need for additional data inputs or label injections. Specifically, CTO adheres to the canonical encoder–decoder network paradigm, with a dual-stream encoder network comprising a mainstream CNN stream for capturing local features and an auxiliary StitchViT stream for integrating long-range dependencies. Furthermore, to enhance the model’s ability to learn boundary areas, we introduce a boundary-guided decoder network that employs binary boundary masks generated by dedicated edge detection operators to provide explicit guidance during the decoding process. We validate the performance of CTO through extensive experiments conducted on seven challenging medical image segmentation datasets, namely ISIC 2016, PH2, ISIC 2018, CoNIC, LiTS17, BraTS, and BTCV. Our experimental results unequivocally demonstrate that CTO achieves state-of-the-art accuracy on these datasets while maintaining competitive model complexity. The codes have been released at: <span><span>CTO</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103615"},"PeriodicalIF":10.7,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143916897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
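
The boundary-guidance idea in the CTO abstract above can be illustrated compactly. Below is a minimal sketch, not the authors' released code: it derives a binary boundary mask from a ground-truth segmentation mask with a fixed Sobel operator, the kind of explicit edge-operator output that could supervise a boundary-aware decoder. The threshold value and the choice of Sobel (rather than the paper's specific operators) are assumptions.

```python
import torch
import torch.nn.functional as F

def sobel_boundary_mask(mask: torch.Tensor, threshold: float = 0.1) -> torch.Tensor:
    """mask: (B, 1, H, W) binary segmentation mask with values in {0, 1}."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    kernels = torch.stack([kx, ky]).unsqueeze(1).to(mask.dtype)  # (2, 1, 3, 3)
    grad = F.conv2d(mask, kernels, padding=1)                    # x- and y-gradients
    magnitude = grad.pow(2).sum(dim=1, keepdim=True).sqrt()      # (B, 1, H, W)
    return (magnitude > threshold).to(mask.dtype)

# Example: a filled square produces a thin boundary ring.
m = torch.zeros(1, 1, 64, 64)
m[..., 16:48, 16:48] = 1.0
print(sobel_boundary_mask(m).sum().item())  # count of boundary pixels
```
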
REPAIR: Reciprocal assistance imputation-representation learning for glioma diagnosis with incomplete MRI sequences
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-06 | DOI: 10.1016/j.media.2025.103634
Chuixing Wu , Jincheng Xie , Fangrong Liang , Weixiong Zhong , Ruimeng Yang , Yuankui Wu , Tao Liang , Linjing Wang , Xin Zhen
{"title":"REPAIR: Reciprocal assistance imputation-representation learning for glioma diagnosis with incomplete MRI sequences","authors":"Chuixing Wu ,&nbsp;Jincheng Xie ,&nbsp;Fangrong Liang ,&nbsp;Weixiong Zhong ,&nbsp;Ruimeng Yang ,&nbsp;Yuankui Wu ,&nbsp;Tao Liang ,&nbsp;Linjing Wang ,&nbsp;Xin Zhen","doi":"10.1016/j.media.2025.103634","DOIUrl":"10.1016/j.media.2025.103634","url":null,"abstract":"<div><div>The absence of MRI sequences is a common occurrence in clinical practice, posing a significant challenge for prediction modeling of non-invasive diagnosis of glioma (GM) via fusion of multi-sequence MRI. To address this issue, we propose a novel unified reciprocal assistance imputation-representation learning framework (namely REPAIR) for GM diagnosis modeling with incomplete MRI sequences. REPAIR facilitates a cooperative process between missing value imputation and multi-sequence MRI fusion by leveraging existing samples to inform the imputation of missing values. This, in turn, facilitates the learning of a shared latent representation, which reciprocally guides more accurate imputation of missing values. To tailor the learned representation for downstream tasks, a novel ambiguity-aware intercorrelation regularization is introduced to equip REPAIR by correlating imputation ambiguity and its impacts conveying to the learned representation via a fuzzy paradigm. Additionally, a multimodal structural calibration constraint is devised to correct for the structural shift caused by missing data, ensuring structural consistency between the learned representations and the actual data. The proposed methodology is extensively validated on eight GM datasets with incomplete MRI sequences and six clinical datasets from other diseases with incomplete imaging modalities. Comprehensive comparisons with state-of-the-art methods have demonstrated the competitiveness of our approach for GM diagnosis with incomplete MRI sequences, as well as its potential for generalization to various diseases with missing imaging modalities.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103634"},"PeriodicalIF":10.7,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144069891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
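
As a rough illustration of the reciprocal imputation-representation loop described in the REPAIR abstract, here is a minimal sketch under strong simplifying assumptions (linear per-modality encoders and imputers over pre-extracted feature vectors; none of the paper's regularization or calibration terms): missing modalities are imputed from a shared latent that is itself computed from the observed modalities, and the completed set is fed back in during training.

```python
import torch
import torch.nn as nn

n_modalities, feat_dim, latent_dim = 4, 32, 16
encoders = nn.ModuleList([nn.Linear(feat_dim, latent_dim) for _ in range(n_modalities)])
imputers = nn.ModuleList([nn.Linear(latent_dim, feat_dim) for _ in range(n_modalities)])

def reciprocal_step(x: torch.Tensor, observed: torch.Tensor) -> torch.Tensor:
    """x: (B, M, D) per-modality features; observed: (B, M) availability mask."""
    # (1) Shared latent from the observed modalities only (masked mean of encodings).
    z_all = torch.stack([enc(x[:, m]) for m, enc in enumerate(encoders)], dim=1)
    w = observed.float().unsqueeze(-1)                        # (B, M, 1)
    z = (z_all * w).sum(dim=1) / w.sum(dim=1).clamp_min(1.0)  # (B, latent_dim)
    # (2) Impute the missing modalities from the shared latent.
    x_hat = torch.stack([imp(z) for imp in imputers], dim=1)  # (B, M, D)
    return torch.where(observed.unsqueeze(-1), x, x_hat)

x = torch.randn(8, n_modalities, feat_dim)
observed = torch.rand(8, n_modalities) > 0.3   # some modalities missing
x_complete = reciprocal_step(x, observed)      # iterate this during training
```
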
Monocular pose estimation of articulated open surgery tools - in the wild
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-03 | DOI: 10.1016/j.media.2025.103618
Robert Spektor , Tom Friedman , Itay Or , Gil Bolotin , Shlomi Laufer
{"title":"Monocular pose estimation of articulated open surgery tools - in the wild","authors":"Robert Spektor ,&nbsp;Tom Friedman ,&nbsp;Itay Or ,&nbsp;Gil Bolotin ,&nbsp;Shlomi Laufer","doi":"10.1016/j.media.2025.103618","DOIUrl":"10.1016/j.media.2025.103618","url":null,"abstract":"<div><div>This work presents a framework for monocular 6D pose estimation of surgical instruments in open surgery, addressing challenges such as object articulations, specularity, occlusions, and synthetic-to-real domain adaptation. The proposed approach consists of three main components: <span><math><mrow><mo>(</mo><mn>1</mn><mo>)</mo></mrow></math></span> synthetic data generation pipeline that incorporates 3D scanning of surgical tools with articulation rigging and physically-based rendering; <span><math><mrow><mo>(</mo><mn>2</mn><mo>)</mo></mrow></math></span> a tailored pose estimation framework combining tool detection with pose and articulation estimation; and <span><math><mrow><mo>(</mo><mn>3</mn><mo>)</mo></mrow></math></span> a training strategy on synthetic and real unannotated video data, employing domain adaptation with automatically generated pseudo-labels. Evaluations conducted on real data of open surgery demonstrate the good performance and real-world applicability of the proposed framework, highlighting its potential for integration into medical augmented reality and robotic systems. The approach eliminates the need for extensive manual annotation of real surgical data.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103618"},"PeriodicalIF":10.7,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143971799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
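
The third component, domain adaptation with automatically generated pseudo-labels, follows a generic pattern that can be sketched briefly. The snippet below is an assumption-level illustration (a plain classifier with a confidence threshold), not the paper's pose-specific pipeline: predictions on unlabelled real frames are kept as training targets only when the model is confident.

```python
import torch

@torch.no_grad()
def make_pseudo_labels(model, frames: torch.Tensor, conf_thresh: float = 0.9):
    """frames: (N, C, H, W) unlabelled real images -> (kept indices, labels)."""
    probs = torch.softmax(model(frames), dim=1)  # (N, K) class probabilities
    conf, labels = probs.max(dim=1)
    keep = conf >= conf_thresh                   # retain only confident frames
    return keep.nonzero(as_tuple=True)[0], labels[keep]

# Toy usage with a stand-in model: a linear classifier over flattened frames.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 8, 5))
frames = torch.randn(16, 3, 8, 8)
idx, pseudo = make_pseudo_labels(model, frames, conf_thresh=0.3)
print(idx.shape, pseudo.shape)
```
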
XCAT 3.0: A comprehensive library of personalized digital twins derived from CT scans
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-03 | DOI: 10.1016/j.media.2025.103636
Lavsen Dahal , Mobina Ghojoghnejad , Liesbeth Vancoillie , Dhrubajyoti Ghosh , Yubraj Bhandari , David Kim , Fong Chi Ho , Fakrul Islam Tushar , Sheng Luo , Kyle J. Lafata , Ehsan Abadi , Ehsan Samei , Joseph Y. Lo , W. Paul Segars
{"title":"XCAT 3.0: A comprehensive library of personalized digital twins derived from CT scans","authors":"Lavsen Dahal ,&nbsp;Mobina Ghojoghnejad ,&nbsp;Liesbeth Vancoillie ,&nbsp;Dhrubajyoti Ghosh ,&nbsp;Yubraj Bhandari ,&nbsp;David Kim ,&nbsp;Fong Chi Ho ,&nbsp;Fakrul Islam Tushar ,&nbsp;Sheng Luo ,&nbsp;Kyle J. Lafata ,&nbsp;Ehsan Abadi ,&nbsp;Ehsan Samei ,&nbsp;Joseph Y. Lo ,&nbsp;W. Paul Segars","doi":"10.1016/j.media.2025.103636","DOIUrl":"10.1016/j.media.2025.103636","url":null,"abstract":"<div><div>Virtual Imaging Trials (VIT) offer a cost-effective and scalable approach for evaluating medical imaging technologies. Computational phantoms, which mimic real patient anatomy and physiology, play a central role in VITs. However, the current libraries of computational phantoms face limitations, particularly in terms of sample size and heterogeneity. Insufficient representation of the population hampers accurate assessment of imaging technologies across different patient groups. Traditionally, the more realistic computational phantoms were created by manual segmentation, which is a laborious and time-consuming task, impeding the expansion of phantom libraries. This study presents a framework for creating realistic computational phantoms using a suite of automatic segmentation models and performing three forms of automated quality control on the segmented organ masks. The result is the release of over 2500 new XCAT 3 generation of computational phantoms. This new formation embodies 140 structures and represents a comprehensive approach to detailed anatomical modeling. The developed computational phantoms are formatted in both voxelized and surface mesh formats. The framework is combined with an in-house CT scanner simulator to produce realistic CT images. The framework has the potential to advance virtual imaging trials, facilitating comprehensive and reliable evaluations of medical imaging technologies. Phantoms may be requested at <span><span>https://cvit.duke.edu/resources/</span><svg><path></path></svg></span>. Code, model weights, and sample CT images are available at <span><span>https://xcat-3.github.io/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103636"},"PeriodicalIF":10.7,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143927479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
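
The abstract does not spell out the three automated quality-control checks, but one plausible form of such a check on a segmented organ mask can be sketched: flag masks whose volume falls outside a population reference range. The function, thresholds, and voxel size below are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def volume_qc(mask: np.ndarray, voxel_volume_ml: float,
              low_ml: float, high_ml: float) -> bool:
    """mask: binary 3D organ mask. Returns True if the volume is plausible."""
    volume_ml = mask.sum() * voxel_volume_ml
    return low_ml <= volume_ml <= high_ml

# Toy example: an 80^3-voxel "liver" at 0.004 ml/voxel is ~2048 ml.
liver = np.zeros((128, 128, 128), dtype=np.uint8)
liver[20:100, 20:100, 20:100] = 1
print(volume_qc(liver, voxel_volume_ml=0.004, low_ml=900.0, high_ml=2500.0))
```
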
Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-02 | DOI: 10.1016/j.media.2025.103596
Julio Silva-Rodríguez , Jose Dolz , Ismail Ben Ayed
{"title":"Towards Foundation Models and Few-Shot Parameter-Efficient Fine-Tuning for Volumetric Organ Segmentation","authors":"Julio Silva-Rodríguez ,&nbsp;Jose Dolz ,&nbsp;Ismail Ben Ayed","doi":"10.1016/j.media.2025.103596","DOIUrl":"10.1016/j.media.2025.103596","url":null,"abstract":"<div><div>The recent popularity of foundation models and the pre-train-and-adapt paradigm, where a large-scale model is transferred to downstream tasks, is gaining attention for volumetric medical image segmentation. However, current transfer learning strategies devoted to full fine-tuning for transfer learning may require significant resources and yield sub-optimal results when the labeled data of the target task is scarce. This makes its applicability in real clinical settings challenging since these institutions are usually constrained on data and computational resources to develop proprietary solutions. To address this challenge, we formalize Few-Shot Efficient Fine-Tuning (FSEFT), a novel and realistic scenario for adapting medical image segmentation foundation models. This setting considers the key role of both data- and parameter-efficiency during adaptation. Building on a foundation model pre-trained on open-access CT organ segmentation sources, we propose leveraging Parameter-Efficient Fine-Tuning and black-box Adapters to address such challenges. Furthermore, novel efficient adaptation methodologies are introduced in this work, which include Spatial black-box Adapters that are more appropriate for dense prediction tasks and constrained transductive inference, leveraging task-specific prior knowledge. Our comprehensive transfer learning experiments confirm the suitability of foundation models in medical image segmentation and unveil the limitations of popular fine-tuning strategies in few-shot scenarios. The project code is available: <span><span>https://github.com/jusiro/fewshot-finetuning</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103596"},"PeriodicalIF":10.7,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143923639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
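
The adapter idea central to FSEFT can be illustrated with a generic sketch (this is the standard residual-adapter pattern, not the code released at the link above): a small trainable convolutional bottleneck is added on top of frozen foundation-model features, so few-shot adaptation touches only a handful of parameters. Channel and bottleneck sizes are arbitrary.

```python
import torch
import torch.nn as nn

class SpatialAdapter(nn.Module):
    """Lightweight 3D bottleneck adapter for volumetric (dense) features."""
    def __init__(self, channels: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Conv3d(channels, bottleneck, kernel_size=1)
        self.spatial = nn.Conv3d(bottleneck, bottleneck, kernel_size=3, padding=1)
        self.up = nn.Conv3d(bottleneck, channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # Residual update keeps the frozen features as the default behaviour.
        return feats + self.up(self.act(self.spatial(self.act(self.down(feats)))))

frozen_feats = torch.randn(1, 64, 8, 32, 32)  # features from a frozen encoder
adapter = SpatialAdapter(64)                  # only these weights are trained
out = adapter(frozen_feats)
```
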
Automatic quality control of brain 3D FLAIR MRIs for a clinical data warehouse
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-02 | DOI: 10.1016/j.media.2025.103617
Sophie Loizillon , Simona Bottani , Aurélien Maire , Sebastian Ströer , Lydia Chougar , Didier Dormont , Olivier Colliot , Ninon Burgos , APPRIMAGE Study Group
{"title":"Automatic quality control of brain 3D FLAIR MRIs for a clinical data warehouse","authors":"Sophie Loizillon ,&nbsp;Simona Bottani ,&nbsp;Aurélien Maire ,&nbsp;Sebastian Ströer ,&nbsp;Lydia Chougar ,&nbsp;Didier Dormont ,&nbsp;Olivier Colliot ,&nbsp;Ninon Burgos ,&nbsp;APPRIMAGE Study Group","doi":"10.1016/j.media.2025.103617","DOIUrl":"10.1016/j.media.2025.103617","url":null,"abstract":"<div><div>Clinical data warehouses, which have arisen over the last decade, bring together the medical data of millions of patients and offer the potential to train and validate machine learning models in real-world scenarios. The quality of MRIs collected in clinical data warehouses differs significantly from that generally observed in research datasets, reflecting the variability inherent to clinical practice. Consequently, the use of clinical data requires the implementation of robust quality control tools.</div><div>By using a substantial number of pre-existing manually labelled T1-weighted MR images (5,500) alongside a smaller set of newly labelled FLAIR images (926), we present a novel semi-supervised adversarial domain adaptation architecture designed to exploit shared representations between MRI sequences thanks to a shared feature extractor, while taking into account the specificities of the FLAIR thanks to a specific classification head for each sequence. This architecture thus consists of a common invariant feature extractor, a domain classifier and two classification heads specific to the source and target, all designed to effectively deal with potential class distribution shifts between the source and target data classes. The primary objectives of this paper were: (1) to identify images which are not proper 3D FLAIR brain MRIs; (2) to rate the overall image quality.</div><div>For the first objective, our approach demonstrated excellent results, with a balanced accuracy of 89%, comparable to that of human raters. For the second objective, our approach achieved good performance, although lower than that of human raters. Nevertheless, the automatic approach accurately identified bad quality images (balanced accuracy &gt;79%). In conclusion, our proposed approach overcomes the initial barrier of heterogeneous image quality in clinical data warehouses, thereby facilitating the development of new research using clinical routine 3D FLAIR brain images.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103617"},"PeriodicalIF":10.7,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143923640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
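
The adversarial part of the architecture (a shared feature extractor, a domain classifier, and one task head per sequence) matches the classic gradient-reversal pattern, sketched below under assumed toy dimensions; this is the generic DANN recipe, not the authors' exact network.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lamb * grad_output, None

features = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 128), nn.ReLU())
head_t1 = nn.Linear(128, 2)     # task head for the source sequence (T1)
head_flair = nn.Linear(128, 2)  # task head for the target sequence (FLAIR)
domain_clf = nn.Linear(128, 2)  # trained adversarially via gradient reversal

x = torch.randn(4, 1, 64, 64)                          # toy single-slice input
z = features(x)                                        # shared representation
quality_logits = head_flair(z)                         # sequence-specific prediction
domain_logits = domain_clf(GradReverse.apply(z, 1.0))  # pushes z to be domain-invariant
```
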
TIER-LOC: Visual Query-based Video Clip Localization in fetal ultrasound videos with a multi-tier transformer
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-02 | DOI: 10.1016/j.media.2025.103611
Divyanshu Mishra , Pramit Saha , He Zhao , Netzahualcoyotl Hernandez-Cruz , Olga Patey , Aris T. Papageorghiou , J. Alison Noble
{"title":"TIER-LOC: Visual Query-based Video Clip Localization in fetal ultrasound videos with a multi-tier transformer","authors":"Divyanshu Mishra ,&nbsp;Pramit Saha ,&nbsp;He Zhao ,&nbsp;Netzahualcoyotl Hernandez-Cruz ,&nbsp;Olga Patey ,&nbsp;Aris T. Papageorghiou ,&nbsp;J. Alison Noble","doi":"10.1016/j.media.2025.103611","DOIUrl":"10.1016/j.media.2025.103611","url":null,"abstract":"<div><div>In this paper, we introduce the Visual Query-based task of Video Clip Localization (VQ-VCL) for medical video understanding. Specifically, we aim to retrieve a video clip containing frames similar to a given exemplar frame from a given input video. To solve the task, we propose a novel visual query-based video clip localization model called TIER-LOC. TIER-LOC is designed to improve video clip retrieval, especially in fine-grained videos by extracting features from different levels, <em>i.e.</em>, coarse to fine-grained, referred to as TIERS. The aim is to utilize multi-Tier features for detecting subtle differences, and adapting to scale or resolution variations, leading to improved video-clip retrieval. TIER-LOC has three main components: (1) a Multi-Tier Spatio-Temporal Transformer to fuse spatio-temporal features extracted from multiple Tiers of video frames with features from multiple Tiers of the visual query enabling better video understanding. (2) a Multi-Tier, Dual Anchor Contrastive Loss to deal with real-world annotation noise which can be notable at event boundaries and in videos featuring highly similar objects. (3) a Temporal Uncertainty-Aware Localization Loss designed to reduce the model sensitivity to imprecise event boundary. This is achieved by relaxing hard boundary constraints thus allowing the model to learn underlying class patterns and not be influenced by individual noisy samples. To demonstrate the efficacy of TIER-LOC, we evaluate it on two ultrasound video datasets and an open-source egocentric video dataset. First, we develop a sonographer workflow assistive task model to detect standard-frame clips in fetal ultrasound heart sweeps. Second, we assess our model’s performance in retrieving standard-frame clips for detecting fetal anomalies in routine ultrasound scans, using the large-scale PULSE dataset. Lastly, we test our model’s performance on an open-source computer vision video dataset by creating a VQ-VCL fine-grained video dataset based on the Ego4D dataset. Our model outperforms the best-performing state-of-the-art model by 7%, 4%, and 4% on the three video datasets, respectively.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103611"},"PeriodicalIF":10.7,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143916906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
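
Stripped of the multi-tier machinery, the underlying VQ-VCL task reduces to scoring frames against the query and returning a contiguous high-scoring run. The sketch below assumes precomputed frame and query embeddings and a fixed similarity threshold; it is a baseline illustration of the task, not TIER-LOC itself.

```python
import torch
import torch.nn.functional as F

def localize_clip(video_feats: torch.Tensor, query_feat: torch.Tensor,
                  thresh: float = 0.7):
    """video_feats: (T, D) per-frame embeddings; query_feat: (D,).
    Returns the (start, end) of the longest run of similar frames, or None."""
    sims = F.cosine_similarity(video_feats, query_feat.unsqueeze(0), dim=1)  # (T,)
    best, start = None, None
    for t, hit in enumerate((sims >= thresh).tolist() + [False]):  # sentinel flushes last run
        if hit and start is None:
            start = t
        elif not hit and start is not None:
            if best is None or t - start > best[1] - best[0]:
                best = (start, t)  # half-open interval [start, t)
            start = None
    return best

# Toy video: frames 40-59 are noisy copies of one frame; query with frame 45.
video = F.normalize(torch.randn(100, 128), dim=1)
video[40:60] = F.normalize(video[40] + 0.05 * torch.randn(20, 128), dim=1)
print(localize_clip(video, video[45]))  # ~ (40, 60)
```
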
Deep implicit optimization enables robust learnable features for deformable image registration
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-02 | DOI: 10.1016/j.media.2025.103577
Rohit Jena , Pratik Chaudhari , James C. Gee
{"title":"Deep implicit optimization enables robust learnable features for deformable image registration","authors":"Rohit Jena ,&nbsp;Pratik Chaudhari ,&nbsp;James C. Gee","doi":"10.1016/j.media.2025.103577","DOIUrl":"10.1016/j.media.2025.103577","url":null,"abstract":"<div><div>Deep Learning in Image Registration (DLIR) methods have been tremendously successful in image registration due to their speed and ability to incorporate weak label supervision at training time. However, existing DLIR methods forego many of the benefits and invariances of optimization methods. The lack of a task-specific inductive bias in DLIR methods leads to suboptimal performance, especially in the presence of domain shift. Our method aims to bridge this gap between statistical learning and optimization by explicitly incorporating optimization as a layer in a deep network. A deep network is trained to predict multi-scale dense feature images that are registered using a black box iterative optimization solver. This optimal warp is then used to minimize image and label alignment errors. By <em>implicitly</em> differentiating end-to-end through an iterative optimization solver, we <em>explicitly</em> exploit invariances of the correspondence matching problem induced by the optimization, while learning registration and label-aware features, and guaranteeing the warp functions to be a local minima of the registration objective in the feature space. Our framework shows excellent performance on in-domain datasets, and is agnostic to domain shift such as anisotropy and varying intensity profiles. For the first time, our method allows switching between arbitrary transformation representations (free-form to diffeomorphic) at test time with zero retraining. End-to-end feature learning also facilitates interpretability of features and arbitrary test-time regularization, which is not possible with existing DLIR methods.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103577"},"PeriodicalIF":10.7,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143906845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
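
The core idea, an optimization solver as a differentiable layer on top of learned features, can be sketched on a toy problem. The snippet below aligns two feature images with a global 2D translation and, for brevity, unrolls the inner solver rather than differentiating through it implicitly as the paper does; the feature extractor, image sizes, and outer loss are stand-ins.

```python
import torch
import torch.nn.functional as F

def translate(img: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Differentiably translate img (1, C, H, W) by t (2,) in [-1, 1] grid coords."""
    row0 = torch.stack([torch.ones(()), torch.zeros(()), t[0]])
    row1 = torch.stack([torch.zeros(()), torch.ones(()), t[1]])
    theta = torch.stack([row0, row1]).unsqueeze(0)  # (1, 2, 3) affine matrix
    grid = F.affine_grid(theta, img.shape, align_corners=False)
    return F.grid_sample(img, grid, align_corners=False)

def inner_align(feat_fixed, feat_moving, steps: int = 30, lr: float = 1.0):
    """Inner solver: gradient descent on a translation in feature space."""
    t = torch.zeros(2, requires_grad=True)
    for _ in range(steps):
        loss = (translate(feat_moving, t) - feat_fixed).pow(2).mean()
        (g,) = torch.autograd.grad(loss, t, create_graph=True)
        t = t - lr * g  # keep the graph so outer gradients can flow
    return t

net = torch.nn.Conv2d(1, 4, 3, padding=1)  # toy learnable feature extractor
fixed, moving = torch.randn(1, 1, 32, 32), torch.randn(1, 1, 32, 32)
t_star = inner_align(net(fixed), net(moving))
outer_loss = t_star.pow(2).sum()  # stand-in for a label-alignment loss
outer_loss.backward()             # gradients reach net's weights through the solver
print(net.weight.grad is not None)
```
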
UN-SAM: Domain-adaptive self-prompt segmentation for universal nuclei images
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-01 | DOI: 10.1016/j.media.2025.103607
Zhen Chen , Qing Xu , Xinyu Liu , Yixuan Yuan
{"title":"UN-SAM: Domain-adaptive self-prompt segmentation for universal nuclei images","authors":"Zhen Chen ,&nbsp;Qing Xu ,&nbsp;Xinyu Liu ,&nbsp;Yixuan Yuan","doi":"10.1016/j.media.2025.103607","DOIUrl":"10.1016/j.media.2025.103607","url":null,"abstract":"<div><div>In digital pathology, precise nuclei segmentation is pivotal yet challenged by the diversity of tissue types, staining protocols, and imaging conditions. Recently, the segment anything model (SAM) revealed overwhelming performance in natural scenarios and impressive adaptation to medical imaging. Despite these advantages, the reliance on labor-intensive manual annotation as segmentation prompts severely hinders their clinical applicability, especially for nuclei image analysis containing massive cells where dense manual prompts are impractical. To overcome the limitations of current SAM methods while retaining the advantages, we propose the domain-adaptive self-prompt SAM framework for Universal Nuclei segmentation (UN-SAM), by providing a fully automated solution with superior performance across different domains. Specifically, to eliminate the labor-intensive requirement of per-nuclei annotations for prompt, we devise a multi-scale Self-Prompt Generation (SPGen) module to revolutionize clinical workflow by automatically generating high-quality mask hints to guide the segmentation tasks. Moreover, to unleash the capability of SAM across a variety of nuclei images, we devise a Domain-adaptive Tuning Encoder (DT-Encoder) to seamlessly harmonize visual features with domain-common and domain-specific knowledge, and further devise a Domain Query-enhanced Decoder (DQ-Decoder) by leveraging learnable domain queries for segmentation decoding in different nuclei domains. Extensive experiments prove that our UN-SAM surpasses state-of-the-arts in nuclei instance and semantic segmentation, especially the generalization capability on unseen nuclei domains. The source code is available at <span><span>https://github.com/CUHK-AIM-Group/UN-SAM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103607"},"PeriodicalIF":10.7,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143927480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
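
One simple way to picture self-prompt generation (an illustrative stand-in, not the SPGen module): convert a coarse foreground probability map into one point prompt per connected component, which a SAM-style decoder could consume in place of manual clicks. The threshold and the centroid heuristic are assumptions.

```python
import numpy as np
from scipy import ndimage

def self_prompts(prob_map: np.ndarray, thresh: float = 0.5) -> np.ndarray:
    """prob_map: (H, W) foreground probabilities. Returns (N, 2) (y, x) points,
    one per connected component above the threshold."""
    labels, n = ndimage.label(prob_map > thresh)
    centroids = ndimage.center_of_mass(prob_map, labels, range(1, n + 1))
    return np.array(centroids)

# Toy probability map with two nucleus-like blobs -> two point prompts.
prob = np.zeros((64, 64))
prob[5:15, 5:15] = 0.9
prob[40:50, 30:45] = 0.8
print(self_prompts(prob))
```
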
A novel spatial-temporal image fusion method for augmented reality-based endoscopic surgery
IF 10.7 | CAS Q1 (Medicine)
Medical Image Analysis | Pub Date: 2025-05-01 | DOI: 10.1016/j.media.2025.103609
Haochen Shi , Jiangchang Xu , Haitao Li , Shuanglin Jiang , Chaoyu Lei , Huifang Zhou , Yinwei Li , Xiaojun Chen
{"title":"A novel spatial-temporal image fusion method for augmented reality-based endoscopic surgery","authors":"Haochen Shi ,&nbsp;Jiangchang Xu ,&nbsp;Haitao Li ,&nbsp;Shuanglin Jiang ,&nbsp;Chaoyu Lei ,&nbsp;Huifang Zhou ,&nbsp;Yinwei Li ,&nbsp;Xiaojun Chen","doi":"10.1016/j.media.2025.103609","DOIUrl":"10.1016/j.media.2025.103609","url":null,"abstract":"<div><div>Augmented reality (AR) has significant potential to enhance the identification of critical locations during endoscopic surgeries, where accurate endoscope calibration is essential for ensuring the quality of augmented images. In optical-based surgical navigation systems, asynchrony between the optical tracker and the endoscope can cause the augmented scene to diverge from reality during rapid movements, potentially misleading the surgeon—a challenge that remains unresolved. In this paper, we propose a novel spatial–temporal endoscope calibration method that simultaneously determines the spatial transformation from the image to the optical marker and the temporal latency between the tracking and image acquisition systems. To estimate temporal latency, we utilize a Monte Carlo method to estimate the intrinsic parameters of the endoscope’s imaging system, leveraging a dataset of thousands of calibration samples. This dataset is larger than those typically employed in conventional camera calibration routines, rendering traditional algorithms computationally infeasible within a reasonable timeframe. By introducing latency as an independent variable into the principal equation of hand-eye calibration, we developed a weighted algorithm to iteratively solve the equation. This approach eliminates the need for a fixture to stabilize the endoscope during calibration, allowing for quicker calibration through handheld flexible movement. Experimental results demonstrate that our method achieves an average 2D error of <span><math><mrow><mn>7</mn><mo>±</mo><mn>3</mn></mrow></math></span> pixels and a pseudo-3D error of <span><math><mrow><mn>1</mn><mo>.</mo><mn>2</mn><mo>±</mo><mn>0</mn><mo>.</mo><mn>4</mn><mspace></mspace><mi>mm</mi></mrow></math></span> for stable scenes within <span><math><mrow><mn>82</mn><mo>.</mo><mn>4</mn><mo>±</mo><mn>16</mn><mo>.</mo><mn>6</mn></mrow></math></span> seconds—approximately 68% faster in operation time than conventional methods. In dynamic scenes, our method compensates for the virtual-to-reality latency of <span><math><mrow><mn>11</mn><mo>±</mo><mn>2</mn><mspace></mspace><mi>ms</mi></mrow></math></span>, which is shorter than a single frame interval and 5.7 times shorter than the uncompensated conventional method. Finally, we successfully integrated the proposed method into our surgical navigation system and validated its feasibility in clinical trials for transnasal optic canal decompression surgery. 
Our method has the potential to improve the safety and efficacy of endoscopic surgeries, leading to better patient outcomes.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"103 ","pages":"Article 103609"},"PeriodicalIF":10.7,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143911737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
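
The temporal half of the calibration problem can be illustrated independently of the hand-eye equation: if the optical tracker and the endoscope observe the same motion, their speed profiles should match up to a time shift, and cross-correlation recovers that shift. The sketch below is a simplified stand-in for the paper's joint weighted solver; the signal construction and sampling rate are invented for the example.

```python
import numpy as np

def estimate_latency(tracker_speed: np.ndarray, image_speed: np.ndarray,
                     dt: float) -> float:
    """Both signals sampled at interval dt (s); a positive result means the
    image stream lags the tracker stream."""
    a = tracker_speed - tracker_speed.mean()
    b = image_speed - image_speed.mean()
    corr = np.correlate(b, a, mode="full")  # slide image signal over tracker signal
    lag = int(np.argmax(corr)) - (len(a) - 1)
    return lag * dt

t = np.linspace(0, 5, 500)                     # 5 s of motion at 100 Hz
tracker = np.abs(np.sin(2 * np.pi * 0.7 * t))  # synthetic instrument speed profile
image = np.roll(tracker, 2)                    # image stream delayed by 2 samples
print(estimate_latency(tracker, image, dt=0.01))  # ~0.02 s
```
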