{"title":"LoRA-Enhanced RT-DETR: First Low-Rank Adaptation based DETR for real-time full body anatomical structures identification in musculoskeletal ultrasound","authors":"Jyun-Ping Kao , Yu-Ching Chung , Hao-Yu Hung , Chun-Ping Chen , Wen-Shiang Chen","doi":"10.1016/j.compmedimag.2025.102583","DOIUrl":"10.1016/j.compmedimag.2025.102583","url":null,"abstract":"<div><div>Medical imaging models for object identification often rely on extensive pretraining data, which is difficult to obtain due to data scarcity and privacy constraints. In practice, hospitals typically have access only to pretrained model weights without the original training data limiting their ability to tailor models to specific patient populations and imaging devices. We address this challenge with the first Low-Rank Adaptation (LoRA)-enhanced Real-Time Detection Transformer (RT-DETR) model for full body musculoskeletal (MSK) ultrasound (US). By injecting LoRA modules into select encoder and decoder layers of RT-DETR, we achieved a 99.45 % (RT-DETR-L) and 99.68 % (RT-DETR-X) reduction in trainable parameters while preserving the model’s representational power. This extreme reduction enables efficient fine-tuning using only minimal institution-specific data and maintains robust performance even on anatomical structures absent from the fine-tuning set. In extensive 5-fold cross-validation, our LoRA-enhanced model outperformed traditional full-model fine-tuning and maintained or improved detection accuracy across a wide range of MSK structures while demonstrating strong resilience to domain shifts. The proposed LoRA-enhanced RT-DETR significantly lowers the barrier for deploying transformer-based detection in clinics, offering a privacy-conscious, computationally lightweight solution for real-time, full-body MSK US identification.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"124 ","pages":"Article 102583"},"PeriodicalIF":5.4,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144470848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Dual Stream Feature Fusion 3D Network for supraspinatus tendon tear classification
Authors: Sheng Miao, Dezhen Wang, Xiaonan Yang, Zitong Liu, Xiang Shen, Dapeng Hao, Chuanli Zhou, Jiufa Cui
Computerized Medical Imaging and Graphics, vol. 124, Article 102580; published 2025-06-18; DOI: 10.1016/j.compmedimag.2025.102580
Abstract: The classification of medical images is of significant importance for computer-aided diagnosis. Supraspinatus tendon tear is a common clinical condition. Accurately classifying the severity of supraspinatus tendon tears aids the selection of surgical techniques and postoperative rehabilitation. While some studies have classified supraspinatus tendon tears, existing methods lack detailed classification. Inaccurate and insufficiently detailed classification can lead to errors in the selection of surgical techniques, thereby affecting patient treatment and rehabilitation. In addition, the computational complexity of traditional 3D classification models is too high. In this study, we conducted a detailed 6-class classification of supraspinatus tendon tears for the first time. We propose a novel 3D model for classifying supraspinatus tendon tears, the Dual Stream Feature Fusion 3D Network (DSFF-3DNet). To accelerate extraction of the region of interest (ROI), we trained a YOLOv9 model to identify the supraspinatus tendon and save the YOLO labels. DSFF-3DNet comprises three stages: feature extraction, feature enhancement, and classification. We performed data augmentation, training, validation, and internal testing on a dataset of 1014 patients, and tested on two independent external test sets. DSFF-3DNet achieved AUCs of 97.88, 88.06, and 84.47 on the internal test set and the two external test sets, respectively, surpassing the best-performing traditional models by 3.51%, 9.25%, and 9.38% across these test sets. Ablation experiments demonstrated the individual contributions of each module in DSFF-3DNet, and significance tests showed that the performance improvements were statistically significant (p < 0.05).

Title: Radiogenomic insights suggest that multiscale tumor heterogeneity is associated with interpretable radiomic features and outcomes in cancer patients
Authors: Peng Lin, Jin-mei Zheng, Chang-wen Liu, Quan-quan Tang, Jin-shu Pang, Qiong Qin, Zhen-hu Lin, Hong Yang
Computerized Medical Imaging and Graphics, vol. 124, Article 102586; published 2025-06-18; DOI: 10.1016/j.compmedimag.2025.102586
Abstract:
Background: To develop radiogenomic subtypes and determine the relationships between radiomic phenotypes and multiomics molecular characteristics.
Materials and Methods: In this retrospective multicohort analysis, we divided patients into different subgroups based on multiomics features. This unsupervised subtyping was performed by integrating 10 unsupervised machine learning algorithms. We compared the variations in clinicopathological, radiomic, genomic, and transcriptomic features across subgroups. Based on the key radiomic features of the subtypes, overall survival (OS) prediction models were developed and validated using 10 supervised machine learning algorithms. Model performance was evaluated with the C-index and the log-rank test.
Results: This study included 2,281 patients (mean age, 63 years ± 13 [SD]; 660 females, 1,621 males). Patients were divided into four subgroups on the basis of radiogenomic data. Significant differences in OS were observed among the subgroups. Subtypes differed significantly in radiomic phenotypes, gene mutation status, and transcriptomic pathway alterations. Among the 24 radiomic features important for subtyping, 9 were closely associated with OS. Machine learning prognostic models showed moderate OS prediction performance in the training (log-rank P < 0.001) and test (log-rank P < 0.001) cohorts. Tumor molecular heterogeneity was also closely related to the radiomic phenotype.
Conclusions: Biologically interpretable radiomic features provide an effective and novel algorithm for tumor molecular capture and risk stratification.
{"title":"MDEANet: A multi-scale deep enhanced attention net for popliteal fossa segmentation in ultrasound images","authors":"Fangfang Chen , Wei Fang , Qinghua Wu , Miao Zhou , Wenhui Guo , Liangqing Lin , Zhanheng Chen , Zui Zou","doi":"10.1016/j.compmedimag.2025.102570","DOIUrl":"10.1016/j.compmedimag.2025.102570","url":null,"abstract":"<div><div>Popliteal sciatic nerve block is a widely used technique for lower limb anesthesia. However, despite ultrasound guidance, the complex anatomical structures of the popliteal fossa can present challenges, potentially leading to complications. To accurately identify the bifurcation of the sciatic nerve for nerve blockade, we propose MDEANet, a deep learning-based segmentation network designed for the precise localization of nerves, muscles, and arteries in ultrasound images of the popliteal region. MDEANet incorporates Cascaded Multi-scale Atrous Convolutions (CMAC) to enhance multi-scale feature extraction, Enhanced Spatial Attention Mechanism (ESAM) to focus on key anatomical regions, and Cross-level Feature Fusion (CLFF) to improve contextual representation. This integration markedly improves segmentation of nerves, muscles, and arteries. Experimental results demonstrate that MDEANet achieves an average Intersection over Union (IoU) of 88.60% and a Dice coefficient of 93.95% across all target structures, outperforming state-of-the-art models by 1.68% in IoU and 1.66% in Dice coefficient. Specifically, for nerve segmentation, the Dice coefficient reaches 93.31%, underscoring the effectiveness of our approach. MDEANet has the potential to provide decision-support assistance for anesthesiologists, thereby enhancing the accuracy and efficiency of ultrasound-guided nerve blockade procedures.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"124 ","pages":"Article 102570"},"PeriodicalIF":5.4,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144331422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Classification of glioma grade and Ki-67 level prediction in MRI data: A SHAP-driven interpretation
Authors: E.H. Bhuiyan, M.M. Khan, S.A. Hossain, R. Rahman, Q. Luo, M.F. Hossain, K. Wang, M.S.I. Sumon, S. Khalid, M. Karaman, J. Zhang, M.E.H. Chowdhury, W. Zhu, X.J. Zhou
Computerized Medical Imaging and Graphics, vol. 124, Article 102578; published 2025-06-16; DOI: 10.1016/j.compmedimag.2025.102578
Abstract: This study focuses on artificial intelligence-driven classification of glioma grade and Ki-67 level using T2w-FLAIR MRI, exploring the association of Ki-67 biomarkers with deep learning (DL) features through explainable artificial intelligence (XAI) and SHapley Additive exPlanations (SHAP). This IRB-approved study included 101 patients with glioma whose brain MR images were acquired with the T2w-FLAIR sequence. We extracted DL bottleneck features from the glioma MR images using ResNet50. Principal component analysis (PCA) was deployed for dimensionality reduction. XAI was used to identify potential features. XGBoost classified the histologic grades of the glioma and the Ki-67 level. We integrated potential DL features with patient demographics (age and sex) and Ki-67 biomarkers, utilizing SHAP to determine the model's essential features and interactions. Glioma grade classification and Ki-67 level prediction achieved overall accuracies of 0.94 and 0.91, respectively. The model achieved precision scores of 0.92, 0.94, and 0.96 for glioma grades 2, 3, and 4, and 0.88, 0.94, and 0.97 for Ki-67 levels (low: 5% ≤ Ki-67 < 10%; moderate: 10% ≤ Ki-67 ≤ 20%; high: Ki-67 > 20%). Corresponding F1-scores were 0.95, 0.88, and 0.96 for glioma grades and 0.92, 0.93, and 0.87 for Ki-67 levels. SHAP analysis further highlighted a strong association between bottleneck DL features and Ki-67 biomarkers, demonstrating their potential to differentiate glioma grades and Ki-67 levels while offering valuable insights into glioma aggressiveness. This study demonstrates precise classification of glioma grades and prediction of Ki-67 levels, underscoring the potential of AI-driven MRI analysis to enhance clinical decision-making in glioma management.
{"title":"Three-step-guided visual prediction of glioblastoma recurrence from multimodality images","authors":"Chen Zhao , Meidi Chen , Xiaobo Wen , Jianping Song , Yifan Yuan , Qiu Huang","doi":"10.1016/j.compmedimag.2025.102585","DOIUrl":"10.1016/j.compmedimag.2025.102585","url":null,"abstract":"<div><div>Accurately predicting glioblastoma (GBM) recurrence is crucial for guiding the planning of target areas in subsequent radiotherapy and radiosurgery for glioma patients. Current prediction methods can determine the likelihood and type of recurrence but cannot identify the specific region or visually display location of the recurrence. To efficiently and accurately predict the recurrence of GBM, we proposed a three-step-guided prediction method consisting of feature extraction and segmentation (FES), radiomics analysis, and tag constraints to narrow the predicted region of GBM recurrence and standardize the shape of GBM recurrence prediction. Particularly in FES we developed an adaptive fusion module and a modality fusion module to fuse feature maps from different modalities. In the modality fusion module proposed, we designed different convolution modules (Conv-D and Conv-P) specifically for diffusion tensor imaging (DTI) and Positron Emission Computed Tomography (PET) images to extract recurrence-related features. Additionally, model fusion is proposed in the stable diffusion training process to learn and integrate the individual and typical properties of the recurrent tumors from different patients. Contrasted with existing segmentation and generation methods, our three-step-guided prediction method improves the ability to predict distant recurrence of GBM, achieving a 28.93 Fréchet Inception Distance (FID), and a 0.9113 Dice Similarity Coefficient (DSC). Quantitative results demonstrate the effectiveness of the proposed method in predicting the recurrence of GBM with the type and location. To the best of our knowledge, this is the first study combines the stable diffusion and multimodal images fusion with PET and DTI from different institutions to predict both distant and local recurrence of GBM in the form of images.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"124 ","pages":"Article 102585"},"PeriodicalIF":5.4,"publicationDate":"2025-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144298054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Title: Robust Bayesian brain extraction by integrating structural subspace-based spatial prior into deep neural networks
Authors: Yunpeng Zhang, Huixiang Zhuang, Yue Guan, Yao Li
Computerized Medical Imaging and Graphics, vol. 124, Article 102572; published 2025-06-09; DOI: 10.1016/j.compmedimag.2025.102572
Abstract: Accurate and robust brain extraction, or skull stripping, is essential for studying brain development, aging, and neurological disorders. However, brain images exhibit substantial data heterogeneity due to differences in contrast and geometric characteristics across diseases, medical institutions, and age groups. A fundamental challenge lies in effectively capturing the high-dimensional spatial-intensity distributions of the brain. This paper introduces a novel Bayesian brain extraction method that integrates a structural subspace-based prior, represented as a mixture of eigenmodes, with deep learning-based classification to achieve accurate and robust brain extraction. Specifically, we used a structural subspace model to effectively capture the global spatial-structural distributions of the normal brain. Leveraging this global spatial prior, a multi-resolution, position-dependent neural network is employed to model the local spatial-intensity distributions. A patch-based fusion network then combines these global and local spatial-intensity distributions for final brain extraction. The proposed method has been rigorously evaluated on multi-institutional datasets, including healthy scans across the lifespan, images with lesions, and images affected by noise and artifacts, demonstrating superior segmentation accuracy and robustness over state-of-the-art methods. Our proposed method holds promise for enhancing brain extraction in practical clinical applications.

Title: Physics-informed neural networks for denoising high b-value diffusion-weighted images
Authors: Qiaoling Lin, Fan Yang, Yang Yan, Haoyu Zhang, Qing Xie, Jiaju Zheng, Wenze Yang, Ling Qian, Shaoxing Liu, Weigen Yao, Xiaobo Qu
Computerized Medical Imaging and Graphics, vol. 124, Article 102579; published 2025-06-07; DOI: 10.1016/j.compmedimag.2025.102579
Abstract: Diffusion-weighted imaging (DWI) is widely applied in tumor diagnosis by measuring the diffusion of water molecules. To increase sensitivity to tumor identification, faithful high b-value DWI images are desired, obtained by applying a stronger diffusion gradient in magnetic resonance imaging (MRI). However, high b-value DWI images suffer from a reduced signal-to-noise ratio due to the exponential decay of signal intensity, so removing noise becomes important for high b-value DWI images. Here, we propose a Physics-Informed neural Network for high b-value DWI image Denoising (PIND) that leverages a physics-informed loss and prior information from low b-value DWI images with high signal-to-noise ratio. Experiments are conducted on a prostate DWI dataset of 125 subjects. Compared with the original noisy images, PIND improves the peak signal-to-noise ratio from 31.25 dB to 36.28 dB and the structural similarity index measure from 0.77 to 0.92. Our scheme can save 83% of data acquisition time, since fewer averages of high b-value DWI images need to be acquired, while maintaining 98% accuracy of the apparent diffusion coefficient value, suggesting its potential effectiveness in preserving essential diffusion characteristics. A reader study by 4 radiologists (3, 6, 13, and 18 years of experience) indicates PIND's promising performance in overall quality, signal-to-noise ratio, artifact suppression, and lesion conspicuity, showing potential for improving clinical DWI applications.

Title: MedBLIP: A multimodal method of medical question-answering based on fine-tuning large language model
Authors: Lejun Gong, Jiaming Yang, Shengyuan Han, Yimu Ji
Computerized Medical Imaging and Graphics, vol. 124, Article 102581; published 2025-05-31; DOI: 10.1016/j.compmedimag.2025.102581
Abstract: Medical visual question answering is crucial for effectively interpreting medical images containing clinically relevant information. This study proposes a method called MedBLIP (Medical Treatment Bootstrapping Language-Image Pretraining) to tackle visual language generation tasks related to chest X-rays in the medical field. The method combines an image encoder with a large-scale language model and effectively generates medical question-answering text through a strategy of freezing the image encoder, based on the BLIP-2 model. First, chest X-ray images are preprocessed, and an image sample generation algorithm is used to augment the text data of doctor-patient question-answering, thereby increasing data diversity. Then, a multi-layer convolutional image feature extractor is introduced to better capture the feature representation of medical images. During fine-tuning of the large language generation model, a new unfreezing strategy is proposed: different proportions of the fully connected layer's weights are unfrozen to adapt to data in the medical field. The image feature extractor extracts key features from images, providing the model with rich visual information, while the text feature extractor captures the essential requirements of the user's question. Through their synergistic interaction, the model can more effectively integrate medical images and user inquiries, thereby generating more accurate and relevant output. Experimental results show that unfreezing 31.25% of the fully connected layer's weights significantly improves model performance, with ROUGE-L reaching 66.12%, providing a more accurate and efficient answer-generation solution for the medical field. The method has potential applications in medical language generation tasks. Although the proposed model cannot yet fully replace human radiologists, it plays an indispensable role in improving diagnostic efficiency, assisting decision-making, and supporting medical research. With continuing technological advances, the model's performance will be further enhanced, and its application value in the medical field will become even more significant. The implementation is available at https://github.com/JiminFohill/MedicalChat.git.

Title: Evaluation of uncertainty estimation methods in medical image segmentation: Exploring the usage of uncertainty in clinical deployment
Authors: Shiman Li, Mingzhi Yuan, Xiaokun Dai, Chenxi Zhang
Computerized Medical Imaging and Graphics, vol. 124, Article 102574; published 2025-05-30; DOI: 10.1016/j.compmedimag.2025.102574
Abstract: Uncertainty estimation methods are essential for the application of artificial intelligence (AI) models in medical image segmentation, particularly in addressing reliability and feasibility challenges in clinical deployment. Despite their significance, the adoption of uncertainty estimation methods in clinical practice remains limited due to the lack of a comprehensive evaluation framework tailored to their clinical usage. To address this gap, a simulation of uncertainty-assisted clinical workflows is conducted, highlighting the roles of uncertainty in model selection, sample screening, and risk visualization. Furthermore, uncertainty evaluation is extended to pixel, sample, and model levels to enable a more thorough assessment. At the pixel level, the Uncertainty Confusion Metric (UCM) is proposed, utilizing density curves to improve robustness against variability in uncertainty distributions and to assess the ability of pixel uncertainty to identify potential errors. At the sample level, the Expected Segmentation Calibration Error (ESCE) is introduced to provide more accurate calibration aligned with Dice, enabling more effective identification of low-quality samples. At the model level, the Harmonic Dice (HDice) metric is developed to integrate uncertainty and accuracy, mitigating the influence of dataset biases and offering a more robust evaluation of model performance on unseen data. Using this systematic evaluation framework, five mainstream uncertainty estimation methods are compared on organ and tumor datasets, providing new insights into their clinical applicability. Extensive experimental analyses validated the practicality and effectiveness of the proposed metrics. This study offers clear guidance for selecting appropriate uncertainty estimation methods in clinical settings, facilitating their integration into clinical workflows and ultimately improving diagnostic efficiency and patient outcomes.