{"title":"Comparison of publicly available artificial intelligence models for pancreatic segmentation on T1-weighted Dixon images.","authors":"Yuki Sonoda, Shota Fujisawa, Mariko Kurokawa, Wataru Gonoi, Shouhei Hanaoka, Takeharu Yoshikawa, Osamu Abe","doi":"10.1007/s11604-025-01814-5","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to compare three publicly available deep learning models (TotalSegmentator, TotalVibeSegmentator, and PanSegNet) for automated pancreatic segmentation on magnetic resonance images and to evaluate their performance against human annotations in terms of segmentation accuracy, volumetric measurement, and intrapancreatic fat fraction (IPFF) assessment.</p><p><strong>Materials and methods: </strong>Twenty upper abdominal T1-weighted magnetic resonance series acquired using the two-point Dixon method were randomly selected. Three radiologists manually segmented the pancreas, and a ground-truth mask was constructed through a majority vote per voxel. Pancreatic segmentation was also performed using the three artificial intelligence models. Performance was evaluated using the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance, average symmetric surface distance, positive predictive value, sensitivity, Bland-Altman plots, and concordance correlation coefficient (CCC) for pancreatic volume and IPFF.</p><p><strong>Results: </strong>PanSegNet achieved the highest DSC (mean ± standard deviation, 0.883 ± 0.095) and showed no statistically significant difference from the human interobserver DSC (0.896 ± 0.068; p = 0.24). In contrast, TotalVibeSegmentator (0.731 ± 0.105) and TotalSegmentator (0.707 ± 0.142) had significantly lower DSC values compared with the human interobserver average (p < 0.001). For pancreatic volume and IPFF, PanSegNet demonstrated the best agreement with the ground truth (CCC values of 0.958 and 0.993, respectively), followed by TotalSegmentator (0.834 and 0.980) and TotalVibeSegmentator (0.720 and 0.672).</p><p><strong>Conclusion: </strong>PanSegNet demonstrated the highest segmentation accuracy and the best agreement with human measurements for both pancreatic volume and IPFF on T1-weighted Dixon images. This model appears to be the most suitable for large-scale studies requiring automated pancreatic segmentation and intrapancreatic fat evaluation.</p>","PeriodicalId":14691,"journal":{"name":"Japanese Journal of Radiology","volume":" ","pages":"1663-1669"},"PeriodicalIF":2.1000,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12479682/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Japanese Journal of Radiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s11604-025-01814-5","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/6/18 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: This study aimed to compare three publicly available deep learning models (TotalSegmentator, TotalVibeSegmentator, and PanSegNet) for automated pancreatic segmentation on magnetic resonance images and to evaluate their performance against human annotations in terms of segmentation accuracy, volumetric measurement, and intrapancreatic fat fraction (IPFF) assessment.
Materials and methods: Twenty upper abdominal T1-weighted magnetic resonance series acquired using the two-point Dixon method were randomly selected. Three radiologists manually segmented the pancreas, and a ground-truth mask was constructed through a majority vote per voxel. Pancreatic segmentation was also performed using the three artificial intelligence models. Performance was evaluated using the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance, average symmetric surface distance, positive predictive value, sensitivity, Bland-Altman plots, and concordance correlation coefficient (CCC) for pancreatic volume and IPFF.
Results: PanSegNet achieved the highest DSC (mean ± standard deviation, 0.883 ± 0.095) and showed no statistically significant difference from the human interobserver DSC (0.896 ± 0.068; p = 0.24). In contrast, TotalVibeSegmentator (0.731 ± 0.105) and TotalSegmentator (0.707 ± 0.142) had significantly lower DSC values compared with the human interobserver average (p < 0.001). For pancreatic volume and IPFF, PanSegNet demonstrated the best agreement with the ground truth (CCC values of 0.958 and 0.993, respectively), followed by TotalSegmentator (0.834 and 0.980) and TotalVibeSegmentator (0.720 and 0.672).
Conclusion: PanSegNet demonstrated the highest segmentation accuracy and the best agreement with human measurements for both pancreatic volume and IPFF on T1-weighted Dixon images. This model appears to be the most suitable for large-scale studies requiring automated pancreatic segmentation and intrapancreatic fat evaluation.
期刊介绍:
Japanese Journal of Radiology is a peer-reviewed journal, officially published by the Japan Radiological Society. The main purpose of the journal is to provide a forum for the publication of papers documenting recent advances and new developments in the field of radiology in medicine and biology. The scope of Japanese Journal of Radiology encompasses but is not restricted to diagnostic radiology, interventional radiology, radiation oncology, nuclear medicine, radiation physics, and radiation biology. Additionally, the journal covers technical and industrial innovations. The journal welcomes original articles, technical notes, review articles, pictorial essays and letters to the editor. The journal also provides announcements from the boards and the committees of the society. Membership in the Japan Radiological Society is not a prerequisite for submission. Contributions are welcomed from all parts of the world.