Improving realism in abdominal ultrasound simulation combining a segmentation-guided loss and polar coordinates training

Santiago Vitale, José Ignacio Orlando, Emmanuel Iarussi, Alejandro Díaz, Ignacio Larrabide
{"title":"Improving realism in abdominal ultrasound simulation combining a segmentation-guided loss and polar coordinates training.","authors":"Santiago Vitale, José Ignacio Orlando, Emmanuel Iarussi, Alejandro Díaz, Ignacio Larrabide","doi":"10.1002/mp.17801","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Ultrasound (US) simulation helps train physicians and medical students in image acquisition and interpretation, enabling safe practice of transducer manipulation and organ identification. Current simulators generate realistic images from reference scans. Although physics-based simulators provide real-time images, they lack sufficient realism, while recent deep learning-based models based on unpaired image-to-image translation improve realism but introduce anatomical inconsistencies.</p><p><strong>Purpose: </strong>We propose a novel framework to reduce hallucinations from generative adversarial networks (GANs) used on physics-based simulations, enhancing anatomical accuracy and realism in abdominal US simulation. Our method aims to produce anatomically consistent images free from artifacts within and outside the field of view (FoV).</p><p><strong>Methods: </strong>We introduce a segmentation-guided loss to enforce anatomical consistency by using a pre-trained Unet model that segments abdominal organs from physics-based simulated scans. Penalizing segmentation discrepancies before and after the translation cycle helps prevent unrealistic artifacts. Additionally, we propose training GANs on images in polar coordinates to limit the field of view to non-blank regions. We evaluated our approach on unpaired datasets comprising 617 real abdominal US images from a SonoSite-M turbo v1.3 scanner and 971 artificial scans from a ray-casting simulator. Data was partitioned at the patient level into training (70%), validation (10%), and testing (20%). Performance was quantitatively assessed with Frechet and Kernel Inception Distances (FID and KID), and organ-specific <math> <semantics><msup><mi>χ</mi> <mn>2</mn></msup> <annotation>$\\chi ^2$</annotation></semantics> </math> histogram distances, reporting 95% confidence intervals. We compared our model against generative methods such as CUT, UVCGANv2, and UNSB, performing statistical analyses using Wilcoxon tests (FID and KID with Bonferroni-corrected <math> <semantics><mrow><mi>α</mi> <mo>=</mo> <mn>0.01</mn></mrow> <annotation>$\\alpha = 0.01$</annotation></semantics> </math> , <math> <semantics><msup><mi>χ</mi> <mn>2</mn></msup> <annotation>$\\chi ^2$</annotation></semantics> </math> with <math> <semantics><mrow><mi>α</mi> <mo>=</mo> <mn>0.008</mn></mrow> <annotation>$\\alpha =0.008$</annotation></semantics> </math> ). A perceptual realism study involving expert radiologists was also conducted.</p><p><strong>Results: </strong>Our method significantly reduced FID and KID by 66% and 89%, respectively, compared to CycleGAN, and by 34% and 59% compared to the leading alternative UVCGANv2 ( <math> <semantics><mrow><mi>p</mi> <mo>≪</mo> <mn>0.01</mn></mrow> <annotation>$p \\ll 0.01$</annotation></semantics> </math> ). No significant differences ( <math> <semantics><mrow><mi>p</mi> <mo>></mo> <mn>0.008</mn></mrow> <annotation>$p>0.008$</annotation></semantics> </math> ) in echogenicity distributions were found between real and simulated images within liver and gallbladder regions. 
The user study indicated our simulated scans fooled radiologists in 36.2% of cases, outperforming other methods.</p><p><strong>Conclusions: </strong>Our segmentation-guided, polar-coordinates-trained CycleGAN framework significantly reduces hallucinations, ensuring anatomical consistency, and realism in simulated abdominal US images, surpassing existing methods.</p>","PeriodicalId":94136,"journal":{"name":"Medical physics","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/mp.17801","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Background: Ultrasound (US) simulation helps train physicians and medical students in image acquisition and interpretation, enabling safe practice of transducer manipulation and organ identification. Current simulators generate realistic images from reference scans. Although physics-based simulators provide real-time images, they lack sufficient realism; recent deep learning models based on unpaired image-to-image translation improve realism but introduce anatomical inconsistencies.
Purpose: We propose a novel framework to reduce hallucinations from generative adversarial networks (GANs) used on physics-based simulations, enhancing anatomical accuracy and realism in abdominal US simulation. Our method aims to produce anatomically consistent images free from artifacts within and outside the field of view (FoV).
Methods: We introduce a segmentation-guided loss that enforces anatomical consistency by using a pre-trained U-Net model that segments abdominal organs from physics-based simulated scans. Penalizing segmentation discrepancies before and after the translation cycle helps prevent unrealistic artifacts. Additionally, we propose training GANs on images in polar coordinates to limit the field of view to non-blank regions. We evaluated our approach on unpaired datasets comprising 617 real abdominal US images from a SonoSite M-Turbo v1.3 scanner and 971 artificial scans from a ray-casting simulator. Data were partitioned at the patient level into training (70%), validation (10%), and testing (20%) sets. Performance was quantitatively assessed with Fréchet and Kernel Inception Distances (FID and KID) and organ-specific $\chi^2$ histogram distances, reporting 95% confidence intervals. We compared our model against generative methods such as CUT, UVCGANv2, and UNSB, performing statistical analyses using Wilcoxon tests (FID and KID with Bonferroni-corrected $\alpha = 0.01$, $\chi^2$ with $\alpha = 0.008$). A perceptual realism study involving expert radiologists was also conducted.
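The abstract does not give implementation details for either idea, so the following PyTorch sketch is purely illustrative: every name (`frozen_unet`, `G_sim2real`, `G_real2sim`, the L1 penalty, the sector geometry, and the output resolution) is an assumption, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def segmentation_guided_loss(sim_batch, G_sim2real, G_real2sim, frozen_unet):
    """Penalize organ-segmentation drift across a full translation cycle.

    sim_batch    -- physics-simulated US images, shape (N, 1, H, W)
    G_sim2real   -- generator mapping simulated -> realistic
    G_real2sim   -- generator mapping realistic -> simulated
    frozen_unet  -- pre-trained abdominal-organ segmenter (weights frozen)
    """
    with torch.no_grad():
        # Reference organ map on the untouched simulated scan.
        seg_before = frozen_unet(sim_batch).softmax(dim=1)
    # Full cycle: simulated -> "realistic" -> simulated again.
    cycled = G_real2sim(G_sim2real(sim_batch))
    seg_after = frozen_unet(cycled).softmax(dim=1)
    # Hallucinated or deleted structures change the predicted organ map,
    # so any discrepancy is penalized (L1 is one plausible choice).
    return F.l1_loss(seg_after, seg_before)

def cartesian_to_polar(img, apex_xy, r_max, theta_range, out_hw=(256, 256)):
    """Resample a sector-shaped US frame onto a (radius, angle) grid so the
    network only sees the non-blank fan region, not the black corners.

    apex_xy     -- (x, y) pixel position of the transducer apex
    theta_range -- (theta_min, theta_max) beam angles in radians
    """
    n, _, h, w = img.shape
    rows, cols = out_hw
    r = torch.linspace(0.0, r_max, rows)
    theta = torch.linspace(theta_range[0], theta_range[1], cols)
    rr, tt = torch.meshgrid(r, theta, indexing="ij")
    x = apex_xy[0] + rr * torch.sin(tt)  # beam points "down" at theta = 0
    y = apex_xy[1] + rr * torch.cos(tt)
    # grid_sample expects sampling coordinates normalized to [-1, 1].
    grid = torch.stack((2 * x / (w - 1) - 1, 2 * y / (h - 1) - 1), dim=-1)
    return F.grid_sample(img, grid.unsqueeze(0).expand(n, -1, -1, -1),
                         align_corners=True)
```

In such a setup the polar images would be fed to the generators in place of Cartesian frames, and the segmentation term would be added to the usual adversarial and cycle-consistency losses with some weight; the inverse remap recovers the familiar fan-shaped view for display.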
Results: Our method significantly reduced FID and KID by 66% and 89%, respectively, compared to CycleGAN, and by 34% and 59% compared to the leading alternative UVCGANv2 ($p \ll 0.01$). No significant differences ($p > 0.008$) in echogenicity distributions were found between real and simulated images within liver and gallbladder regions. The user study indicated that our simulated scans fooled radiologists in 36.2% of cases, outperforming the other methods.
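The exact form of the organ-specific $\chi^2$ distance is not spelled out in the abstract; a minimal sketch of the standard symmetric variant, assuming per-organ binary masks and 8-bit intensities (both illustrative assumptions), could look like this:

```python
import numpy as np

def chi2_histogram_distance(real_pixels, sim_pixels, bins=64):
    """Symmetric chi-squared distance between two echogenicity histograms.

    real_pixels, sim_pixels -- 1-D arrays of intensities gathered from
    inside one organ mask (e.g. liver) across the real and simulated sets.
    The bin count and 8-bit range are illustrative assumptions.
    """
    h_real, _ = np.histogram(real_pixels, bins=bins, range=(0, 255),
                             density=True)
    h_sim, _ = np.histogram(sim_pixels, bins=bins, range=(0, 255),
                            density=True)
    denom = h_real + h_sim
    valid = denom > 0  # skip empty bins to avoid division by zero
    return 0.5 * np.sum((h_real[valid] - h_sim[valid]) ** 2 / denom[valid])
```

Per-organ distances produced by competing methods could then be compared pairwise with `scipy.stats.wilcoxon` at the Bonferroni-corrected $\alpha = 0.008$ quoted above.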
Conclusions: Our segmentation-guided, polar-coordinates-trained CycleGAN framework significantly reduces hallucinations, ensuring anatomical consistency and realism in simulated abdominal US images and surpassing existing methods.