A novel cross-modal data augmentation method based on contrastive unpaired translation network for kidney segmentation in ultrasound imaging.

Medical Physics. Pub Date: 2025-02-04. DOI: 10.1002/mp.17663
Shuaizi Guo, Xiangyu Sheng, Haijie Chen, Jie Zhang, Qinmu Peng, Menglin Wu, Katherine Fischer, Gregory E Tasian, Yong Fan, Shi Yin
{"title":"A novel cross-modal data augmentation method based on contrastive unpaired translation network for kidney segmentation in ultrasound imaging.","authors":"Shuaizi Guo, Xiangyu Sheng, Haijie Chen, Jie Zhang, Qinmu Peng, Menglin Wu, Katherine Fischer, Gregory E Tasian, Yong Fan, Shi Yin","doi":"10.1002/mp.17663","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Kidney ultrasound (US) image segmentation is one of the key steps in computer-aided diagnosis and treatment planning of kidney diseases. Recently, deep learning (DL) technology has demonstrated promising prospects in automatic kidney US segmentation. However, due to the poor quality, particularly the weak boundaries in kidney US imaging, obtaining accurate annotations for DL-based segmentation methods remain a challenging and time-consuming task. This issue can hinder the application of data-hungry deep learning methods.</p><p><strong>Purpose: </strong>In this paper, we explore a novel cross-modal data augmentation method aimed at enhancing the performance of DL-based segmentation networks on the limited labeled kidney US dataset.</p><p><strong>Methods: </strong>In particular, we adopt a novel method based on contrastive unpaired translation network (CUT) to obtain simulated labeled kidney US images at a low cost from labeled abdomen computed tomography (CT) data and unlabeled kidney US images. To effectively improve the segmentation network performance, we propose an instance-weighting training strategy that simultaneously captures useful information from both the simulated and real labeled kidney US images. We trained our generative networks on a dataset comprising 4418 labeled CT slices and 4594 unlabeled US images. For segmentation network, we used a dataset consisting of 4594 simulated and 100 real kidney US images for training, 20 images for validation, and 169 real images for testing. We compared the performance of our method to several state-of-the-art approaches using the Wilcoxon signed-rank test, and applied the Bonferroni method for multiple comparison correction.</p><p><strong>Results: </strong>The experimental results show that we can synthesize accurate labeled kidney US images with a Fréchet inception distance of 52.52. Moreover, the proposed method achieves a segmentation accuracy of 0.9360 ± 0.0398 for U-Net on normal kidney US images, and 0.7719 ± 0.2449 on the abnormal dataset, as measured by the dice similarity coefficient. When compared to other training strategies, the proposed method demonstrated statistically significant superiority, with all p-values being less than 0.01.</p><p><strong>Conclusions: </strong>The proposed method can effectively improve the accuracy and generalization ability of kidney US image segmentation models with limited annotated training data.</p>","PeriodicalId":94136,"journal":{"name":"Medical physics","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical physics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/mp.17663","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Background: Kidney ultrasound (US) image segmentation is one of the key steps in computer-aided diagnosis and treatment planning for kidney diseases. Recently, deep learning (DL) technology has demonstrated promising prospects in automatic kidney US segmentation. However, due to the poor image quality of kidney US, particularly its weak organ boundaries, obtaining accurate annotations for DL-based segmentation methods remains a challenging and time-consuming task. This issue can hinder the application of data-hungry deep learning methods.

Purpose: In this paper, we explore a novel cross-modal data augmentation method aimed at enhancing the performance of DL-based segmentation networks trained on a limited labeled kidney US dataset.

Methods: In particular, we adopt a novel method based on the contrastive unpaired translation (CUT) network to obtain simulated labeled kidney US images at low cost from labeled abdominal computed tomography (CT) data and unlabeled kidney US images. To effectively improve segmentation network performance, we propose an instance-weighting training strategy that simultaneously captures useful information from both the simulated and the real labeled kidney US images. We trained our generative networks on a dataset comprising 4418 labeled CT slices and 4594 unlabeled US images. For the segmentation network, we used a dataset consisting of 4594 simulated and 100 real kidney US images for training, 20 images for validation, and 169 real images for testing. We compared the performance of our method to several state-of-the-art approaches using the Wilcoxon signed-rank test, and applied the Bonferroni method to correct for multiple comparisons.
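The abstract does not give the exact form of the instance-weighting loss. The sketch below illustrates one plausible scheme, assuming a fixed per-image weight that down-weights simulated images relative to real labeled ones; the function name and the `sim_weight` value are hypothetical, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def instance_weighted_seg_loss(logits, targets, is_real, sim_weight=0.5):
    """Illustrative instance-weighted segmentation loss (sketch only).

    logits:     (N, 1, H, W) raw network outputs
    targets:    (N, 1, H, W) binary kidney masks
    is_real:    (N,) bool tensor, True for real US images
    sim_weight: hypothetical weight applied to simulated images
    """
    # Pixel-wise cross-entropy, averaged to one loss value per image.
    per_pixel = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    per_image = per_pixel.mean(dim=(1, 2, 3))  # shape (N,)

    # Real images keep full weight; simulated images are down-weighted.
    weights = torch.where(is_real,
                          torch.ones_like(per_image),
                          torch.full_like(per_image, sim_weight))
    return (weights * per_image).sum() / weights.sum()
```

In practice the per-instance weights could also be learned or scheduled during training; the fixed constant here is only for illustration.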

Results: The experimental results show that we can synthesize accurately labeled kidney US images with a Fréchet inception distance of 52.52. Moreover, the proposed method achieves a segmentation accuracy of 0.9360 ± 0.0398 for U-Net on normal kidney US images and 0.7719 ± 0.2449 on the abnormal dataset, as measured by the Dice similarity coefficient. Compared to other training strategies, the proposed method demonstrated statistically significant superiority, with all p-values less than 0.01.
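For reference, the Dice similarity coefficient and the reported significance testing can be reproduced roughly as follows. This is a minimal sketch, assuming per-image Dice scores are available as paired NumPy arrays; the variable names and the baseline dictionary are placeholders, not from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon

def dice_coefficient(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

def compare_to_baselines(dice_ours, dice_baselines, alpha=0.05):
    """Wilcoxon signed-rank tests with Bonferroni correction.

    dice_ours:      array of per-image Dice scores for the proposed method
    dice_baselines: dict mapping baseline name -> paired array of Dice scores
    """
    corrected_alpha = alpha / len(dice_baselines)  # Bonferroni correction
    for name, scores in dice_baselines.items():
        _, p = wilcoxon(dice_ours, scores)
        print(f"{name}: p = {p:.4g}, significant = {p < corrected_alpha}")
```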

Conclusions: The proposed method can effectively improve the accuracy and generalization ability of kidney US image segmentation models with limited annotated training data.
