Next-Generation Diagnostics: The Impact of Synthetic Data Generation on the Detection of Breast Cancer from Ultrasound Imaging

IF 2.3, Zone 3 (Mathematics), JCR Q1 MATHEMATICS

Mathematics, Pub Date: 2024-09-11, DOI: 10.3390/math12182808

Hari Mohan Rai, Serhii Dashkevych, Joon Yoo


Citations: 0

Abstract



Breast cancer is one of the most lethal and widespread diseases affecting women worldwide. It is therefore necessary to diagnose breast cancer accurately and efficiently using the most cost-effective and widely used methods. In this research, we demonstrate that high-quality synthetic ultrasound data outperforms conventional augmentation strategies for diagnosing breast cancer with deep learning. We trained a deep-learning model based on the EfficientNet-B7 architecture on 3186 ultrasound images acquired from multiple publicly available sources, together with 10,000 synthetic images generated by a generative adversarial network (StyleGAN3). The model was trained using five-fold cross-validation and evaluated with four metrics: accuracy, recall, precision, and F1 score. Integrating the synthetic data into the training set increased classification performance, as measured by the F1 score, from 88.72% to 92.01%, demonstrating the power of generative models to expand and improve the quality of training datasets in medical-imaging applications. In other words, training on the larger dataset that includes synthetic images improved performance by more than 3 percentage points over the genuine dataset with common augmentation. Various data augmentation procedures were also investigated to improve the training set’s diversity and representativeness. This research emphasizes the relevance of modern artificial intelligence and machine-learning technologies in medical imaging by providing an effective strategy for classifying ultrasound images, which may lead to increased diagnostic accuracy and optimal treatment options. The proposed techniques are highly promising and have strong potential for future clinical application in the diagnosis of breast cancer.
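
The evaluation protocol in the abstract (five-fold cross-validation, synthetic images added to the training set, four metrics) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names `k_fold_splits`, `training_size`, and `binary_metrics` are hypothetical, the folds are unshuffled and unstratified for brevity, the labels are assumed to be binary, and keeping synthetic images out of the validation folds is an assumption that the abstract does not state.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_splits(n_real: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split real-image indices 0..n_real-1 into k (train, validation) pairs.

    Round-robin assignment; a real pipeline would normally shuffle and stratify.
    """
    folds = [list(range(i, n_real, k)) for i in range(k)]
    splits = []
    for i, val in enumerate(folds):
        # Training indices are all real images outside the validation fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


def training_size(train_real: Sequence[int], n_synthetic: int) -> int:
    """Size of one training fold once synthetic images are appended.

    Assumption: synthetic images are used for training only, never validation.
    """
    return len(train_real) + n_synthetic


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Dataset sizes taken from the abstract: 3186 real and 10,000 synthetic images.
    for fold, (train, val) in enumerate(k_fold_splits(3186), start=1):
        print(fold, training_size(train, 10_000), len(val))
```

Computing the metrics per fold and averaging them is one common way to report cross-validated results; whether the paper averages in this way is not stated in the abstract.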


Source journal

Mathematics (subject area: Mathematics, General Mathematics)

CiteScore: 4.00

Self-citation rate: 16.70%

Articles published: 4032

Review time: 21.9 days

Journal description: Mathematics (ISSN 2227-7390) is an international, open access journal that provides an advanced forum for studies related to the mathematical sciences. It is devoted exclusively to the publication of high-quality reviews, regular research papers, and short communications in all areas of pure and applied mathematics. Mathematics also publishes timely and thorough survey articles on current trends, new theoretical techniques, novel ideas, and new mathematical tools in different branches of mathematics.