Derek J Van Booven, Cheng-Bang Chen, Oleksandr N Kryvenko, Sanoj Punnen, Victor Sandoval, Sheetal Malpani, Ahmed Noman, Farhan Ismael, Yujie Wang, Rehana Qureshi, Joshua M Hare, Himanshu Arora
NPJ Precision Oncology, 9(1):151. Published 2025-05-23. Journal Article. DOI: 10.1038/s41698-025-00934-5 (https://doi.org/10.1038/s41698-025-00934-5). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12098719/pdf/
Mitigating bias in prostate cancer diagnosis using synthetic data for improved AI driven Gleason grading.
Prostate cancer (PCa) is a leading cause of cancer-related mortality in men, with Gleason grading critical for prognosis and treatment decisions. Machine learning (ML) models offer potential for automated grading but are limited by dataset biases, staining variability, and data scarcity, reducing their generalizability. This study employs generative adversarial networks (GANs) to generate high-quality synthetic histopathological images to address these challenges. A conditional GAN (dcGAN) was developed and validated using expert pathologist review and Spatial Heterogeneous Recurrence Quantification Analysis (SHRQA), achieving 80% diagnostic quality approval. A convolutional neural network (EfficientNet) was trained on original and synthetic images and validated across TCGA, PANDA Challenge, and MAST trial datasets. Integrating synthetic images improved classification accuracy for Gleason 3 (26%, p = 0.0010), Gleason 4 (15%, p = 0.0274), and Gleason 5 (32%, p < 0.0001), with sensitivity and specificity reaching 81% and 92%, respectively. This study demonstrates that synthetic data significantly enhances ML-based Gleason grading accuracy and improves reproducibility, providing a scalable AI-driven solution for precision oncology.
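The abstract reports sensitivity and specificity without defining them. As a minimal sketch, assuming the standard binary definitions (for multi-class Gleason grading these would typically be computed one grade against the rest, which the abstract does not specify), the two metrics follow from confusion-matrix counts. The function name and the counts below are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from binary confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)  # share of actual positives that were detected
    specificity = tn / (tn + fp)  # share of actual negatives that were correctly rejected
    return sensitivity, specificity


# Hypothetical counts for illustration only; not the study's data.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sensitivity depends only on the positive cases and specificity only on the negative ones, so a model can score well on one and poorly on the other, which is why the two figures are reported together.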
About the journal:
Online-only and open access, npj Precision Oncology is an international, peer-reviewed journal dedicated to showcasing cutting-edge scientific research in all facets of precision oncology, spanning from fundamental science to translational applications and clinical medicine.