Monjoy Saha, Mustapha Abubakar, Ruth M Pfeiffer, Thomas E Rohan, Máire A Duggan, Kathryn Richert-Boe, Jonas S Almeida, Gretchen L Gierach
{"title":"苏木精和伊红染色良性乳腺活检的深度学习分析预测未来浸润性乳腺癌。","authors":"Monjoy Saha, Mustapha Abubakar, Ruth M Pfeiffer, Thomas E Rohan, Máire A Duggan, Kathryn Richert-Boe, Jonas S Almeida, Gretchen L Gierach","doi":"10.1093/jncics/pkaf037","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Benign breast disease is an important risk factor for breast cancer development. In this study, we analyzed hematoxylin and eosin-stained whole-slide images from diagnostic benign breast disease biopsies using different deep learning approaches to predict which individuals would subsequently developed breast cancer (cases) or would not (controls).</p><p><strong>Methods: </strong>We randomly divided cases and controls from a nested case-control study of 946 women with benign breast disease into training (331 cases, 331 control individuals) and test (142 cases, 142 control individuals) groups. We employed customized VGG-16 and AutoML machine learning models for image-only classification using whole-slide images, logistic regression for classification using only clinicopathological characteristics, and a multimodal network combining whole-slide images and clinicopathological characteristics for classification.</p><p><strong>Results: </strong>Both image-only (area under the receiver operating characteristic curve [AUROC] = 0.83 [SE = 0.001] and 0.78 [SE = 0.001] for customized VGG-16 and AutoML models, respectively) and multimodal (AUROC = 0.89 [SE = 0.03]) networks had high discriminatory accuracy for breast cancer. The clinicopathological-characteristics-only model had the lowest AUROC (0.54 [SE = 0.03]). 
In addition, compared with the customized VGG-16 model, which performed better than the AutoML model, the multimodal network had improved accuracy (AUROC = 0.89 [SE = 0.03] vs 0.83 [SE = 0.02]), sensitivity (AUROC = 0.93 [SE = 0.04] vs 0.83 [SE = 0.003]), and specificity (AUROC = 0.86 [SE = 0.03] vs 0.84 [SE = 0.003]).</p><p><strong>Conclusion: </strong>This study opens promising avenues for breast cancer risk assessment in women with benign breast disease. Integrating whole-slide images and clinicopathological characteristics through a multimodal approach substantially improved predictive model performance. Future research will explore deep learning techniques to understand benign breast disease progression to invasive breast cancer.</p>","PeriodicalId":14681,"journal":{"name":"JNCI Cancer Spectrum","volume":" ","pages":""},"PeriodicalIF":4.1000,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12105608/pdf/","citationCount":"0","resultStr":"{\"title\":\"Deep learning analysis of hematoxylin and eosin-stained benign breast biopsies to predict future invasive breast cancer.\",\"authors\":\"Monjoy Saha, Mustapha Abubakar, Ruth M Pfeiffer, Thomas E Rohan, Máire A Duggan, Kathryn Richert-Boe, Jonas S Almeida, Gretchen L Gierach\",\"doi\":\"10.1093/jncics/pkaf037\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Benign breast disease is an important risk factor for breast cancer development. 
In this study, we analyzed hematoxylin and eosin-stained whole-slide images from diagnostic benign breast disease biopsies using different deep learning approaches to predict which individuals would subsequently developed breast cancer (cases) or would not (controls).</p><p><strong>Methods: </strong>We randomly divided cases and controls from a nested case-control study of 946 women with benign breast disease into training (331 cases, 331 control individuals) and test (142 cases, 142 control individuals) groups. We employed customized VGG-16 and AutoML machine learning models for image-only classification using whole-slide images, logistic regression for classification using only clinicopathological characteristics, and a multimodal network combining whole-slide images and clinicopathological characteristics for classification.</p><p><strong>Results: </strong>Both image-only (area under the receiver operating characteristic curve [AUROC] = 0.83 [SE = 0.001] and 0.78 [SE = 0.001] for customized VGG-16 and AutoML models, respectively) and multimodal (AUROC = 0.89 [SE = 0.03]) networks had high discriminatory accuracy for breast cancer. The clinicopathological-characteristics-only model had the lowest AUROC (0.54 [SE = 0.03]). In addition, compared with the customized VGG-16 model, which performed better than the AutoML model, the multimodal network had improved accuracy (AUROC = 0.89 [SE = 0.03] vs 0.83 [SE = 0.02]), sensitivity (AUROC = 0.93 [SE = 0.04] vs 0.83 [SE = 0.003]), and specificity (AUROC = 0.86 [SE = 0.03] vs 0.84 [SE = 0.003]).</p><p><strong>Conclusion: </strong>This study opens promising avenues for breast cancer risk assessment in women with benign breast disease. Integrating whole-slide images and clinicopathological characteristics through a multimodal approach substantially improved predictive model performance. 
Future research will explore deep learning techniques to understand benign breast disease progression to invasive breast cancer.</p>\",\"PeriodicalId\":14681,\"journal\":{\"name\":\"JNCI Cancer Spectrum\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2025-04-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12105608/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JNCI Cancer Spectrum\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/jncics/pkaf037\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ONCOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JNCI Cancer Spectrum","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jncics/pkaf037","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ONCOLOGY","Score":null,"Total":0}
Deep learning analysis of hematoxylin and eosin-stained benign breast biopsies to predict future invasive breast cancer.
Background: Benign breast disease is an important risk factor for breast cancer development. In this study, we analyzed hematoxylin and eosin-stained whole-slide images from diagnostic benign breast disease biopsies using different deep learning approaches to predict which individuals would subsequently develop breast cancer (cases) and which would not (controls).
Methods: We randomly divided cases and controls from a nested case-control study of 946 women with benign breast disease into training (331 cases, 331 control individuals) and test (142 cases, 142 control individuals) groups. We employed customized VGG-16 and AutoML machine learning models for image-only classification using whole-slide images, logistic regression for classification using only clinicopathological characteristics, and a multimodal network combining whole-slide images and clinicopathological characteristics for classification.
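The three classification setups described above can be illustrated with a toy sketch. This is not the authors' code: the feature dimensions, signal strengths, and use of logistic regression throughout are hypothetical stand-ins, chosen only to show how image-only, clinicopathological-only, and combined (concatenated) feature sets would each be evaluated on a held-out test split.

```python
# Toy illustration of the three classifier setups (image-only, clinical-only,
# combined). Synthetic features; NOT the study's actual data or architecture.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 946                      # total study size, from the abstract
y = rng.integers(0, 2, n)    # 1 = case (later breast cancer), 0 = control

# Hypothetical features: 64 image-derived descriptors, 5 clinical variables.
# Image features are given a stronger class signal than clinical ones,
# mirroring the relative performance reported in the Results.
img = rng.normal(size=(n, 64)) + 0.4 * y[:, None]
clin = rng.normal(size=(n, 5)) + 0.05 * y[:, None]

def fit_auc(X, y):
    """Train on a stratified split and return held-out AUROC."""
    Xtr, Xte, ytr, yte = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    return roc_auc_score(yte, clf.predict_proba(Xte)[:, 1])

auc_img = fit_auc(img, y)                        # image-only analogue
auc_clin = fit_auc(clin, y)                      # clinicopathological-only
auc_multi = fit_auc(np.hstack([img, clin]), y)   # simple feature fusion
```

In the study itself the image branch is a customized VGG-16 or AutoML model rather than a linear classifier, and the multimodal network fuses learned image representations with the clinical variables; the concatenation here is only the simplest version of that idea.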
Results: Both image-only (area under the receiver operating characteristic curve [AUROC] = 0.83 [SE = 0.001] and 0.78 [SE = 0.001] for the customized VGG-16 and AutoML models, respectively) and multimodal (AUROC = 0.89 [SE = 0.03]) networks had high discriminatory accuracy for breast cancer. The clinicopathological-characteristics-only model had the lowest AUROC (0.54 [SE = 0.03]). In addition, compared with the customized VGG-16 model, which performed better than the AutoML model, the multimodal network had improved accuracy (AUROC = 0.89 [SE = 0.03] vs 0.83 [SE = 0.02]), sensitivity (0.93 [SE = 0.04] vs 0.83 [SE = 0.003]), and specificity (0.86 [SE = 0.03] vs 0.84 [SE = 0.003]).
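The AUROC values above are reported with standard errors. A minimal sketch of how such estimates can be computed is below, using the Mann-Whitney rank form of the AUROC and the Hanley-McNeil formula for its standard error; the abstract does not state which estimator the authors actually used, so this is illustrative only. The test-group sizes (142 cases, 142 controls) come from the Methods.

```python
# Illustrative AUROC + standard-error computation (Mann-Whitney form of
# AUROC; Hanley-McNeil SE). The paper's exact estimator is not specified.
import numpy as np

def auroc(pos_scores, neg_scores):
    """AUROC = probability a random case scores above a random control,
    counting ties as half."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return wins / (pos.size * neg.size)

def hanley_mcneil_se(a, n_pos, n_neg):
    """Hanley & McNeil (1982) standard error of an AUROC estimate."""
    q1 = a / (2.0 - a)
    q2 = 2.0 * a**2 / (1.0 + a)
    var = (a * (1.0 - a)
           + (n_pos - 1) * (q1 - a**2)
           + (n_neg - 1) * (q2 - a**2)) / (n_pos * n_neg)
    return np.sqrt(var)

a = auroc([0.9, 0.4], [0.6, 0.2])    # 3 of 4 case-control pairs ranked correctly
se = hanley_mcneil_se(0.83, 142, 142)  # test-set sizes from the abstract
```

With AUROC = 0.83 and 142 cases vs 142 controls, this formula yields an SE of roughly 0.02, on the same order as the SEs reported for the multimodal comparisons.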
Conclusion: This study opens promising avenues for breast cancer risk assessment in women with benign breast disease. Integrating whole-slide images and clinicopathological characteristics through a multimodal approach substantially improved predictive model performance. Future research will explore deep learning techniques to understand benign breast disease progression to invasive breast cancer.