Lie Cai, Michael Golatta, Chris Sidey-Gibbons, Richard G Barr, André Pfob
{"title":"更新成像软件对乳腺癌诊断机器学习模型性能的影响:一项多中心回顾性研究。","authors":"Lie Cai, Michael Golatta, Chris Sidey-Gibbons, Richard G Barr, André Pfob","doi":"10.1007/s00404-024-07901-8","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Artificial Intelligence models based on medical (imaging) data are increasingly developed. However, the imaging software on which the original data is generated is frequently updated. The impact of updated imaging software on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models' generalizability by validating them on external data generated by both the original updated software versions.</p><p><strong>Methods: </strong>We developed and validated different machine learning models (GLM, MARS, XGBoost, SVM) using multicenter, international SWE data (NCT02638935) using tenfold cross-validation. Findings were compared to the histopathologic evaluation of the biopsy specimen or 2-year follow-up. The outcome measure was the area under the curve (AUROC).</p><p><strong>Results: </strong>We included 1288 cases in the development set using the original imaging software and 385 cases in the validation set using both, original and updated software. In the external validation set, the GLM and XGBoost models showed better performance with the updated software data compared to the original software data (AUROC 0.941 vs. 0.902, p < 0.001 and 0.934 vs. 0.872, p < 0.001). The MARS model showed worse performance with the updated software data (0.847 vs. 0.894, p = 0.045). SVM was not calibrated.</p><p><strong>Conclusion: </strong>In this multicenter study using SWE data, some machine learning models demonstrated great potential to bridge the gap between original software and updated software, whereas others exhibited weak generalizability.</p>","PeriodicalId":8330,"journal":{"name":"Archives of Gynecology and Obstetrics","volume":" ","pages":""},"PeriodicalIF":2.1000,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The impact of updated imaging software on the performance of machine learning models for breast cancer diagnosis: a multi-center, retrospective study.\",\"authors\":\"Lie Cai, Michael Golatta, Chris Sidey-Gibbons, Richard G Barr, André Pfob\",\"doi\":\"10.1007/s00404-024-07901-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>Artificial Intelligence models based on medical (imaging) data are increasingly developed. However, the imaging software on which the original data is generated is frequently updated. The impact of updated imaging software on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models' generalizability by validating them on external data generated by both the original updated software versions.</p><p><strong>Methods: </strong>We developed and validated different machine learning models (GLM, MARS, XGBoost, SVM) using multicenter, international SWE data (NCT02638935) using tenfold cross-validation. Findings were compared to the histopathologic evaluation of the biopsy specimen or 2-year follow-up. 
The outcome measure was the area under the curve (AUROC).</p><p><strong>Results: </strong>We included 1288 cases in the development set using the original imaging software and 385 cases in the validation set using both, original and updated software. In the external validation set, the GLM and XGBoost models showed better performance with the updated software data compared to the original software data (AUROC 0.941 vs. 0.902, p < 0.001 and 0.934 vs. 0.872, p < 0.001). The MARS model showed worse performance with the updated software data (0.847 vs. 0.894, p = 0.045). SVM was not calibrated.</p><p><strong>Conclusion: </strong>In this multicenter study using SWE data, some machine learning models demonstrated great potential to bridge the gap between original software and updated software, whereas others exhibited weak generalizability.</p>\",\"PeriodicalId\":8330,\"journal\":{\"name\":\"Archives of Gynecology and Obstetrics\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2025-01-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Archives of Gynecology and Obstetrics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1007/s00404-024-07901-8\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"OBSTETRICS & GYNECOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Archives of Gynecology and Obstetrics","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s00404-024-07901-8","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OBSTETRICS & GYNECOLOGY","Score":null,"Total":0}
The impact of updated imaging software on the performance of machine learning models for breast cancer diagnosis: a multi-center, retrospective study.
Purpose: Artificial intelligence models based on medical (imaging) data are increasingly being developed. However, the imaging software with which the original data are generated is frequently updated. The impact of updated imaging software on the performance of AI models is unclear. We aimed to develop machine learning models using shear wave elastography (SWE) data to identify malignant breast lesions and to test the models' generalizability by validating them on external data generated by both the original and updated software versions.
Methods: We developed and validated different machine learning models (GLM, MARS, XGBoost, SVM) on multicenter, international SWE data (NCT02638935) using tenfold cross-validation. Findings were compared to the histopathologic evaluation of the biopsy specimen or 2-year follow-up. The outcome measure was the area under the receiver operating characteristic curve (AUROC).
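The abstract does not include the authors' code, so the following is a minimal, hypothetical Python sketch of the kind of tenfold cross-validated AUROC estimation described above, using scikit-learn and xgboost. The synthetic X and y stand in for SWE-derived features and biopsy-confirmed labels, logistic regression stands in for the GLM, and MARS is omitted because it requires a separate package (e.g., py-earth); none of this reproduces the study pipeline.

# Hypothetical sketch: tenfold cross-validated AUROC for model families
# like those named in the abstract. X and y are synthetic placeholders
# for SWE-derived features and binary labels (1 = malignant).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1288, n_features=10, random_state=0)

models = {
    "GLM (logistic regression)": LogisticRegression(max_iter=1000),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
    "SVM": make_pipeline(StandardScaler(), SVC(probability=True)),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    # AUROC per fold; mean +/- SD summarizes cross-validated performance
    aucs = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    print(f"{name}: AUROC {aucs.mean():.3f} +/- {aucs.std():.3f}")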
Results: We included 1288 cases in the development set, which used the original imaging software, and 385 cases in the validation set, which used both the original and updated software. In the external validation set, the GLM and XGBoost models performed better with the updated software data than with the original software data (AUROC 0.941 vs. 0.902, p < 0.001, and 0.934 vs. 0.872, p < 0.001). The MARS model performed worse with the updated software data (0.847 vs. 0.894, p = 0.045). The SVM model was not calibrated.
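The abstract reports p-values for the AUROC differences without naming the statistical test (the DeLong test is a common choice for correlated ROC curves). As an assumption-laden illustration only, the sketch below uses a paired bootstrap over the same validation cases to compare two AUROCs; the labels and predicted probabilities are fabricated placeholders, not study data.

# Hypothetical sketch: paired bootstrap comparison of two AUROCs computed
# on the same validation cases (original- vs. updated-software scores).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 385  # size of the external validation set reported in the abstract
y = rng.integers(0, 2, n)  # fabricated binary labels (1 = malignant)
# Fabricated scores: "updated" is made slightly more discriminative
p_original = np.clip(y * 0.6 + rng.normal(0.3, 0.2, n), 0, 1)
p_updated = np.clip(y * 0.7 + rng.normal(0.25, 0.2, n), 0, 1)

observed = roc_auc_score(y, p_updated) - roc_auc_score(y, p_original)
diffs = []
for _ in range(2000):
    idx = rng.integers(0, n, n)        # resample cases with replacement
    if len(np.unique(y[idx])) < 2:     # skip resamples with one class only
        continue
    diffs.append(roc_auc_score(y[idx], p_updated[idx])
                 - roc_auc_score(y[idx], p_original[idx]))
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"AUROC difference {observed:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")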
Conclusion: In this multicenter study using SWE data, some machine learning models demonstrated great potential to bridge the gap between original and updated software versions, whereas others exhibited weak generalizability.
Journal description:
Founded in 1870 as "Archiv für Gynaekologie", Archives of Gynecology and Obstetrics has a long and outstanding tradition. Since 1922 the journal has been the organ of the Deutsche Gesellschaft für Gynäkologie und Geburtshilfe. Archives of Gynecology and Obstetrics is circulated in over 40 countries worldwide and is indexed in PubMed/Medline and Science Citation Index Expanded/Journal Citation Reports.
The journal publishes invited and submitted reviews; peer-reviewed original articles on clinical topics and basic research; as well as news and views, guidelines, and position statements from all sub-specialties in gynecology and obstetrics.