Laura Brink, Ricardo Amaya Romero, Laura Coombs, Mike Tilkin, Sina Mazaheri MD, Judy Gichoya MD, Zachary Zaiman MD, Hari Trivedi MD, Adam Medina MD, Bernardo C. Bizzo MD, PhD, Ken Chang MD, Jayashree Kalpathy-Cramer MD, Mannudeep K. Kalra MD, Bruno Astuto MD, Carolina Ramirez MD, Sharmila Majumdar MD, Amie Y. Lee MD, Christoph I. Lee MD, MS, MBA, Nathan M. Cross MD, MS, Po-Hao Chen MD, Christoph Wald MD
Multi-Institutional Evaluation and Training of Breast Density Classification AI Algorithm Using ACR Connect and AI-LAB
Journal of the American College of Radiology, Volume 22, Issue 2, Pages 211-219. Published 2025-02-01. DOI: 10.1016/j.jacr.2024.11.003
https://www.sciencedirect.com/science/article/pii/S1546144024009128
Journal Article. Impact factor: 4.0. JCR: Q1 (Radiology, Nuclear Medicine & Medical Imaging).
Citations: 0
Abstract
Objective
To demonstrate and test the capabilities of the ACR Connect and AI-LAB software platform by implementing multi-institutional artificial intelligence (AI) training and validation for breast density classification.
Methods
In this proof-of-concept study, six US-based hospitals installed Connect and AI-LAB. A breast density algorithm was trained and tested on retrospective mammograms. We recorded the time to receive institutional review board approval, to install the software locally, and to complete the testing and training. We calculated the performance of the breast density algorithm at each participating hospital and compared it with its performance on a holdout multi-institutional clinical trial test dataset and on a retrospective multi-institutional dataset. We also calculated the performance of the locally fine-tuned models on the holdout test datasets.
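As an illustration of the per-hospital comparison described above, the following is a minimal sketch only, not the study's method. The abstract does not state which performance metric was used; the sketch assumes accuracy and linearly weighted Cohen's kappa over four ordered density categories (labeled A to D here, in the style of BI-RADS). The function names, site names, and labels are hypothetical toy values, not study data.

```python
from collections import defaultdict

# Assumed ordinal density categories, least to most dense (not stated in the abstract).
CATEGORIES = ["A", "B", "C", "D"]


def accuracy(y_true, y_pred):
    """Fraction of exams where the predicted category equals the reference."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def linear_weighted_kappa(y_true, y_pred, categories=CATEGORIES):
    """Cohen's kappa with linear disagreement weights for ordinal labels."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(y_true)

    # Joint distribution of (reference, prediction) pairs.
    observed = [[0.0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement grows with the distance between categories.
        return abs(i - j) / (k - 1)

    disagree_obs = sum(weight(i, j) * observed[i][j]
                       for i in range(k) for j in range(k))
    disagree_exp = sum(weight(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
    if disagree_exp == 0:
        return float("nan")  # kappa is undefined when no disagreement is possible
    return 1.0 - disagree_obs / disagree_exp


def evaluate_by_site(records):
    """records: iterable of (site, reference_label, predicted_label) tuples."""
    by_site = defaultdict(lambda: ([], []))
    for site, ref, pred in records:
        by_site[site][0].append(ref)
        by_site[site][1].append(pred)
    return {
        site: {
            "n": len(refs),
            "accuracy": accuracy(refs, preds),
            "kappa": linear_weighted_kappa(refs, preds),
        }
        for site, (refs, preds) in by_site.items()
    }


# Toy records for two hypothetical sites (made up for illustration).
toy_records = [
    ("site_1", "A", "A"), ("site_1", "B", "B"),
    ("site_1", "C", "C"), ("site_1", "D", "D"),
    ("site_2", "A", "B"), ("site_2", "B", "B"),
    ("site_2", "C", "C"), ("site_2", "D", "C"),
]

results = evaluate_by_site(toy_records)
for site, metrics in sorted(results.items()):
    print(site, metrics)
```

Grouping predictions by site before scoring keeps each hospital's result separate, so a model that looks adequate on pooled data but weak at one institution is not hidden by averaging.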
Results
The median time to receive institutional review board approval was 66 days, and the median time to successfully install Connect and AI-LAB locally was 157 days. The median time to complete breast density algorithm testing and training was 216 days. The breast density algorithm performed worse at each hospital than on the holdout test dataset, suggesting poor generalizability of the base model. The fine-tuned models had mixed performance locally and performed poorly on the test dataset.
Discussion
In this study, we demonstrate the successful installation and implementation of Connect and AI-LAB software platforms at six facilities using a breast density algorithm. Our results suggest poor generalizability of an algorithm trained on a single dataset and algorithms fine-tuned at individual institutions, emphasizing the hypothetical importance of multi-institutional testing and training.
About the journal:
The official journal of the American College of Radiology, JACR informs its readers of timely, pertinent, and important topics affecting the practice of diagnostic radiologists, interventional radiologists, medical physicists, and radiation oncologists. In so doing, JACR improves their practices and helps optimize their role in the health care system. By providing a forum for informative, well-written articles on health policy, clinical practice, practice management, data science, and education, JACR engages readers in a dialogue that ultimately benefits patient care.