Jiahan Zhang, Yang Lei, Junyi Xia, Ming Chao, Tian Liu
Medical Physics, volume 52, issue 3, pages 1408-1415. Published 2024-12-06. DOI: 10.1002/mp.17566
https://onlinelibrary.wiley.com/doi/10.1002/mp.17566
Federated learning for enhanced dose–volume parameter prediction with decentralized data
Background
The widespread adoption of knowledge-based planning in radiation oncology clinics is hindered by the lack of data and the difficulty associated with sharing medical data.
Purpose
This study aims to assess the feasibility of mitigating this challenge through federated learning (FL): a centralized model trained with distributed datasets, while keeping data localized and private.
Methods
This concept was tested using 273 prostate 45 Gy plans. The cases were split into a training set with 220 cases and a validation set with 53 cases. The training set was further separated into 10 subsets to simulate treatment plans from different clinics. A gradient-boosting model was used to predict bladder and rectum V30Gy, V35Gy, and V40Gy. The Federated Averaging algorithm was employed to aggregate the individual model weights from distributed datasets. Grid search with five-fold in-training-set cross-validation was implemented to tune model hyperparameters. Additionally, we evaluated the robustness of the FL approach by varying the distribution of the training set data in several scenarios, including different numbers of sites and imbalanced data across sites.
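The aggregation step can be illustrated with a minimal Python sketch of Federated Averaging. The abstract does not say how the gradient-boosting weights are combined, so this sketch substitutes a plain linear regressor whose weight vector can be averaged directly. The synthetic data, the function names (`local_update`, `fed_avg`, `train_federated`), and the learning-rate and round settings are all illustrative assumptions, not the authors' implementation. Only the 220-case training set split into 10 simulated sites comes from the text.

```python
import numpy as np

def local_update(w, X, y, lr=0.05, epochs=5):
    """Run a few full-batch least-squares gradient steps on one site's data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(site_weights, site_sizes):
    """Average the site weight vectors, weighting each site by its case count."""
    sizes = np.asarray(site_sizes, dtype=float)
    return np.average(np.stack(site_weights), axis=0, weights=sizes)

def train_federated(sites, n_features, rounds=50):
    """Alternate local training and server-side averaging; raw data never leaves a site."""
    w = np.zeros(n_features)
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in sites]
        w = fed_avg(updates, [len(y) for _, y in sites])
    return w

# Synthetic stand-in for plan features and a dose-volume target (assumption).
rng = np.random.default_rng(0)
true_w = np.array([0.5, -0.3, 0.2])
X = rng.normal(size=(220, 3))                      # 220 training cases, as in the text
y = X @ true_w + rng.normal(scale=0.01, size=220)

# Split the training set into 10 subsets to mimic 10 clinics.
sites = list(zip(np.array_split(X, 10), np.array_split(y, 10)))
w_fl = train_federated(sites, n_features=3)
```

Weighting by case count matters mainly in the imbalanced scenarios, where a plain unweighted mean would give a small site the same influence as a large one.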
Results
The mean absolute error (MAE) for the FL model (4.7% ± 2.9%) is significantly lower than that of individual models trained separately (6.5% ± 4.9%, p < 0.001) and similar to that of a traditional centralized model (4.4% ± 2.8%, p = 0.14). The federated model is robust to the number of subsets, showing MAE of 4.7% ± 3.2%, 4.8% ± 3.1%, 4.8% ± 2.9%, 4.5% ± 2.8%, 4.9% ± 3.3%, and 4.8% ± 3.1% for 5, 10, 15, 20, 25, and 30 subsets, respectively. For the two imbalanced datasets, the FL model achieves MAEs of 4.5% ± 2.9% and 5.6% ± 4.0%, non-inferior to the balanced data model. For all bladder and rectum metrics, the FL model significantly outperforms 36.7% of individual models.
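As a reading aid for the "MAE ± spread" figures above, here is a small Python sketch of how such a summary could be computed from predicted and planned dose-volume values. The abstract does not state whether the ± term is a standard deviation, so treating it as the sample standard deviation of the absolute errors is an assumption, as are the function name `mae_summary` and the toy numbers.

```python
import numpy as np

def mae_summary(predicted, actual):
    """Return (mean, sample standard deviation) of the absolute errors."""
    abs_err = np.abs(np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float))
    return abs_err.mean(), abs_err.std(ddof=1)

# Toy values in percent of organ volume (illustrative only, not study data).
mae, sd = mae_summary([10.0, 20.0, 30.0], [12.0, 20.0, 26.0])
print(f"MAE = {mae:.1f}% ± {sd:.1f}%")
```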
Conclusions
This study demonstrates the potential advantages of implementing a federated model over training individual models: the proposed FL approach achieves prediction accuracy similar to that of a conventional model without requiring centralized data storage. Even when local models struggle to produce accurate predictions due to data scarcity, the federated model consistently maintains high performance.
About the journal:
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.