Alexander Zhou, Zelong Liu, Andrew Tieu, Nikhil Patel, Sean Sun, Anthony Yang, Peter Choi, Hao-Chih Lee, Mickael Tordjman, Louisa Deyer, Yunhao Mei, Valentin Fauveau, Georgios Soultanidis, Bachir Taouli, Mingqian Huang, Amish Doshi, Zahi A Fayad, Timothy Deyer, Xueyan Mei
{"title":"MRAnnotator:对44个结构进行多解剖和多序列MRI分割。","authors":"Alexander Zhou, Zelong Liu, Andrew Tieu, Nikhil Patel, Sean Sun, Anthony Yang, Peter Choi, Hao-Chih Lee, Mickael Tordjman, Louisa Deyer, Yunhao Mei, Valentin Fauveau, Georgios Soultanidis, Bachir Taouli, Mingqian Huang, Amish Doshi, Zahi A Fayad, Timothy Deyer, Xueyan Mei","doi":"10.1093/radadv/umae035","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>To develop a deep learning model for multi-anatomy segmentation of diverse anatomic structures on MRI.</p><p><strong>Materials and methods: </strong>In this retrospective study, 44 structures were annotated using a model-assisted workflow with manual human finalization in 2 curated datasets: an internal dataset of 1518 MRI sequences (843 patients) from various clinical sites within a health system, and an external dataset of 397 MRI sequences (263 patients) from an independent imaging center for benchmarking. The internal dataset was used to train an nnU-Net model (MRAnnotator), while the external dataset evaluated MRAnnotator's generalizability across significant image acquisition distribution shifts. MRAnnotator was further benchmarked against an nnU-Net model trained on the AMOS dataset and 2 current multi-anatomy MRI segmentation models, TotalSegmentator MRI (TSM) and MRSegmentator (MRS). Performance throughout was quantified using the Dice score.</p><p><strong>Results: </strong>MRAnnotator achieved an overall average Dice score of 0.878 (95% CI: 0.873, 0.884) on the internal dataset test set and 0.875 (95% CI: 0.869, 0.880) on the external dataset benchmark, demonstrating strong generalization (<i>P</i> = .899). On the AMOS test set, MRAnnotator achieved comparable performance for relevant classes (0.889 [0.866, 0.909]) to an AMOS-trained nnU-Net (0.895 [0.871, 0.915]) (<i>P</i> = .361) and outperformed TSM (0.822 [0.800, 0.842], <i>P</i> < .001) and MRS (0.867 [0.844, 0.887], <i>P</i> < .001). TSM and MRS were also evaluated on the relevant classes from the internal and external datasets and were unable to achieve comparable performance to MRAnnotator.</p><p><strong>Conclusion: </strong>MRAnnotator achieves robust and generalizable MRI segmentation across 44 anatomic structures. Future direction will incorporate additional anatomic structures into the datasets and model. Model weights are publicly available on GitHub. 
The external test set with annotations is available upon request.</p>","PeriodicalId":519940,"journal":{"name":"Radiology advances","volume":"2 1","pages":"umae035"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12429175/pdf/","citationCount":"0","resultStr":"{\"title\":\"MRAnnotator: multi-anatomy and many-sequence MRI segmentation of 44 structures.\",\"authors\":\"Alexander Zhou, Zelong Liu, Andrew Tieu, Nikhil Patel, Sean Sun, Anthony Yang, Peter Choi, Hao-Chih Lee, Mickael Tordjman, Louisa Deyer, Yunhao Mei, Valentin Fauveau, Georgios Soultanidis, Bachir Taouli, Mingqian Huang, Amish Doshi, Zahi A Fayad, Timothy Deyer, Xueyan Mei\",\"doi\":\"10.1093/radadv/umae035\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>To develop a deep learning model for multi-anatomy segmentation of diverse anatomic structures on MRI.</p><p><strong>Materials and methods: </strong>In this retrospective study, 44 structures were annotated using a model-assisted workflow with manual human finalization in 2 curated datasets: an internal dataset of 1518 MRI sequences (843 patients) from various clinical sites within a health system, and an external dataset of 397 MRI sequences (263 patients) from an independent imaging center for benchmarking. The internal dataset was used to train an nnU-Net model (MRAnnotator), while the external dataset evaluated MRAnnotator's generalizability across significant image acquisition distribution shifts. MRAnnotator was further benchmarked against an nnU-Net model trained on the AMOS dataset and 2 current multi-anatomy MRI segmentation models, TotalSegmentator MRI (TSM) and MRSegmentator (MRS). Performance throughout was quantified using the Dice score.</p><p><strong>Results: </strong>MRAnnotator achieved an overall average Dice score of 0.878 (95% CI: 0.873, 0.884) on the internal dataset test set and 0.875 (95% CI: 0.869, 0.880) on the external dataset benchmark, demonstrating strong generalization (<i>P</i> = .899). On the AMOS test set, MRAnnotator achieved comparable performance for relevant classes (0.889 [0.866, 0.909]) to an AMOS-trained nnU-Net (0.895 [0.871, 0.915]) (<i>P</i> = .361) and outperformed TSM (0.822 [0.800, 0.842], <i>P</i> < .001) and MRS (0.867 [0.844, 0.887], <i>P</i> < .001). TSM and MRS were also evaluated on the relevant classes from the internal and external datasets and were unable to achieve comparable performance to MRAnnotator.</p><p><strong>Conclusion: </strong>MRAnnotator achieves robust and generalizable MRI segmentation across 44 anatomic structures. Future direction will incorporate additional anatomic structures into the datasets and model. Model weights are publicly available on GitHub. 
The external test set with annotations is available upon request.</p>\",\"PeriodicalId\":519940,\"journal\":{\"name\":\"Radiology advances\",\"volume\":\"2 1\",\"pages\":\"umae035\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-12-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12429175/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Radiology advances\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/radadv/umae035\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/radadv/umae035","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
MRAnnotator: multi-anatomy and many-sequence MRI segmentation of 44 structures.
Purpose: To develop a deep learning model for multi-anatomy segmentation of diverse anatomic structures on MRI.
Materials and methods: In this retrospective study, 44 structures were annotated using a model-assisted workflow with manual human finalization in 2 curated datasets: an internal dataset of 1518 MRI sequences (843 patients) from various clinical sites within a health system, and an external benchmarking dataset of 397 MRI sequences (263 patients) from an independent imaging center. The internal dataset was used to train an nnU-Net model (MRAnnotator), while the external dataset was used to evaluate MRAnnotator's generalizability across significant shifts in image acquisition distribution. MRAnnotator was further benchmarked against an nnU-Net model trained on the AMOS dataset and 2 current multi-anatomy MRI segmentation models, TotalSegmentator MRI (TSM) and MRSegmentator (MRS). Performance throughout was quantified using the Dice score.
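For reference, the sketch below shows how a per-structure Dice score is typically computed from binary segmentation masks and averaged over the 44 labels. This is a minimal illustration only, not the authors' evaluation code; the convention for empty structures and the label numbering are assumptions.

    import numpy as np

    def dice_score(pred, truth):
        # Dice coefficient between two binary masks; returns 1.0 when
        # both masks are empty (a common convention; the paper does not
        # state how empty structures were scored).
        pred = np.asarray(pred, dtype=bool)
        truth = np.asarray(truth, dtype=bool)
        total = pred.sum() + truth.sum()
        if total == 0:
            return 1.0
        return 2.0 * np.logical_and(pred, truth).sum() / float(total)

    def mean_dice(pred_labels, truth_labels, n_classes=44):
        # Per-structure Dice averaged over label IDs 1..n_classes;
        # the label numbering here is illustrative, not the model's.
        scores = [dice_score(pred_labels == c, truth_labels == c)
                  for c in range(1, n_classes + 1)]
        return float(np.mean(scores))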
Results: MRAnnotator achieved an overall average Dice score of 0.878 (95% CI: 0.873, 0.884) on the internal dataset test set and 0.875 (95% CI: 0.869, 0.880) on the external dataset benchmark, demonstrating strong generalization (P = .899 for the internal vs external comparison). On the AMOS test set, MRAnnotator achieved performance on the relevant classes (0.889 [0.866, 0.909]) comparable to that of an AMOS-trained nnU-Net (0.895 [0.871, 0.915]) (P = .361) and outperformed TSM (0.822 [0.800, 0.842], P < .001) and MRS (0.867 [0.844, 0.887], P < .001). TSM and MRS were also evaluated on the relevant classes from the internal and external datasets and did not achieve performance comparable to that of MRAnnotator.
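The abstract does not state how the 95% CIs and P values were derived; the sketch below shows one plausible nonparametric approach (a percentile bootstrap for the CI of the mean Dice, and a two-sided permutation test for the difference in mean Dice between two sets of per-case scores), offered purely as an illustration rather than the study's actual statistical procedure.

    import numpy as np

    def bootstrap_ci(scores, n_boot=10000, alpha=0.05, seed=0):
        # Percentile bootstrap CI for the mean of per-case Dice scores.
        rng = np.random.default_rng(seed)
        scores = np.asarray(scores, dtype=float)
        means = np.array([rng.choice(scores, size=scores.size, replace=True).mean()
                          for _ in range(n_boot)])
        lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
        return scores.mean(), (lo, hi)

    def permutation_p(a, b, n_perm=10000, seed=0):
        # Two-sided permutation test on the difference of mean Dice
        # between two groups of per-case scores.
        rng = np.random.default_rng(seed)
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        observed = abs(a.mean() - b.mean())
        pooled = np.concatenate([a, b])
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(pooled)
            diff = abs(pooled[:a.size].mean() - pooled[a.size:].mean())
            hits += diff >= observed
        return (hits + 1) / (n_perm + 1)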
Conclusion: MRAnnotator achieves robust and generalizable MRI segmentation across 44 anatomic structures. Future work will incorporate additional anatomic structures into the datasets and model. Model weights are publicly available on GitHub. The external test set with annotations is available upon request.