Foundation Ark: Accruing and Reusing Knowledge for Superior and Robust Performance.

Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention. Pub Date: 2023-10-01. DOI: 10.1007/978-3-031-43907-0_62

DongAo Ma, Jiaxuan Pang, Michael B Gotway, Jianming Liang

Volume 14220, pages 651-662. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11095392/pdf/


Abstract

Deep learning now offers expert-level, and sometimes even super-expert-level, performance, but achieving it demands massive annotated training data (e.g., Google's proprietary CXR Foundation Model (CXR-FM) was trained on 821,544 labeled and mostly private chest X-rays (CXRs)). Numerous medical imaging datasets are publicly available, but they are individually small and heterogeneous in their expert labels. We envision a powerful and robust foundation model that can be trained by aggregating numerous small public datasets. To realize this vision, we have developed Ark, a framework that accrues and reuses knowledge from heterogeneous expert annotations in various datasets. As a proof of concept, we have trained two Ark models on 335,484 and 704,363 CXRs, respectively, by merging several datasets including ChestX-ray14, CheXpert, MIMIC-II, and VinDr-CXR. We evaluated them on a wide range of imaging tasks covering both classification and segmentation via fine-tuning, linear probing, and gender-bias analysis, and demonstrated Ark's superior and robust performance over the state-of-the-art (SOTA) fully/self-supervised baselines and Google's proprietary CXR-FM. We attribute this enhanced performance to a simple yet powerful observation: aggregating numerous public datasets diversifies patient populations and accrues knowledge from diverse experts, yielding unprecedented performance while saving annotation cost. With all code and pretrained models released at GitHub.com/JLiangLab/Ark, we hope that Ark exerts an important impact on open science. Accruing and reusing knowledge from expert annotations in public datasets can potentially surpass the performance of proprietary models trained on unusually large data, which we hope will inspire many more researchers worldwide to share code and datasets to build open foundation models, accelerate open science, and democratize deep learning for medical imaging.
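The abstract does not describe Ark's architecture, so the following is only a generic illustration of one common way to train on datasets whose label sets differ: a shared encoder with a separate classification head per dataset. It is a sketch under stated assumptions, not the authors' implementation. The dataset names, label lists, dimensions, and the `MultiHeadModel` class are all hypothetical.

```python
import math
import random

# Hypothetical label sets: each dataset annotates a different set of findings.
LABEL_SETS = {
    "dataset_a": ["atelectasis", "effusion", "pneumonia"],
    "dataset_b": ["effusion", "cardiomegaly"],
}

FEATURE_DIM = 4  # length of the input vector (a stand-in for an image)
EMBED_DIM = 3    # length of the shared embedding


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MultiHeadModel:
    """One shared encoder plus one multi-label head per dataset (assumed design)."""

    def __init__(self, label_sets, seed=0):
        rng = random.Random(seed)
        self.encoder = make_matrix(EMBED_DIM, FEATURE_DIM, rng)
        self.heads = {
            name: make_matrix(len(labels), EMBED_DIM, rng)
            for name, labels in label_sets.items()
        }

    def embed(self, x):
        """Shared representation, identical no matter which dataset x came from."""
        return [math.tanh(h) for h in matvec(self.encoder, x)]

    def predict(self, dataset, x):
        """Per-label probabilities from the head that matches the dataset."""
        return [sigmoid(z) for z in matvec(self.heads[dataset], self.embed(x))]


def bce(probs, targets):
    """Mean binary cross-entropy over the labels of one sample."""
    eps = 1e-12
    total = sum(
        t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
        for p, t in zip(probs, targets)
    )
    return -total / len(targets)


model = MultiHeadModel(LABEL_SETS)
x = [0.5, -1.0, 0.25, 2.0]

probs_a = model.predict("dataset_a", x)  # three outputs, one per dataset_a label
probs_b = model.predict("dataset_b", x)  # two outputs, one per dataset_b label
loss_a = bce(probs_a, [1, 0, 0])         # loss only against dataset_a's own labels
```

In this design each sample contributes a loss only through the head for its own dataset, so no labels need to be harmonized across datasets, while every dataset still updates the shared encoder.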


