An empirical framework for domain generalization in clinical settings

Haoran Zhang, Natalie Dullerud, L. Seyyed-Kalantari, Q. Morris, Shalmali Joshi, M. Ghassemi
{"title":"An empirical framework for domain generalization in clinical settings","authors":"Haoran Zhang, Natalie Dullerud, L. Seyyed-Kalantari, Q. Morris, Shalmali Joshi, M. Ghassemi","doi":"10.1145/3450439.3451878","DOIUrl":null,"url":null,"abstract":"Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances across environments. In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data. We introduce a framework to induce synthetic but realistic domain shifts and sampling bias to stress-test these methods over existing non-healthcare benchmarks. We find that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data, in line with prior work on general imaging datasets. However, a subset of realistic induced-shift scenarios in clinical time series data exhibit limited performance gains. We characterize these scenarios in detail, and recommend best practices for domain generalization in the clinical setting.","PeriodicalId":87342,"journal":{"name":"Proceedings of the ACM Conference on Health, Inference, and Learning","volume":"12 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"39","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Conference on Health, Inference, and Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3450439.3451878","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 39

Abstract

Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem by creating models that learn invariances across environments. In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data. We introduce a framework to induce synthetic but realistic domain shifts and sampling bias to stress-test these methods over existing non-healthcare benchmarks. We find that current domain generalization methods do not achieve significant gains in out-of-distribution performance over empirical risk minimization on real-world medical imaging data, in line with prior work on general imaging datasets. However, a subset of realistic induced-shift scenarios in clinical time series data exhibit limited performance gains. We characterize these scenarios in detail, and recommend best practices for domain generalization in the clinical setting.
在临床设置领域概括的经验框架
临床机器学习模型在训练期间未见的数据集(例如新医院或人口)中表现明显下降。领域泛化的最新发展通过创建学习跨环境不变性的模型,为这个问题提供了一个有希望的解决方案。在这项工作中,我们对八种领域泛化方法在多地点临床时间序列和医学影像数据上的性能进行了基准测试。我们引入了一个框架,以诱导合成但现实的域转移和抽样偏差,以在现有的非医疗保健基准上对这些方法进行压力测试。我们发现,目前的领域泛化方法在实际医学成像数据上的失分布性能并没有显著优于经验风险最小化,这与之前在一般成像数据集上的工作一致。然而,临床时间序列数据中现实的诱导转移场景的子集表现出有限的性能收益。我们详细描述了这些场景,并推荐了在临床环境中进行领域概括的最佳实践。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信