A comprehensive EHR timeseries pre-training benchmark

Matthew B. A. McDermott, Bret A. Nestor, Evan Kim, Wancong Zhang, A. Goldenberg, Peter Szolovits, M. Ghassemi
DOI: 10.1145/3450439.3451877
Published in: Proceedings of the ACM Conference on Health, Inference, and Learning
Publication date: 2021-04-08
Citations: 30

Abstract

Pre-training (PT) has been used successfully in many areas of machine learning. One area where PT would be extremely impactful is over electronic health record (EHR) data. Successful PT strategies on this modality could improve model performance in data-scarce contexts such as modeling for rare diseases or allowing smaller hospitals to benefit from data from larger health systems. While many PT strategies have been explored in other domains, much less exploration has occurred for EHR data. One reason for this may be the lack of standardized benchmarks suitable for developing and testing PT algorithms. In this work, we establish a PT benchmark dataset for EHR timeseries data, establishing cohorts, a diverse set of fine-tuning tasks, and PT-focused evaluation regimes across two public EHR datasets: MIMIC-III and eICU. This benchmark fills an essential hole in the field by enabling a robust manner of iterating on PT strategies for this modality. To show the value of this benchmark and provide baselines for further research, we also profile two simple PT algorithms: a self-supervised, masked imputation system and a weakly-supervised, multi-task system. We find that PT strategies (in particular weakly-supervised PT methods) can offer significant gains over traditional learning in few-shot settings, especially on tasks with strong class imbalance. Our full benchmark and code are publicly available at https://github.com/mmcdermott/comprehensive_MTL_EHR
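The abstract names a self-supervised, masked-imputation pre-training objective. As a minimal, hypothetical sketch (not the benchmark's actual code; function names, the toy vitals series, and the mean-value imputer are illustrative assumptions), the objective masks random timesteps of an EHR timeseries and scores a model only on how well it reconstructs the hidden values:

```python
import random

def mask_timeseries(series, mask_rate=0.15, mask_value=0.0, seed=0):
    """Randomly hide a fraction of timesteps.

    Returns the masked copy plus a dict mapping each masked index to its
    true value. A model pre-trained under this objective sees only the
    masked series and learns to reconstruct the hidden entries.
    """
    rng = random.Random(seed)
    masked = list(series)
    hidden = {}
    for t in range(len(series)):
        if rng.random() < mask_rate:
            hidden[t] = series[t]
            masked[t] = mask_value
    return masked, hidden

def imputation_loss(predictions, hidden):
    """Mean squared error computed only over the masked timesteps."""
    if not hidden:
        return 0.0
    return sum((predictions[t] - v) ** 2 for t, v in hidden.items()) / len(hidden)

# Toy example: a short temperature-like series and a naive baseline
# imputer that predicts the mean of the observed (masked) series.
series = [36.6, 36.8, 37.1, 38.0, 37.4, 36.9]
masked, hidden = mask_timeseries(series, mask_rate=0.5, seed=1)
mean_pred = sum(masked) / len(masked)
preds = [mean_pred] * len(series)
loss = imputation_loss(preds, hidden)
```

In a real pre-training loop the mean-value imputer would be replaced by a sequence model, and the loss on masked positions would drive gradient updates before fine-tuning on downstream tasks.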