Pathophysiological Features in Electronic Medical Records Sustain Model Performance under Temporal Dataset Shift

Raphael Brosula, Conor K Corbin, Jonathan H Chen
Journal: AMIA Joint Summits on Translational Science proceedings
Publication date: 2024-05-31
Publication type: Journal Article
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11141811/pdf/
Citations: 0

Abstract

Access to real-world data streams like electronic medical records (EMRs) has accelerated the development of supervised machine learning (ML) models for clinical applications. However, few studies investigate the differential impact of particular features in the EMR on model performance under temporal dataset shift. To explain how features in the EMR impact models over time, this study aggregates features into feature groups by their source (e.g. medication orders, diagnosis codes and lab results) and feature categories based on their reflection of patient pathophysiology or healthcare processes. We adapt Shapley values to explain feature groups' and feature categories' marginal contribution to initial and sustained model performance. We investigate three standard clinical prediction tasks and find that while feature contributions to initial performance differ across tasks, pathophysiological features help mitigate temporal discrimination deterioration. These results provide interpretable insights on how specific feature groups contribute to model performance and robustness to temporal dataset shift.
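
The abstract does not spell out how Shapley values are adapted to feature groups, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It treats each feature group as a player and a performance score (for example, AUROC of a model trained on a subset of groups) as the value function. The function name `group_shapley`, the group names `labs` and `meds`, and all AUROC numbers are hypothetical and made up for illustration.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, value):
    """Exact Shapley value of each feature group.

    groups: list of group names (the "players").
    value:  callable mapping a frozenset of group names to a performance
            score, e.g. AUROC of a model trained on only those groups.
    """
    n = len(groups)
    phi = {}
    for g in groups:
        others = [h for h in groups if h != g]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude g
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of g when added to coalition s
                total += weight * (value(s | {g}) - value(s))
        phi[g] = total
    return phi


# Hypothetical AUROC per subset of feature groups (illustrative numbers only,
# not results from the paper).
TOY_AUROC = {
    frozenset(): 0.50,
    frozenset({"labs"}): 0.70,
    frozenset({"meds"}): 0.60,
    frozenset({"labs", "meds"}): 0.76,
}

phi = group_shapley(["labs", "meds"], TOY_AUROC.__getitem__)
print(phi)
```

Under this formulation the group contributions sum to the gap between the full model's score and the empty-set baseline. To look at sustained rather than initial performance, one could, as an assumption, swap in a value function that returns the change in AUROC between an early and a later evaluation period. Exact enumeration is only practical for a small number of groups, because the number of subsets grows exponentially.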
