Electronic Health Record (EHR) Enhanced Signal Detection Using Tree-Based Scan Statistic Methods.

Impact factor 4.8 · Zone 2 (Medicine) · JCR Q1: Public, Environmental & Occupational Health
Massimiliano Russo, Sushama Kattinakere Sreedhara, Joshua Smith, Sharon E Davis, Judith C Maro, Thomas Deramus, Joyce Lii, Jie Yang, Rishi Desai, José J Hernández-Muñoz, Yong Ma, Youjin Wang, Jamal T Jones, Shirley V Wang
American Journal of Epidemiology. Journal article, published 2025-09-08. DOI: 10.1093/aje/kwaf199 (https://doi.org/10.1093/aje/kwaf199)
Citations: 0

Abstract

Tree-based scan statistics (TBSS) are data mining methods that screen thousands of hierarchically related health outcomes to detect unsuspected adverse drug effects. TBSS traditionally analyze claims data with outcomes defined via diagnosis codes. TBSS have not previously been applied to the rich clinical information in electronic health records (EHR). We developed approaches for integrating EHR data into TBSS analyses, including outcomes derived from natural language processing (NLP) applied to clinical notes and laboratory results, related via multipath hierarchical structures. We consider four settings that sequentially add sources of outcomes to the TBSS tree: 1) diagnosis codes, 2) NLP-derived outcomes, 3) binary outcomes from lab results, and 4) continuous lab results. In a comparative cohort study of second-generation sulfonylureas (SUs) versus dipeptidyl peptidase 4 (DPP-4) inhibitors among adults with type 2 diabetes, with an a priori expected signal of hypoglycemia, diagnosis code data showed no statistical alerts for inpatient or emergency department settings. Adding NLP-derived outcomes resulted in an alert for "Headaches" (p=0.047), a nonspecific symptom of hypoglycemia. Progressively adding binary and continuous lab results produced the same alert. Integrating EHR data into TBSS can be useful for detecting safety signals that warrant further investigation.
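To make the screening idea in the abstract concrete, the following is a minimal sketch of one common TBSS variant: an unconditional Bernoulli tree scan with a Monte Carlo p-value. The abstract does not state which TBSS model, exposure probability, or hierarchy the authors used, so everything here is an assumption for illustration: the null probability `p = 0.5` (as under 1:1 matching), the node and leaf names, and all counts are hypothetical and are not study data. A multipath hierarchy is represented by letting a leaf (here `nlp_headache`) appear under more than one parent node.

```python
import math
import random


def bernoulli_llr(c, n, p):
    """Log-likelihood ratio for c exposed-group events out of n, null probability p."""
    if n == 0 or c / n <= p:
        return 0.0  # one-sided: only an excess in the exposed group counts
    llr = -c * math.log(p) - (n - c) * math.log(1 - p)
    if c > 0:
        llr += c * math.log(c / n)
    if n - c > 0:
        llr += (n - c) * math.log((n - c) / n)
    return llr


def max_llr(exposed, total, node_leaves, p):
    """Scan every node (cut) and return the one with the largest LLR."""
    best_node, best = None, 0.0
    for node, leaves in node_leaves.items():
        c = sum(exposed[leaf] for leaf in leaves)
        n = sum(total[leaf] for leaf in leaves)
        llr = bernoulli_llr(c, n, p)
        if llr > best:
            best_node, best = node, llr
    return best_node, best


def tree_scan(exposed, total, node_leaves, p=0.5, n_sim=999, seed=0):
    """Return (top node, observed max LLR, Monte Carlo p-value)."""
    rng = random.Random(seed)
    node, observed = max_llr(exposed, total, node_leaves, p)
    exceed = 0
    for _ in range(n_sim):
        # Under the null, each event falls in the exposed group with probability p.
        sim = {leaf: sum(rng.random() < p for _ in range(n))
               for leaf, n in total.items()}
        _, t = max_llr(sim, total, node_leaves, p)
        if t >= observed:
            exceed += 1
    # Taking the maximum over all nodes adjusts for multiple testing.
    return node, observed, (exceed + 1) / (n_sim + 1)


# Hypothetical toy data: event counts per leaf outcome (both groups, exposed only).
total = {"dx_hypoglycemia": 20, "nlp_headache": 40,
         "lab_low_glucose": 30, "dx_fracture": 50}
exposed = {"dx_hypoglycemia": 14, "nlp_headache": 30,
           "lab_low_glucose": 20, "dx_fracture": 25}

# Each node lists the leaves beneath it; nlp_headache has two parents (multipath).
node_leaves = {leaf: [leaf] for leaf in total}
node_leaves["hypoglycemia_related"] = ["dx_hypoglycemia", "nlp_headache",
                                       "lab_low_glucose"]
node_leaves["neurologic_symptoms"] = ["nlp_headache"]
node_leaves["root"] = list(total)

top_node, top_llr, p_value = tree_scan(exposed, total, node_leaves)
print(top_node, round(top_llr, 3), p_value)
```

Because the p-value is computed from the distribution of the maximum statistic over the whole tree, adding new outcome sources (NLP-derived or lab-derived leaves) changes both the observed maximum and the null distribution, which is one reason the same data can alert in one setting and not in another.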

Source journal: American Journal of Epidemiology
Category: Medicine (Public, Environmental & Occupational Health)
CiteScore: 7.40
Self-citation rate: 4.00%
Articles published: 221
Review time: 3-6 weeks
About the journal: The American Journal of Epidemiology is the oldest and one of the premier epidemiologic journals devoted to the publication of empirical research findings, opinion pieces, and methodological developments in the field of epidemiologic research. It is a peer-reviewed journal aimed at both fellow epidemiologists and those who use epidemiologic data, including public health workers and clinicians.