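Development of an Algorithm for Estimating the Likelihood of Venous Thromboembolism in Primary Care Using Structured and Unstructured Electronic Health Record Data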

IF 9.9 | Zone 1 (Medicine) | JCR Q1 (Hematology)
Siona Prasad,Patricia C Dykes,Richard Schreiber,Shadi Hijjawi,Khalid Nawab,Alice Kim,Stuart Lipsitz,Ania Syrowatka,Lipika Samal,David W Bates,Veysel Karani Baris,Tien Thai,Michael Sainlaire,Frank Y Chang,John Novoa-Laurentiev,Gregory Piazza,Wenyu Song
Journal: American Journal of Hematology
Publication date: 2025-10-02
Publication type: Journal Article
DOI: 10.1002/ajh.70096 (https://doi.org/10.1002/ajh.70096)
Citations: 0

Development of an Algorithm for Estimating the Likelihood of Venous Thromboembolism in Primary Care Using Structured and Unstructured Electronic Health Record Data.
Venous thromboembolism (VTE) is a major public health concern. It is often clinically difficult to diagnose and affects up to 900 000 individuals annually in the United States. Delayed or missed VTE diagnosis can impact treatment and increase morbidity and mortality. This retrospective study utilized structured and unstructured electronic health record (EHR) data from a large integrated care network in the northeastern US, focusing on 4678 adult patients presenting with at least one VTE-associated sign or symptom at primary care visits during 2019-2020. Feature selection incorporated expert-guided and data-driven approaches, resulting in a final set of demographic, clinical history, and sign/symptom risk factors. The primary analysis developed seven machine learning models to predict VTE incidence. Secondary analyses included the prediction of timely and delayed VTE diagnoses. All models showed predictive ability with area under the curve (AUC) of 0.83-0.88. The logistic regression model demonstrated robust performance in predicting incident VTE cases, achieving an AUC of 0.88 (95% CI: 0.86-0.90). Multiple risk factors were identified, including cancer history, smoking history, and spinal cord trauma. Variations in the top risk factors between timely and delayed prediction models highlighted how certain patients were more likely to have a delayed or missed diagnosis. This study highlights the potential for data-driven tools to facilitate timely, point-of-care VTE detection by leveraging structured and unstructured EHR data. The prediction model accurately estimated the likelihood of incident VTEs, especially in cases diagnosed late, showing potential to reduce costly diagnostic delays.
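The abstract reports model discrimination as an AUC with a 95% confidence interval, for example 0.88 (95% CI: 0.86-0.90). As a rough illustration of what such a figure means, the sketch below computes an AUC and a percentile bootstrap interval in plain Python. It is not the authors' method: the abstract does not say how the interval was derived, the data here are synthetic, and the names `auc` and `bootstrap_auc_ci` are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    if len(set(labels)) < 2:
        raise ValueError("need both classes to bootstrap the AUC")
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic stand-in for model output: 1 = incident VTE, 0 = no VTE.
# Positive cases get scores shifted upward so the classes partly separate.
rng = random.Random(42)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

point = auc(scores, labels)
low, high = bootstrap_auc_ci(scores, labels)
print(f"AUC = {point:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

In practice the scores would come from a fitted model such as the logistic regression mentioned above, evaluated on held-out patients. The pairwise loop is quadratic in the number of patients, which is fine for a sketch but slow for large cohorts.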
Source journal: American Journal of Hematology
CiteScore: 15.70
Self-citation rate: 3.90%
Articles published: 363
Review time: 3-6 weeks
About the journal: The American Journal of Hematology offers extensive coverage of experimental and clinical aspects of blood diseases in humans and animal models. The journal publishes original contributions in both non-malignant and malignant hematological diseases, encompassing clinical and basic studies in areas such as hemostasis, thrombosis, immunology, blood banking, and stem cell biology. Clinical translational reports highlighting innovative therapeutic approaches for the diagnosis and treatment of hematological diseases are actively encouraged. The American Journal of Hematology features regular original laboratory and clinical research articles, brief research reports, critical reviews, images in hematology, as well as letters and correspondence.