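Exploring the Applicability of Using Natural Language Processing to Support Nationwide Venous Thromboembolism Surveillance: Model Evaluation Study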

Aaron Wendelboe, Ibrahim Saber, Justin Dvorak, Alys Adamski, Natalie Feland, Nimia Reyes, Karon Abe, Thomas Ortel, Gary Raskob
Journal: JMIR Bioinformatics and Biotechnology, 3(1):e36877
Publication date: 2022-05-08
Publication type: Journal Article
DOI: 10.2196/36877 (https://doi.org/10.2196/36877)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193259/pdf/
Citation count: 0

Abstract



Background: Venous thromboembolism (VTE) is a preventable, common vascular disease that has been estimated to affect up to 900,000 people per year. It has been associated with risk factors such as recent surgery, cancer, and hospitalization. VTE surveillance for patient management and safety can be improved via natural language processing (NLP). NLP tools can access electronic medical records, identify patients who meet the VTE case definition, and then enter the relevant information into a database for hospital review.

Objective: We aimed to evaluate the performance of a VTE identification model of IDEAL-X (Information and Data Extraction Using Adaptive Learning; Emory University), an NLP tool, in automatically classifying cases of VTE by "reading" unstructured text from diagnostic imaging records collected from 2012 to 2014.

Methods: After accessing imaging records from pilot surveillance systems for VTE from Duke University and the University of Oklahoma Health Sciences Center (OUHSC), we used a VTE identification model of IDEAL-X to classify cases of VTE that had previously been manually classified. Experts reviewed the technicians' comments in each record to determine if a VTE event occurred. The performance measures calculated (with 95% CIs) were accuracy, sensitivity, specificity, and positive and negative predictive values. Chi-square tests of homogeneity were conducted to evaluate differences in performance measures by site, using a significance level of .05.
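
The five performance measures named in the Methods can all be derived from a 2x2 table that compares the model's classification with the manual review. The following is a minimal sketch, not the authors' code: the `Confusion` class, the function names, and the counts are hypothetical, and the Wald (normal-approximation) interval is an assumed choice, since the abstract does not state which confidence interval method was used.

```python
import math
from dataclasses import dataclass


@dataclass
class Confusion:
    """2x2 table of model output versus manual (reference) classification."""
    tp: int  # model: VTE, manual review: VTE
    fp: int  # model: VTE, manual review: no VTE
    fn: int  # model: no VTE, manual review: VTE
    tn: int  # model: no VTE, manual review: no VTE


def proportion_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) using a Wald interval.

    The Wald method is an assumption made for illustration; the abstract
    does not say how its 95% CIs were computed.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def performance(c):
    """Compute accuracy, sensitivity, specificity, PPV, and NPV with CIs."""
    n = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": proportion_ci(c.tp + c.tn, n),
        "sensitivity": proportion_ci(c.tp, c.tp + c.fn),
        "specificity": proportion_ci(c.tn, c.tn + c.fp),
        "ppv": proportion_ci(c.tp, c.tp + c.fp),
        "npv": proportion_ci(c.tn, c.tn + c.fn),
    }


# Hypothetical counts for illustration only; these are not the study's data.
example = Confusion(tp=90, fp=10, fn=5, tn=95)
for name, (est, lo, hi) in performance(example).items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Sensitivity and specificity condition on the manual classification, whereas the predictive values condition on the model's output, so the predictive values also depend on how common VTE is among the records reviewed.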

Results: The VTE model of IDEAL-X "read" 1591 records from Duke University and 1487 records from the OUHSC, for a total of 3078 records. The combined performance measures were 93.7% accuracy (95% CI 93.7%-93.8%), 96.3% sensitivity (95% CI 96.2%-96.4%), 92% specificity (95% CI 91.9%-92%), an 89.1% positive predictive value (95% CI 89%-89.2%), and a 97.3% negative predictive value (95% CI 97.3%-97.4%). The sensitivity was higher at Duke University (97.9%, 95% CI 97.8%-98%) than at the OUHSC (93.3%, 95% CI 93.1%-93.4%; P<.001), but the specificity was higher at the OUHSC (95.9%, 95% CI 95.8%-96%) than at Duke University (86.5%, 95% CI 86.4%-86.7%; P<.001).
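
A site comparison such as the sensitivity difference above can be illustrated with a Pearson chi-square test on a 2x2 table (site by correct or incorrect classification). This is a sketch under stated assumptions: the function name and the counts are hypothetical, no continuity correction is applied, and the abstract does not report the underlying per-site counts or the software used.

```python
import math


def chi_square_homogeneity_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

        site 1: a correct, b incorrect
        site 2: c correct, d incorrect

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts for illustration only; these are not the study's data.
# Site 1 detects 95 of 100 true VTE cases, site 2 detects 80 of 100.
stat, p_value = chi_square_homogeneity_2x2(95, 5, 80, 20)
alpha = 0.05  # significance level stated in the Methods
print(f"chi-square = {stat:.2f}, P = {p_value:.4f}, significant: {p_value < alpha}")
```

The same function can be applied to specificity by tabulating correctly and incorrectly classified non-VTE records for each site.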

Conclusions: The VTE model of IDEAL-X accurately classified cases of VTE from the pilot surveillance systems of two separate health systems in Durham, North Carolina, and Oklahoma City, Oklahoma. NLP is a promising tool for the design and implementation of an automated, cost-effective national surveillance system for VTE. Conducting public health surveillance at a national scale is important for measuring disease burden and the impact of prevention measures. We recommend additional studies to identify how integrating IDEAL-X in a medical record system could further automate the surveillance process.
