The Digital Analytic Patient Reviewer (DAPR) for COVID-19 Data Mart Validation.

IF 1.3; Zone 4 (Medicine); JCR Q3 (Computer Science, Information Systems)
Heekyong Park, Taowei David Wang, Nich Wattanasin, Victor M Castro, Vivian Gainer, Sergey Goryachev, Shawn Murphy
Methods of Information in Medicine, volume 61, issue 5-06, pages 167-173. Published 2022-12-01. DOI: 10.1055/a-1938-0436 (https://doi.org/10.1055/a-1938-0436)

Abstract

Objective: To provide high-quality data for coronavirus disease 2019 (COVID-19) research, we validated derived COVID-19 clinical indicators and 22 associated machine learning phenotypes in the Mass General Brigham (MGB) COVID-19 Data Mart.

Methods: Fifteen reviewers performed a retrospective manual chart review of 150 COVID-19-positive patients in the data mart. To support rapid chart review across a wide range of target data, we provided a natural language processing (NLP)-based chart review tool, the Digital Analytic Patient Reviewer (DAPR). For this work, we designed a dedicated patient summary view and developed 127 new NLP logics to extract COVID-19-relevant medical concepts and target phenotypes. Moreover, we adapted DAPR for research use so that patient information is used only for an approved research purpose, and we enabled fast access to the integrated patient information. Lastly, we conducted a survey to evaluate the difficulty of validation and the usefulness of DAPR.

Results: The concepts for the COVID-19-positive cohort, COVID-19 index date, COVID-19-related admission, and admission date had high values on all evaluation metrics. However, three phenotypes performed notably worse than their positive predictive value in the prepandemic population. Based on these results, we removed the three phenotypes from our data mart. In the survey about using the tool, participants expressed positive attitudes toward using DAPR for chart review. They reported that the validation was easy and that DAPR helped them find relevant information. Some validation difficulties were also discussed.

Conclusion: Using NLP technology in the chart review helped us cope with the challenges of the COVID-19 data validation task and accelerated the process. As a result, we could provide more reliable research data promptly and respond to the COVID-19 crisis. DAPR's benefits can extend to other domains. We plan to operationalize it for wider research groups.
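
The Results compare each phenotype's positive predictive value (PPV) from chart review against its PPV in the prepandemic population. The following is a minimal Python sketch of that kind of comparison, not the authors' implementation. The `ReviewCounts` structure, the `degraded` helper, and the 0.10 drop threshold are all hypothetical and chosen only for illustration; the abstract does not state how "notable" degradation was defined.

```python
from dataclasses import dataclass


@dataclass
class ReviewCounts:
    """Chart-review outcome for one phenotype (hypothetical structure)."""
    true_positives: int   # phenotype assigned and confirmed by a reviewer
    false_positives: int  # phenotype assigned but rejected by a reviewer


def ppv(counts: ReviewCounts) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    flagged = counts.true_positives + counts.false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no patients were flagged")
    return counts.true_positives / flagged


def degraded(counts: ReviewCounts, baseline_ppv: float,
             max_drop: float = 0.10) -> bool:
    """True if PPV fell more than max_drop below the baseline.

    max_drop is an assumed cutoff, not a value from the paper.
    """
    return baseline_ppv - ppv(counts) > max_drop
```

With a rule like this, a phenotype whose reviewed PPV falls well below its prepandemic baseline would be flagged as a candidate for removal from the data mart, which is the decision the Results describe for three phenotypes.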

Source journal
Methods of Information in Medicine (Medicine; Computer Science, Information Systems)
CiteScore: 3.70
Self-citation rate: 11.80%
Publication volume: 33
Review time: 6-12 weeks
Journal description: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal issue.