The Digital Analytic Patient Reviewer (DAPR) for COVID-19 Data Mart Validation.

IF 1.8, Zone 4 (Medicine), JCR Q3: COMPUTER SCIENCE, INFORMATION SYSTEMS

Methods of Information in Medicine Pub Date : 2022-12-01 DOI:10.1055/a-1938-0436

Heekyong Park, Taowei David Wang, Nich Wattanasin, Victor M Castro, Vivian Gainer, Sergey Goryachev, Shawn Murphy

Volume 61, issue 5-06, pages 167-173. Publication type: Journal Article.


Abstract

Objective: To provide high-quality data for coronavirus disease 2019 (COVID-19) research, we validated derived COVID-19 clinical indicators and 22 associated machine learning phenotypes in the Mass General Brigham (MGB) COVID-19 Data Mart.

Methods: Fifteen reviewers performed a retrospective manual chart review of 150 COVID-19-positive patients in the data mart. To support rapid chart review across a wide range of target data, we provided a natural language processing (NLP)-based chart review tool, the Digital Analytic Patient Reviewer (DAPR). For this work, we designed a dedicated patient summary view and developed 127 new NLP logics to extract COVID-19-relevant medical concepts and target phenotypes. Moreover, we adapted DAPR for research use so that patient information is used only for an approved research purpose, and we enabled fast access to the integrated patient information. Lastly, we conducted a survey to evaluate the difficulty of the validation and the usefulness of DAPR.

Results: The concepts for COVID-19-positive cohort, COVID-19 index date, COVID-19-related admission, and admission date scored high on all evaluation metrics. However, three phenotypes performed notably worse than their positive predictive value in the prepandemic population. Based on these results, we removed the three phenotypes from our data mart. In the survey about using the tool, participants expressed positive attitudes toward using DAPR for chart review. They reported that the validation was easy and that DAPR helped them find relevant information. Some validation difficulties were also discussed.
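The comparison described in the Results can be illustrated with a small sketch. It is not the authors' code: the `ReviewedCase` record, the function names, and the 0.10 tolerance are all hypothetical, chosen only to show how a positive predictive value (PPV) from chart review might be compared against a prepandemic baseline.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ReviewedCase:
    """One chart-reviewed patient for a single phenotype (hypothetical record)."""
    predicted: bool  # the data mart flags the patient as having the phenotype
    confirmed: bool  # the chart reviewer confirmed the phenotype


def positive_predictive_value(cases: Iterable[ReviewedCase]) -> Optional[float]:
    """PPV = confirmed flagged cases / all flagged cases; None if nothing was flagged."""
    flagged = [c for c in cases if c.predicted]
    if not flagged:
        return None
    return sum(c.confirmed for c in flagged) / len(flagged)


def is_degraded(ppv_now: Optional[float], ppv_baseline: float,
                tolerance: float = 0.10) -> bool:
    """True when PPV fell below the baseline by more than `tolerance`.

    The tolerance is an illustrative assumption, not a threshold from the paper.
    """
    return ppv_now is not None and (ppv_baseline - ppv_now) > tolerance


# Example: four flagged patients, three confirmed on chart review.
cases = [
    ReviewedCase(predicted=True, confirmed=True),
    ReviewedCase(predicted=True, confirmed=True),
    ReviewedCase(predicted=True, confirmed=True),
    ReviewedCase(predicted=True, confirmed=False),
    ReviewedCase(predicted=False, confirmed=False),
]
ppv = positive_predictive_value(cases)
```

Patients the data mart did not flag are ignored here because PPV depends only on the flagged cases; metrics such as sensitivity would also need the unflagged cases.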

Conclusion: Use of NLP technology in the chart review helped to cope with the challenges of the COVID-19 data validation task and accelerated the process. As a result, we could provide more reliable research data promptly and respond to the COVID-19 crisis. DAPR's benefit can be expanded to other domains. We plan to operationalize it for wider research groups.


Source journal

Methods of Information in Medicine (Medicine; Computer Science: Information Systems)

CiteScore: 3.70

Self-citation rate: 11.80%

Review time: 6-12 weeks

About the journal: Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal issue.