Heekyong Park, Taowei David Wang, Nich Wattanasin, Victor M Castro, Vivian Gainer, Sergey Goryachev, Shawn Murphy
Title: The Digital Analytic Patient Reviewer (DAPR) for COVID-19 Data Mart Validation
Journal: Methods of Information in Medicine, vol. 61, no. 5-06, pp. 167-173 (published 2022-12-01)
DOI: 10.1055/a-1938-0436 (https://doi.org/10.1055/a-1938-0436)
Citations: 0
Abstract
Objective: To provide high-quality data for coronavirus disease 2019 (COVID-19) research, we validated derived COVID-19 clinical indicators and 22 associated machine learning phenotypes in the Mass General Brigham (MGB) COVID-19 Data Mart.
Methods: Fifteen reviewers performed a retrospective manual chart review for 150 COVID-19-positive patients in the data mart. To support rapid chart review for a wide range of target data, we offered a natural language processing (NLP)-based chart review tool, the Digital Analytic Patient Reviewer (DAPR). For this work, we designed a dedicated patient summary view and developed 127 new NLP logics to extract COVID-19-relevant medical concepts and target phenotypes. Moreover, we adapted DAPR for research use so that patient information is used only for an approved research purpose, and we enabled fast access to the integrated patient information. Lastly, we conducted a survey to evaluate the difficulty of the validation and the usefulness of DAPR.
Results: The concepts for COVID-19-positive cohort, COVID-19 index date, COVID-19-related admission, and the admission date showed high values in all evaluation metrics. However, three phenotypes showed notable performance degradation relative to their positive predictive value in the prepandemic population. Based on these results, we removed the three phenotypes from our data mart. In the survey about using the tool, participants expressed positive attitudes toward using DAPR for chart review. They reported that the validation was easy and that DAPR helped them find relevant information. Some validation difficulties were also discussed.
Conclusion: Use of NLP technology in the chart review helped to cope with the challenges of the COVID-19 data validation task and accelerated the process. As a result, we could provide more reliable research data promptly and respond to the COVID-19 crisis. DAPR's benefit can be expanded to other domains. We plan to operationalize it for wider research groups.
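The Results compare each phenotype's positive predictive value (PPV) in the chart-reviewed COVID-19 cohort with its PPV in the prepandemic population. The following is a minimal Python sketch of that kind of comparison, not the authors' implementation: the `ReviewedCase` record, the `is_degraded` helper, the 0.10 drop threshold, and all numbers in the toy data are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReviewedCase:
    """One chart-reviewed patient for a single phenotype (hypothetical record)."""
    predicted_positive: bool  # the phenotype algorithm flagged the patient
    chart_confirmed: bool     # the reviewer confirmed the condition in the chart


def positive_predictive_value(cases: List[ReviewedCase]) -> Optional[float]:
    """PPV = confirmed positives / all predicted positives.

    Returns None when no patient was predicted positive, because the
    ratio is undefined in that case.
    """
    flagged = [c for c in cases if c.predicted_positive]
    if not flagged:
        return None
    confirmed = sum(1 for c in flagged if c.chart_confirmed)
    return confirmed / len(flagged)


def is_degraded(current_ppv: float, baseline_ppv: float, max_drop: float = 0.10) -> bool:
    """Flag a phenotype whose PPV fell more than max_drop below its baseline.

    The 0.10 default is an assumed threshold, not a value from the paper.
    """
    return (baseline_ppv - current_ppv) > max_drop


# Toy data: 10 flagged patients (6 confirmed, 4 not) and 5 unflagged patients.
cases = (
    [ReviewedCase(True, True)] * 6
    + [ReviewedCase(True, False)] * 4
    + [ReviewedCase(False, False)] * 5
)

ppv = positive_predictive_value(cases)
degraded = is_degraded(ppv, baseline_ppv=0.90)  # 0.90 is an assumed prepandemic PPV
print(f"PPV = {ppv:.2f}, degraded = {degraded}")
```

In this sketch a phenotype would be a candidate for removal from a data mart when `is_degraded` returns `True`; an actual decision rule would depend on sample size and the study's own criteria.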
About the journal:
Good medicine and good healthcare demand good information. Since the journal's founding in 1962, Methods of Information in Medicine has stressed the methodology and scientific fundamentals of organizing, representing and analyzing data, information and knowledge in biomedicine and health care. Covering publications in the fields of biomedical and health informatics, medical biometry, and epidemiology, the journal publishes original papers, reviews, reports, opinion papers, editorials, and letters to the editor. From time to time, the journal publishes articles on particular focus themes as part of a journal's issue.