Data Mining Pipeline for COVID-19 Vaccine Safety Analysis Using a Large Electronic Health Record
Yan Huang, Xiaojin Li, Deepa Dongarwar, Hulin Wu, Guo-Qiang Zhang
AMIA Joint Summits on Translational Science proceedings, vol. 2023, pp. 271-280. Published 2023-06-16.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10283124/pdf/2352.pdf
Abstract
We developed a novel data mining pipeline that automatically extracts potential COVID-19 vaccine-related adverse events from a large Electronic Health Record (EHR) dataset. We applied this pipeline to the Optum® de-identified COVID-19 EHR dataset, which contains COVID-19 vaccine records from December 11, 2020 to January 20, 2022. Among 553,682 individuals without COVID-19 infection, we compared post-vaccination diagnoses between the COVID-19 vaccine group and the influenza vaccine group. We extracted 1,414 ICD-10 diagnosis categories (the first three characters of each ICD-10 code) recorded within 180 days after the first dose of the COVID-19 vaccine. We then ranked the diagnosis codes by adverse event rate and by adjusted odds ratio from a self-controlled case series analysis. We used inverse probability of censoring weighting to account for right-censored time-to-event records. Our results show that the COVID-19 vaccine has an adverse event rate similar to that of the influenza vaccine. We identified 20 types of potential COVID-19 vaccine-related adverse events that may need further investigation.
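To make the category-extraction and rate-comparison steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes diagnoses arrive as `(person_id, group, icd10_code)` tuples; the function names, group labels, and toy records are all hypothetical. The odds ratio shown is a crude, unadjusted one with a 0.5 continuity correction. It does not reproduce the adjusted odds ratio from the self-controlled case series analysis or the censoring weights described in the abstract.

```python
from collections import defaultdict


def icd10_category(code: str) -> str:
    """Return the three-character ICD-10 category, e.g. 'I21.4' -> 'I21'."""
    return code.strip().upper().replace(".", "")[:3]


def category_rates(diagnoses, group_sizes):
    """Compute the per-group event rate for each ICD-10 category.

    diagnoses: iterable of (person_id, group, icd10_code) tuples.
    group_sizes: dict mapping group name -> number of people in that group.
    Each person is counted at most once per category and group.
    """
    people = defaultdict(lambda: defaultdict(set))
    for person_id, group, code in diagnoses:
        people[icd10_category(code)][group].add(person_id)
    return {
        cat: {g: len(by_group.get(g, ())) / n for g, n in group_sizes.items()}
        for cat, by_group in people.items()
    }


def crude_odds_ratio(events_a, size_a, events_b, size_b):
    """Unadjusted odds ratio of group A vs. group B.

    Adds 0.5 to every cell so that zero counts do not cause division by zero.
    """
    a, b = events_a + 0.5, size_a - events_a + 0.5
    c, d = events_b + 0.5, size_b - events_b + 0.5
    return (a * d) / (b * c)


# Hypothetical toy records, not data from the study.
diagnoses = [
    (1, "covid", "I21.4"),
    (1, "covid", "I21.9"),  # same person, same category: counted once
    (2, "covid", "R50.9"),
    (3, "flu", "I21.0"),
    (4, "flu", "R50.9"),
    (5, "flu", "R51"),
]
group_sizes = {"covid": 4, "flu": 5}

rates = category_rates(diagnoses, group_sizes)
# Rank categories by the rate in the hypothetical "covid" group, highest first.
ranking = sorted(rates, key=lambda cat: rates[cat]["covid"], reverse=True)
```

Counting distinct people rather than raw diagnosis rows keeps repeated codes for one patient from inflating a category's rate, which is one plausible reading of "adverse event rate" here.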