Yuanda Zhu, A. Mahale, Kourtney Peters, Lejy Mathew, F. Giuste, B. Anderson, May D. Wang
Title: Using natural language processing on free-text clinical notes to identify patients with long-term COVID effects
Published in: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics
Publication date: 2022-08-07
DOI: 10.1145/3535508.3545555 (https://doi.org/10.1145/3535508.3545555)
Citations: 6
Abstract
As of May 15th, 2022, the novel coronavirus SARS-CoV-2 had infected 517 million people and caused more than 6.2 million deaths worldwide. About 40% to 87% of patients suffer from persistent symptoms weeks or months after their original infection. Despite remarkable progress in preventing and treating acute COVID-19, the clinical diagnosis of long-term COVID remains difficult. In this work, we use free-text clinical notes and natural language processing (NLP) techniques to explore long-term COVID effects. We first obtain free-text clinical notes from 719 outpatient encounters representing patients treated by physicians at Emory Clinic, and use them to detect patterns in patients with long-term COVID symptoms. We then apply state-of-the-art NLP frameworks to automatically identify patients with long-term COVID effects, achieving a recall (sensitivity) of 0.881 for note-level prediction. We further interpret the prediction outcomes and discuss potential phenotypes. Our work aims to provide a data-driven way to identify patients who have developed persistent symptoms after acute COVID infection, which may help clinicians optimize their treatment.
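
For readers unfamiliar with the metric reported above, recall (sensitivity) at the note level is the fraction of notes that truly document long-term COVID effects which the classifier also flags, that is, TP / (TP + FN). The following is a minimal sketch of that calculation only. The function name `note_level_recall` and the eight example labels are hypothetical illustrations, not the authors' code or data.

```python
def note_level_recall(y_true, y_pred):
    """Recall (sensitivity) = TP / (TP + FN) over binary note-level labels.

    y_true and y_pred are equal-length sequences of 0/1 values, where 1 means
    the note documents long-term COVID effects (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # True positives: positive notes that the model also flags.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False negatives: positive notes that the model misses.
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no positive notes in y_true; recall is undefined")
    return tp / (tp + fn)


# Made-up labels for eight notes, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
example_recall = note_level_recall(y_true, y_pred)
print(f"note-level recall: {example_recall:.3f}")
```

In this toy example, four of the five positive notes are flagged, so the recall is 0.8. The false positive at the sixth note does not affect recall, which is why a recall figure is usually read alongside precision or specificity.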