Natural Language Processing Accurately Differentiates Cancer Symptom Information in Electronic Health Record Narratives

Alaa Albashayreh, Anindita Bandyopadhyay, Nahid Zeinali, Min Zhang, Weiguo Fan, Stephanie Gilbertson White

JCO Clinical Cancer Informatics, August 2024. DOI: 10.1200/CCI.23.00235
Abstract
Purpose: Identifying cancer symptoms in electronic health record (EHR) narratives is feasible with natural language processing (NLP). However, more efficient NLP systems are needed to detect various symptoms and to distinguish observed symptoms from negated symptoms and medication-related side effects. We evaluated the accuracy of NLP in (1) detecting 14 symptom groups (ie, pain, fatigue, swelling, depressed mood, anxiety, nausea/vomiting, pruritus, headache, shortness of breath, constipation, numbness/tingling, decreased appetite, impaired memory, disturbed sleep) and (2) distinguishing observed symptoms from negated symptoms and medication-related side effects in EHR narratives of patients with cancer.
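The abstract does not describe how observed, negated, and side-effect mentions are separated; purely as an illustration of that three-way distinction (not the authors' method), a minimal rule-based sketch with hypothetical cue lists might look like this:

```python
# Minimal rule-based sketch of the observed / negated / side-effect distinction.
# NOT the authors' method; cue lists and the context window are illustrative assumptions.
NEGATION_CUES = {"no", "denies", "denied", "without", "negative for"}
SIDE_EFFECT_CUES = {"side effect", "adverse effect", "secondary to", "from chemotherapy"}

def classify_mention(sentence: str, symptom: str) -> str:
    """Label a symptom mention as 'negated', 'side_effect', 'observed', or 'absent'."""
    text = sentence.lower()
    if symptom.lower() not in text:
        return "absent"
    # Look for negation cues in a short window preceding the symptom term.
    window = text[: text.index(symptom.lower())][-40:]
    if any(cue in window for cue in NEGATION_CUES):
        return "negated"
    if any(cue in text for cue in SIDE_EFFECT_CUES):
        return "side_effect"
    return "observed"

print(classify_mention("Patient denies nausea or vomiting today.", "nausea"))          # negated
print(classify_mention("Reports nausea as a side effect of chemotherapy.", "nausea"))  # side_effect
print(classify_mention("Ongoing nausea despite antiemetics.", "nausea"))               # observed
```

In practice such rules would need much richer cue lists and context handling; the study's trained model presumably makes these distinctions without hand-written rules.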
Methods: We extracted 902,508 notes for 11,784 unique patients diagnosed with cancer and developed a gold standard corpus of 1,112 notes labeled for the presence or absence of the 14 symptom groups. We trained an embeddings-augmented NLP system that integrates human and machine intelligence, along with conventional machine learning algorithms for comparison. NLP metrics were calculated on a subset of the gold standard corpus held out for testing.
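The abstract names neither the embedding model nor the specific conventional algorithms, so the following is only a sketch of one plausible conventional baseline for multi-label symptom detection (TF-IDF features with one-vs-rest logistic regression over binary symptom labels); the toy notes, labels, and hyperparameters are placeholders, not the study's data or configuration.

```python
# Sketch of a conventional machine-learning baseline for multi-label symptom detection.
# Everything below (toy notes, labels, hyperparameters) is an illustrative assumption.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

SYMPTOMS = ["pain", "fatigue", "swelling", "nausea_vomiting"]  # subset of the 14 groups

# Toy training notes with multi-hot labels (one column per symptom group).
notes = [
    "Reports severe pain in the lower back and persistent fatigue.",
    "Denies pain; mild swelling noted in the left ankle.",
    "Nausea and vomiting after chemotherapy, otherwise stable.",
    "No complaints today; sleeping well and eating normally.",
]
labels = np.array([
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
])

# One independent binary classifier per symptom group over TF-IDF features.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
model.fit(notes, labels)

pred = model.predict(["Patient describes ongoing fatigue and new onset pain."])
print(dict(zip(SYMPTOMS, pred[0])))
```

An embeddings-augmented system would replace or supplement the sparse TF-IDF features with dense text representations; the abstract does not specify which embeddings or classifier the authors used.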
Results: The interannotator agreement for labeling the gold standard corpus was excellent at 92%. The embeddings-augmented NLP model achieved the best performance (F1 score = 0.877). Accuracy was highest for pruritus (F1 score = 0.937) and lowest for swelling (F1 score = 0.787). After classifying the entire data set with the embeddings-augmented NLP model, we found that 41% of the notes included symptom documentation. Pain was the most frequently documented symptom (29% of all notes), while impaired memory was the least documented (0.7% of all notes).
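The abstract does not state whether the 92% agreement is chance-corrected or how the F1 scores were aggregated; as a small illustration of how such metrics can be computed, a sketch on toy annotations (not the study's data) follows:

```python
# Illustration only: simple percent interannotator agreement and per-label F1 on toy data.
import numpy as np
from sklearn.metrics import f1_score

# Two annotators' binary labels for one symptom group across 10 notes.
annotator_a = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 0])
annotator_b = np.array([1, 0, 1, 0, 0, 0, 1, 0, 1, 1])
percent_agreement = (annotator_a == annotator_b).mean()
print(f"Interannotator agreement: {percent_agreement:.0%}")  # 80% on this toy example

# Per-label F1 of model predictions against gold labels for two symptom groups.
gold = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [1, 0]])
pred = np.array([[1, 0], [0, 0], [1, 1], [0, 0], [0, 0]])
for i, name in enumerate(["pain", "swelling"]):
    print(name, f1_score(gold[:, i], pred[:, i]))
```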
Conclusion: We illustrated the feasibility of detecting 14 symptom groups in EHR narratives and showed that an embeddings-augmented NLP system outperforms conventional machine learning algorithms in detecting symptom information and differentiating observed symptoms from negated symptoms and medication-related side effects.