Comparison of Diagnosis Codes to Clinical Notes in Classifying Patients with Diabetic Retinopathy

IF 3.2 Q1 OPHTHALMOLOGY

Ophthalmology Science. Pub Date: 2024-06-14. DOI: 10.1016/j.xops.2024.100564

Volume 4, Issue 6, Article 100564

Citations: 0

Abstract

Comparison of Diagnosis Codes to Clinical Notes in Classifying Patients with Diabetic Retinopathy

Purpose

Electronic health records (EHRs) contain a vast amount of clinical data. Improved automated classification approaches have the potential to accurately and efficiently identify patient cohorts for research. We evaluated whether a rule-based natural language processing (NLP) algorithm using clinical notes performed better at classifying proliferative diabetic retinopathy (PDR) and nonproliferative diabetic retinopathy (NPDR) severity than International Classification of Diseases, ninth edition (ICD-9) or 10th edition (ICD-10) codes.

Design

Cross-sectional study.

Subjects

Deidentified EHR data from an academic medical center identified 2366 patients aged ≥18 years, with diabetes mellitus, diabetic retinopathy (DR), and available clinical notes.

Methods

From these 2366 patients, 306 randomly selected patients (100 in the training set, 206 in the test set) underwent chart review by ophthalmologists to establish the gold standard. International Classification of Diseases codes were extracted from the EHR. The notes algorithm identified positive mentions of PDR and NPDR severity in clinical notes. Proliferative diabetic retinopathy and NPDR severity classifications by ICD codes and by the notes algorithm were compared with the gold standard. The entire DR cohort (N = 2366) was then classified as having presence (or absence) of PDR using ICD codes and the notes algorithm.
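The abstract does not list the rules the notes algorithm used. As a rough illustration only, the sketch below shows one way a rule-based classifier might detect a positive (non-negated) mention of PDR or an NPDR severity grade in a note. The regular expressions, the negation cues, and the function names `positive_mentions`, `mentions_pdr`, and `npdr_severity` are all assumptions made for this example, not the authors' method.

```python
import re

# Hypothetical patterns: the study's real rules are not given in the abstract.
# The lookbehinds keep "non-proliferative" from counting as a PDR mention.
PDR_PATTERN = re.compile(
    r"(?<!non-)(?<!non )\b(?:PDR|proliferative diabetic retinopathy)\b",
    re.IGNORECASE,
)
NPDR_PATTERN = re.compile(
    r"\b(mild|moderate|severe)\s+(?:NPDR|non-?\s?proliferative diabetic retinopathy)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues, checked only in the text before a match.
NEGATION_PATTERN = re.compile(
    r"\b(?:no|not|without|denies|negative for)\b", re.IGNORECASE
)


def positive_mentions(pattern, note):
    """Yield matches of `pattern` that have no negation cue earlier in the same sentence."""
    for sentence in re.split(r"[.;\n]", note):
        for match in pattern.finditer(sentence):
            if not NEGATION_PATTERN.search(sentence[: match.start()]):
                yield match


def mentions_pdr(note):
    """Return True if the note contains at least one non-negated PDR mention."""
    return next(positive_mentions(PDR_PATTERN, note), None) is not None


def npdr_severity(note):
    """Return the first non-negated NPDR severity grade in the note, or None."""
    match = next(positive_mentions(NPDR_PATTERN, note), None)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    note = "No evidence of PDR. Moderate NPDR both eyes."
    print(mentions_pdr(note), npdr_severity(note))
```

A sentence-level negation check of this kind is deliberately crude; a production system would typically need broader cue lists and handling of historical or family-history mentions.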

Main Outcome Measures

Sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F1 score for the notes algorithm compared with ICD codes using a gold standard of chart review.
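For reference, the five outcome measures can all be derived from a 2×2 confusion matrix against the gold standard. The sketch below uses the standard definitions; the labels in the usage example are made up for illustration and are not study data, and the names `Confusion`, `tally`, and `metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # classifier positive, gold standard positive
    fp: int  # classifier positive, gold standard negative
    fn: int  # classifier negative, gold standard positive
    tn: int  # classifier negative, gold standard negative


def tally(gold, predicted):
    """Count confusion-matrix cells from paired boolean labels."""
    pairs = list(zip(gold, predicted))
    return Confusion(
        tp=sum(1 for g, p in pairs if g and p),
        fp=sum(1 for g, p in pairs if not g and p),
        fn=sum(1 for g, p in pairs if g and not p),
        tn=sum(1 for g, p in pairs if not g and not p),
    )


def metrics(c):
    """Standard definitions; assumes every denominator is non-zero."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "f1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),
    }


if __name__ == "__main__":
    # Made-up labels for illustration only (not data from the study).
    gold = [True] * 10 + [False] * 10
    predicted = [True] * 8 + [False] * 2 + [True] * 2 + [False] * 8
    print(metrics(tally(gold, predicted)))
```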

Results

For PDR classification of the test set patients, the notes algorithm performed better than ICD codes on all metrics. Specifically, the notes algorithm had significantly higher sensitivity (90.5% [95% confidence interval 85.7, 94.9] vs. 68.4% [60.4, 75.3]) but similar PPV (98.0% [95.4, 100] vs. 94.7% [90.3, 98.3]). The F1 score was 0.941 [0.910, 0.966] for the notes algorithm compared with 0.794 [0.734, 0.842] for ICD codes. For PDR classification, ICD-10 codes performed better than ICD-9 codes (F1 score 0.836 [0.771, 0.878] vs. 0.596 [0.222, 0.692]). For NPDR severity classification, the notes algorithm performed similarly to ICD codes, but performance was limited by small sample size.
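As a consistency check on the reported figures, F1 is the harmonic mean of PPV (precision) and sensitivity (recall). The short sketch below recomputes it from the rounded percentages quoted above; the function name `f1_from` is hypothetical, and small differences from the published values could arise because the inputs are already rounded.

```python
def f1_from(ppv, sensitivity):
    """Harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Point estimates quoted in the Results for PDR classification.
notes_f1 = f1_from(ppv=0.980, sensitivity=0.905)
icd_f1 = f1_from(ppv=0.947, sensitivity=0.684)

print(round(notes_f1, 3), round(icd_f1, 3))  # prints: 0.941 0.794
```

Both recomputed values agree with the reported F1 scores to three decimal places.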

Conclusions

The notes algorithm outperformed ICD codes for PDR classification. The findings demonstrate the significant potential of applying a rule-based NLP algorithm to clinical notes to increase the efficiency and accuracy of cohort selection for research.