Thyroid Ultrasound Appropriateness Identification Through Natural Language Processing of Electronic Health Records

Mayo Clinic Proceedings. Digital health. Pub Date: 2024-02-01. DOI: 10.1016/j.mcpdig.2024.01.001

Cristian Soto Jacome MD; Danny Segura Torres MD; Jungwei W. Fan PhD; Ricardo Loor-Torres MD; Mayra Duran MD; Misk Al Zahidy MS; Esteban Cabezas MD; Mariana Borras-Osorio MD; David Toro-Tobon MD; Yuqi Wu PhD; Yonghui Wu PhD; Naykky Singh Ospina MD, MS; Juan P. Brito MD, MS

Journal Article. Mayo Clinic Proceedings. Digital health, Volume 2, Issue 1, Pages 67-74. Full text: https://www.sciencedirect.com/science/article/pii/S2949761224000014


Abstract

Objective

To address thyroid cancer overdiagnosis, we aim to develop a natural language processing (NLP) algorithm to determine the appropriateness of thyroid ultrasounds (TUS).

Patients and Methods

Between 2017 and 2021, we identified 18,000 TUS patients at Mayo Clinic and selected 628 for chart review to create a ground truth dataset based on consensus. We developed a rule-based NLP pipeline to identify TUS as appropriate TUS (aTUS) or inappropriate TUS (iTUS) using patients’ clinical notes and additional meta information. In addition, we designed an abbreviated NLP pipeline (aNLP) solely focusing on labels from TUS order requisitions to facilitate deployment at other health care systems. Our dataset was split into a training set of 468 (75%) and a test set of 160 (25%), using the former for rule development and the latter for performance evaluation.
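The abstract does not list the rules themselves, so the following is only a minimal sketch of how a rule-based classifier over order-requisition text might be structured. The pattern list, the function name `classify_requisition`, and the default-to-iTUS fallback are hypothetical placeholders for illustration; they are not the study's rules and carry no clinical meaning.

```python
import re

# Hypothetical placeholder patterns. The study's actual rules are not
# given in the abstract; these exist only to show the mechanism.
APPROPRIATE_PATTERNS = [
    re.compile(r"\bthyroid nodule", re.IGNORECASE),
    re.compile(r"\bgoit(er|re)\b", re.IGNORECASE),
]


def classify_requisition(text: str) -> str:
    """Label requisition text 'aTUS' if any pattern matches, else 'iTUS'.

    Falling back to 'iTUS' when nothing matches is an assumption of this
    sketch, not a documented behavior of the published pipeline.
    """
    for pattern in APPROPRIATE_PATTERNS:
        if pattern.search(text):
            return "aTUS"
    return "iTUS"


if __name__ == "__main__":
    print(classify_requisition("Follow-up of known thyroid nodule"))
    print(classify_requisition("Reason for exam not stated"))
```

Keeping the rules in a flat list of compiled patterns makes them easy to revise against a training set and to re-evaluate unchanged on a held-out test set, which mirrors the development and evaluation split described above.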

Results

In the training set, 449 (95.9%) patients were identified as aTUS and 19 (4.06%) as iTUS; in the test set, 155 (96.88%) patients were identified as aTUS and 5 (3.12%) as iTUS. In the training set, the pipeline achieved a sensitivity of 0.99, specificity of 0.95, and positive predictive value of 1.0 for detecting aTUS. In the test set, it achieved a sensitivity of 0.96, specificity of 0.80, and positive predictive value of 0.99. Similar performance metrics were observed in the aNLP pipeline.
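The reported metrics follow from a confusion matrix in which aTUS is the positive class. The abstract does not report the underlying counts, so the test-set counts below are an assumption: they are chosen to be consistent with the stated class sizes (155 aTUS, 5 iTUS) and the rounded metrics, and the names `ConfusionCounts` and `test_counts` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # aTUS correctly labeled aTUS
    fn: int  # aTUS labeled iTUS
    tn: int  # iTUS correctly labeled iTUS
    fp: int  # iTUS labeled aTUS


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true aTUS cases that were detected: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Share of true iTUS cases that were detected: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Share of aTUS predictions that were correct: TP / (TP + FP)."""
    return c.tp / (c.tp + c.fp)


# Assumed counts, not reported in the abstract: 149 + 6 = 155 aTUS and
# 4 + 1 = 5 iTUS, matching the test-set class sizes.
test_counts = ConfusionCounts(tp=149, fn=6, tn=4, fp=1)

if __name__ == "__main__":
    print(round(sensitivity(test_counts), 2))
    print(round(specificity(test_counts), 2))
    print(round(positive_predictive_value(test_counts), 2))
```

With only 5 iTUS cases in the test set, a single misclassified iTUS moves specificity by 0.20, which is worth keeping in mind when comparing the 0.95 training and 0.80 test values.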

Conclusion

The NLP models can accurately identify the appropriateness of a thyroid ultrasound from clinical documentation and order requisition information, a critical initial step toward evaluating the drivers and outcomes of TUS use and subsequent thyroid cancer overdiagnosis.


Source journal

Mayo Clinic Proceedings. Digital health: Medicine and Dentistry (General), Health Informatics, Public Health and Health Policy

Self-citation rate: 0.00%

Review time: 47 days