Thyroid Ultrasound Appropriateness Identification Through Natural Language Processing of Electronic Health Records

Cristian Soto Jacome MD, Danny Segura Torres MD, Jungwei W. Fan PhD, Ricardo Loor-Torres MD, Mayra Duran MD, Misk Al Zahidy MS, Esteban Cabezas MD, Mariana Borras-Osorio MD, David Toro-Tobon MD, Yuqi Wu PhD, Yonghui Wu PhD, Naykky Singh Ospina MD, MS, Juan P. Brito MD, MS
Mayo Clinic Proceedings. Digital health, volume 2, issue 1, pages 67-74. Published 2024-02-01. Journal article.
DOI: 10.1016/j.mcpdig.2024.01.001
https://www.sciencedirect.com/science/article/pii/S2949761224000014

Abstract

Objective

To address thyroid cancer overdiagnosis, we aim to develop a natural language processing (NLP) algorithm to determine the appropriateness of thyroid ultrasounds (TUS).

Patients and Methods

Between 2017 and 2021, we identified 18,000 TUS patients at Mayo Clinic and selected 628 for chart review to create a consensus-based ground truth dataset. We developed a rule-based NLP pipeline to classify each TUS as appropriate (aTUS) or inappropriate (iTUS) using patients’ clinical notes and additional metadata. In addition, we designed an abbreviated NLP pipeline (aNLP) that relies solely on labels from TUS order requisitions, to facilitate deployment at other health care systems. The dataset was split into a training set of 468 (75%), used for rule development, and a test set of 160 (25%), used for performance evaluation.
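
The abstract does not list the rules themselves, so the following is only a minimal sketch of what a rule-based labeler over order requisition text could look like. The patterns, the function name `classify_tus`, and the choice to fall back to iTUS when nothing matches are all hypothetical placeholders, not the rules or logic used in the study.

```python
import re
from typing import List, Pattern

# Hypothetical placeholder patterns for illustration only.
# These are NOT the study's rules; the real rules were developed
# from the 468-patient training set and are not given in the abstract.
PLACEHOLDER_PATTERNS: List[Pattern[str]] = [
    re.compile(r"\bthyroid nodules?\b", re.IGNORECASE),
    re.compile(r"\bgoit(?:er|re)\b", re.IGNORECASE),
]


def classify_tus(text: str, patterns: List[Pattern[str]] = PLACEHOLDER_PATTERNS) -> str:
    """Label a requisition or note as 'aTUS' if any pattern matches.

    Falling back to 'iTUS' when no pattern matches is an assumption
    made for this sketch.
    """
    if any(p.search(text) for p in patterns):
        return "aTUS"
    return "iTUS"


print(classify_tus("Follow-up of known thyroid nodule"))
print(classify_tus("Routine screening, no symptoms"))
```

A sketch like this ignores negation ("no thyroid nodule") and context, which a production pipeline over clinical notes would have to handle.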

Results

In the training set, 449 (95.9%) patients were identified as aTUS and 19 (4.06%) as iTUS; in the test set, 155 (96.88%) were identified as aTUS and 5 (3.12%) as iTUS. In the training set, the pipeline achieved a sensitivity of 0.99, specificity of 0.95, and positive predictive value of 1.0 for detecting aTUS. In the test set, it achieved a sensitivity of 0.96, specificity of 0.80, and positive predictive value of 0.99. Similar performance metrics were observed in the aNLP pipeline.
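
As a rough illustration of how the three reported metrics relate to a confusion matrix, the sketch below treats aTUS as the positive class. The abstract reports only class totals and rounded metrics, so the cell counts used here (149 true positives, 6 false negatives, 4 true negatives, 1 false positive) are inferred assumptions chosen to be consistent with the test set totals of 155 aTUS and 5 iTUS. They are not figures reported by the authors.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute sensitivity, specificity, and positive predictive value."""
    return {
        "sensitivity": tp / (tp + fn),  # share of true aTUS that were detected
        "specificity": tn / (tn + fp),  # share of true iTUS that were detected
        "ppv": tp / (tp + fp),          # share of predicted aTUS that were correct
    }


# Assumed test-set counts (inferred, not reported): 155 aTUS, 5 iTUS.
metrics = diagnostic_metrics(tp=149, fp=1, tn=4, fn=6)
rounded = {name: round(value, 2) for name, value in metrics.items()}
print(rounded)
```

With only 5 iTUS cases in the test set, a single misclassified case moves specificity by 0.20, so that estimate is much less stable than the sensitivity estimate.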

Conclusion

The NLP models can accurately identify the appropriateness of a thyroid ultrasound from clinical documentation and order requisition information, a critical initial step toward evaluating the drivers and outcomes of TUS use and subsequent thyroid cancer overdiagnosis.

Source journal: Mayo Clinic Proceedings. Digital health
Subject areas: Medicine and Dentistry (General), Health Informatics, Public Health and Health Policy