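Automate incidental findings in radiology reports using natural language processing and machine learning to identify and classify lung nodules.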

C. French, D. Kurbegov, D. Spigel, M. Makowski, Samantha R. Terker, P. Clark
Journal of Global Oncology. Published 2019-10-07. DOI: 10.1200/jgo.2019.5.suppl.49 (https://doi.org/10.1200/jgo.2019.5.suppl.49)
Abstract

Automate incidental findings in radiology reports using natural language processing and machine learning to identify and classify lung nodules.
Background: Incidental pulmonary nodule findings challenge providers to balance resource efficiency with high clinical quality. Incidental findings tend to be underevaluated, with studies reporting appropriate follow-up rates as low as 29%. Efficiently identifying patients with high-risk nodules is foundational to ensuring appropriate follow-up, and it requires the clinical reading and classification of radiology reports. We tested the feasibility of automating this process with natural language processing (NLP) and machine learning (ML).

Methods: In cooperation with Sarah Cannon, the Cancer Institute of HCA Healthcare, we conducted a series of experiments on 8,879 free-text, narrative CT radiology reports. A representative sample of health system emergency department (ED), inpatient (IP), and outpatient (OP) reports dated from December 2015 to April 2017 was divided into a development set for model training and validation and a test set for evaluating model performance. A “Nodule Model” was trained to detect the reported presence of a pulmonary nodule, and a rules-based “Size Model” was developed to extract the size of the nodule in millimeters. Reports were bucketed into three prediction groups: ≥ 6 mm, < 6 mm, and no size indicated. A nodule was placed in a queue for follow-up if its predicted size was ≥ 6 mm, or if no size was indicated and the report contained the word “mass.” The Fleischner Society Guidelines and clinical review informed these definitions.

Results: Precision and recall were calculated for multiple model thresholds. A threshold was selected based on the validation set calculations, and a success criterion of 90% queue precision was chosen to minimize false positives. On the test dataset, the F1 measure of the entire pipeline was 72.9%, recall was 60.3%, and queue precision was 90.2%, exceeding the success criterion.
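The threshold selection described in the Results can be illustrated with a minimal Python sketch. The abstract does not say which rule was used to pick the threshold, so the rule here (among candidate thresholds that reach the precision target on the validation set, keep the one with the highest recall) is an assumption, and every function name and toy value below is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def precision_recall(scores: Sequence[float], labels: Sequence[bool],
                     threshold: float) -> Tuple[float, float]:
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def pick_threshold(scores: Sequence[float], labels: Sequence[bool],
                   candidates: Sequence[float],
                   min_precision: float = 0.90
                   ) -> Optional[Tuple[float, float, float]]:
    """Assumed rule: highest-recall candidate whose precision meets the target.

    Returns (threshold, precision, recall), or None if no candidate qualifies.
    """
    best: Optional[Tuple[float, float, float]] = None
    for t in candidates:
        p, r = precision_recall(scores, labels, t)
        if p >= min_precision and (best is None or r > best[2]):
            best = (t, p, r)
    return best


# Hypothetical validation scores and labels, for illustration only.
toy_scores: List[float] = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
toy_labels: List[bool] = [True, True, True, False, True, False, True, False]

if __name__ == "__main__":
    print(pick_threshold(toy_scores, toy_labels, [0.3, 0.5, 0.7, 0.8, 0.9]))
    print(round(f1_score(0.902, 0.603), 3))
```

Applied to the reported test-set figures, the harmonic mean of 90.2% precision and 60.3% recall comes to roughly 72.3%, slightly below the reported 72.9%. The abstract does not state how the pipeline F1 was computed, so the two figures may rest on different counts.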

Conclusions: The experiments demonstrate that it is feasible to automate the detection and classification of incidental pulmonary nodule findings in radiology reports. This approach promises to improve healthcare quality by increasing the rate of appropriate follow-up and treatment for incidental lung nodule findings without excessive labor or the risk of overutilization.
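
The queueing rule from the Methods (queue a nodule when its predicted size is ≥ 6 mm, or when no size is indicated and the report contains the word “mass”) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the regular expression is a hypothetical stand-in for the rules-based Size Model, it takes the largest measurement anywhere in the report rather than tying a measurement to a specific nodule, and the `nodule_detected` flag stands in for the output of the trained Nodule Model.

```python
import re
from typing import Optional

# Hypothetical pattern: a number followed by "mm" or "cm".
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)
MASS_PATTERN = re.compile(r"\bmass\b", re.IGNORECASE)


def extract_size_mm(report: str) -> Optional[float]:
    """Largest size mentioned in the report, in millimeters, or None."""
    sizes = []
    for value, unit in SIZE_PATTERN.findall(report):
        factor = 10.0 if unit.lower() == "cm" else 1.0
        sizes.append(float(value) * factor)
    return max(sizes) if sizes else None


def size_bucket(size_mm: Optional[float]) -> str:
    """Map an extracted size to one of the three prediction groups."""
    if size_mm is None:
        return "no size indicated"
    return ">= 6 mm" if size_mm >= 6.0 else "< 6 mm"


def queue_for_follow_up(report: str, nodule_detected: bool) -> bool:
    """Apply the queueing rule described in the Methods.

    nodule_detected is an assumed input representing the Nodule Model's
    decision for this report.
    """
    if not nodule_detected:
        return False
    size_mm = extract_size_mm(report)
    if size_mm is None:
        # No size indicated: queue only if the report mentions "mass".
        return MASS_PATTERN.search(report) is not None
    return size_mm >= 6.0


if __name__ == "__main__":
    examples = [
        "There is a 1.2 cm nodule in the right upper lobe.",
        "Stable 4 mm nodule in the left lower lobe.",
        "Ill-defined mass in the right upper lobe.",
        "Tiny nodule noted, too small to characterize.",
    ]
    for text in examples:
        size = extract_size_mm(text)
        print(size_bucket(size), queue_for_follow_up(text, nodule_detected=True))
```

A production rule set would also need to handle negation, multiple nodules, and measurements of other structures, none of which this sketch attempts.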
About the journal: The Journal of Global Oncology (JGO) is an online-only, open-access journal focused on cancer care, research, and care delivery issues unique to countries and settings with limited healthcare resources. JGO aims to provide a home for high-quality literature that fulfills a growing need for content describing the array of challenges that health care professionals in resource-constrained settings face. Article types include original reports, review articles, commentaries, correspondence/replies, special articles, and editorials.