Natural Language Processing Algorithm Used for Staging Pulmonary Oncology from Free-Text Radiological Reports: “Including PET-CT and Validation Towards Clinical Use”

IF 3.8, Zone 2 (Engineering and Technology), Q2 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Journal of Digital Imaging Pub Date : 2024-01-12 DOI:10.1007/s10278-023-00913-x

J. Martijn Nobel, Sander Puts, Jasenko Krdzalic, Karen M. L. Zegers, Marc B. I. Lobbes, Simon G. F. Robben, André L. A. J. Dekker


Citations: 0

Abstract

Natural language processing (NLP) can be used to process and structure free text, such as radiological reports. In radiology, complete and accurate reports are important for clinical staging, for instance in pulmonary oncology. Computed tomography (CT) and positron emission tomography (PET)-CT are of great importance in tumor staging, and NLP may add value to the radiological report in the staging process because it may be able to extract the T and N stage according to the 8th edition of the tumor–node–metastasis (TNM) classification system. The purpose of this study is to evaluate a new TN algorithm (TN-PET-CT), created by adding a layer of metabolic activity to an existing rule-based NLP algorithm (TN-CT). The new TN-PET-CT algorithm can stage chest CT examinations as well as PET-CT scans. The study design also allowed a subgroup analysis to externally validate the prior TN-CT algorithm. pyContextNLP, spaCy, and regular expressions were used for information extraction and matching. The overall TN accuracy of the TN-PET-CT algorithm was 0.73 in the training set (N = 63) and 0.62 in the validation set (N = 100). In external validation, the TN-CT classifier (N = 65) achieved an accuracy of 0.72. Overall, it is possible to adapt the TN-CT algorithm into a TN-PET-CT algorithm. However, outcomes depend strongly on the accuracy of the report, the vocabulary used, and the context used to express, for example, uncertainty. This holds both for the adapted PET-CT algorithm and for the CT algorithm when applied in another hospital.
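To make the idea of rule-based extraction concrete, the following is a minimal sketch in plain Python, using only the standard `re` module. It is not the authors' TN-CT or TN-PET-CT algorithm and does not use pyContextNLP or spaCy. The patterns `SIZE_RE` and `UNCERTAINTY_RE` and both function names are hypothetical, and the mapping covers only the size criterion of the 8th-edition T categories (T1 up to 3 cm, T2 up to 5 cm, T3 up to 7 cm, T4 above 7 cm), ignoring invasion, separate nodules, and every other staging criterion.

```python
import re

# Hypothetical pattern: a number followed by "mm" or "cm" (e.g. "34 mm", "2.1 cm").
SIZE_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)

# Hypothetical uncertainty cues; a real system would rely on a curated trigger list.
UNCERTAINTY_RE = re.compile(
    r"\b(possibl[ey]|probabl[ey]|suspicious|cannot be excluded)\b", re.IGNORECASE
)


def size_to_t_category(size_cm: float) -> str:
    """Map greatest tumor dimension in cm to a T category (size criterion only)."""
    if size_cm <= 3:
        return "T1"
    if size_cm <= 5:
        return "T2"
    if size_cm <= 7:
        return "T3"
    return "T4"


def extract_t_from_sentence(sentence: str):
    """Return (t_category, uncertain) for the largest size found, or None."""
    sizes_cm = []
    for value, unit in SIZE_RE.findall(sentence):
        number = float(value.replace(",", "."))  # accept decimal commas
        sizes_cm.append(number / 10 if unit.lower() == "mm" else number)
    if not sizes_cm:
        return None
    # Flag the finding as uncertain if any cue appears in the same sentence.
    uncertain = UNCERTAINTY_RE.search(sentence) is not None
    return size_to_t_category(max(sizes_cm)), uncertain
```

For example, `extract_t_from_sentence("Mass in the right upper lobe measuring 34 mm.")` returns `("T2", False)`, while `extract_t_from_sentence("Possibly a 2.1 cm nodule in the left lower lobe.")` returns `("T1", True)`. The second case illustrates the point made in the abstract: the reported outcome depends on the vocabulary a radiologist uses to express uncertainty.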


Source journal

Journal of Digital Imaging (Medicine: Nuclear Medicine)

CiteScore: 7.50

Self-citation rate: 6.80%

Articles published: 192

Review time: 6-12 weeks

About the journal: The Journal of Digital Imaging (JDI) is the official peer-reviewed journal of the Society for Imaging Informatics in Medicine (SIIM). JDI's goal is to enhance the exchange of knowledge within the general topic of imaging informatics in medicine, including research and practice in clinical, engineering, and information technologies and techniques in all medical imaging environments. JDI topics are of interest to researchers, developers, educators, physicians, and imaging informatics professionals. Suggested topics: PACS and component systems; imaging informatics for the enterprise; image-enabled electronic medical records; RIS and HIS; digital image acquisition; image processing; image data compression; 3D, visualization, and multimedia; speech recognition; computer-aided diagnosis; facilities design; imaging vocabularies and ontologies; Transforming the Radiological Interpretation Process (TRIP™); DICOM and other standards; workflow and process modeling and simulation; quality assurance; archive integrity and security; teleradiology; digital mammography; and radiological informatics education.