Application of natural language processing to post-structuring of rectal cancer MRI reports
Authors: W. Liu, L. Cai, Y. Li
Journal: Clinical Radiology (published 2023-11-17)
DOI: 10.1016/j.crad.2023.10.032
URL: https://www.sciencedirect.com/science/article/pii/S0009926023005172
Citations: 0
Abstract
AIM
To evaluate a natural language processing (NLP) system for extracting structured information from the free-form text of rectal cancer magnetic resonance imaging (MRI) reports written in Chinese.
MATERIALS AND METHODS
A rule-based NLP model that could extract 11 key image features of rectal cancer was constructed using 358 MRI reports of rectal cancer written between 2015 and 2021. Fifty reports written before 2015 and 50 written after 2021 were used as test datasets, and the reference standard was determined by manual extraction of information by two radiologists. The length and reporting rate of image features in pre-2015 and post-2021 datasets, as well as the accuracy, precision, recall, and F1 score of feature extraction by the NLP system, were compared. The time required for the NLP to extract data was compared with that required by the radiologists.
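As a rough illustration of what rule-based extraction from Chinese report text can look like, the sketch below applies one regular expression per feature. The abstract does not list the 11 features or the rules used, so the three feature names, the patterns, and the example sentence are all hypothetical and should not be read as the authors' implementation.

```python
import re

# Hypothetical rules for illustration only: the paper's actual features and
# patterns are not given in the abstract.
RULES = {
    # T stage written as e.g. "T3" or "T4a"
    "t_stage": re.compile(r"T([1-4][a-d]?)"),
    # Distance from the anal verge, e.g. "距肛缘约5.2cm"
    "distance_to_anal_verge_cm": re.compile(
        r"距肛缘约?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE
    ),
    # Extramural vascular invasion reported as positive (阳性) or negative (阴性)
    "emvi": re.compile(r"EMVI\s*(阳性|阴性)"),
}


def extract_features(report: str) -> dict:
    """Return the first match for each rule, or None if the feature is not reported."""
    result = {}
    for name, pattern in RULES.items():
        match = pattern.search(report)
        result[name] = match.group(1) if match else None
    return result


# Made-up example sentence, not taken from the study data.
example = "直肠中段肿瘤，下缘距肛缘约5.2cm，考虑T3期，EMVI阳性。"
features = extract_features(example)
print(features)
```

A feature that comes back as `None` is treated here as "not reported", which is the kind of signal a reporting-rate comparison between older and newer reports would rely on.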
RESULTS
Reports written after 2021 had longer diagnostic impression sections than reports written before 2015. The reporting rate of key imaging features of rectal cancer was 36.55% before 2015 and 79.82% after 2021. The accuracy, precision, recall, and F1 score of NLP for correct extraction of values from reports were 93.82%, 95.63%, 87.06%, and 91.15%, respectively, for pre-2015 reports, and 92.55%, 98.53%, 94.15%, and 96.29%, respectively, for post-2021 reports. NLP generated all the structured information in <1 second.
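For readers who want to relate the four reported metrics to one another, the following sketch uses the standard definitions in terms of true/false positives and negatives. The abstract does not say how matches were counted per feature, so the counts in the usage line are placeholders; only the precision and recall percentages come from the text above.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard accuracy, precision, recall, and F1 from confusion counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder counts, not data from the study.
demo = classification_metrics(tp=8, fp=2, fn=2, tn=8)

# Consistency check using the precision and recall reported in the abstract (in %).
f1_pre_2015 = f1_score(95.63, 87.06)
f1_post_2021 = f1_score(98.53, 94.15)
print(round(f1_pre_2015, 2), round(f1_post_2021, 2))
```

Computed this way, the harmonic means of the reported precision and recall agree with the reported F1 scores (91.15% and 96.29%) to within rounding.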
CONCLUSIONS
The NLP system with rule-based pattern matching achieved rapid and accurate structured processing of rectal cancer MRI reports. MRI reports with structured templates are more suitable for NLP-based extraction of information.
About the journal:
Clinical Radiology is published by Elsevier on behalf of The Royal College of Radiologists. It is an international journal that publishes original research, editorials, and review articles on all aspects of diagnostic imaging, including:
• Computed tomography
• Magnetic resonance imaging
• Ultrasonography
• Digital radiology
• Interventional radiology
• Radiography
• Nuclear medicine
Papers on radiological protection, quality assurance, audit in radiology and matters relating to radiological training and education are also included. In addition, each issue contains correspondence, book reviews and notices of forthcoming events.