Large Language Model Approach for Zero-Shot Information Extraction and Clustering of Japanese Radiology Reports: Algorithm Development and Validation.

IF 3.3 Q2 ONCOLOGY

JMIR Cancer Pub Date : 2025-01-23 DOI:10.2196/57275

Yosuke Yamagishi, Yuta Nakamura, Shouhei Hanaoka, Osamu Abe

JMIR Cancer, volume 11, e57275. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11867198/pdf/


Abstract

Background: The application of natural language processing in medicine has increased significantly, including tasks such as information extraction and classification. Natural language processing plays a crucial role in structuring free-form radiology reports, facilitating the interpretation of textual content, and enhancing data utility through clustering techniques. Clustering allows for the identification of similar lesions and disease patterns across a broad dataset, making it useful for aggregating information and discovering new insights in medical imaging. However, most publicly available medical datasets are in English, with limited resources in other languages. This scarcity poses a challenge for the development of models geared toward non-English downstream tasks.

Objective: This study aimed to develop and evaluate an algorithm that uses large language models (LLMs) to extract information from Japanese lung cancer radiology reports and perform clustering analysis. The effectiveness of this approach was assessed and compared with previous supervised methods.

Methods: This study employed the MedTxt-RR dataset, comprising 135 Japanese radiology reports from 9 radiologists who interpreted the computed tomography images of 15 lung cancer patients obtained from Radiopaedia. Because this dataset was previously used in the NTCIR-16 (NII Testbeds and Community for Information Access Research) shared task, a competition on clustering performance, it was ideal for comparing the clustering ability of our algorithm with that of previous methods. The dataset was split into 8 cases for development and 7 cases for testing. The approach used an LLM to extract information pertinent to lung cancer findings and transformed it into numeric features, which were then clustered with the K-means method. Performance was evaluated using 135 reports for information extraction accuracy and 63 test reports for clustering performance. Extraction accuracy was assessed for tumor size, location, and laterality. Clustering performance was evaluated using normalized mutual information, adjusted mutual information, and the Fowlkes-Mallows index for both the development and test data.
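
The extract-then-cluster pipeline described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the field names (`size_mm`, `lobe`, `side`), the one-hot encoding, the size scaling, and the toy values are all assumptions made for the example, and the LLM extraction step is replaced by hand-written records. Only the use of K-means and the three evaluation metrics comes from the abstract; scikit-learn is an assumed tool choice.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    adjusted_mutual_info_score,
    fowlkes_mallows_score,
    normalized_mutual_info_score,
)

# Hypothetical stand-in for LLM output: one record per report.
# Two cases, three readers each; all values are invented for illustration.
reports = [
    {"case": 0, "size_mm": 32.0, "lobe": "upper", "side": "right"},
    {"case": 0, "size_mm": 30.0, "lobe": "upper", "side": "right"},
    {"case": 0, "size_mm": 35.0, "lobe": "upper", "side": "right"},
    {"case": 1, "size_mm": 15.0, "lobe": "lower", "side": "left"},
    {"case": 1, "size_mm": 14.0, "lobe": "lower", "side": "left"},
    {"case": 1, "size_mm": 16.0, "lobe": "lower", "side": "left"},
]

LOBES = ["upper", "middle", "lower"]  # assumed category set
SIDES = ["right", "left"]


def to_features(report):
    """Encode one extracted record as a numeric vector (assumed encoding)."""
    size = [report["size_mm"] / 100.0]  # crude scaling so size does not dominate
    lobe = [1.0 if report["lobe"] == name else 0.0 for name in LOBES]
    side = [1.0 if report["side"] == name else 0.0 for name in SIDES]
    return size + lobe + side


X = np.array([to_features(r) for r in reports])
y_true = [r["case"] for r in reports]  # ground truth: which case a report describes

# One cluster per case; n_init and random_state are fixed for reproducibility.
kmeans = KMeans(n_clusters=len(set(y_true)), n_init=10, random_state=0)
y_pred = kmeans.fit_predict(X)

# The three metrics named in the abstract, comparing clusters with true cases.
scores = {
    "NMI": normalized_mutual_info_score(y_true, y_pred),
    "AMI": adjusted_mutual_info_score(y_true, y_pred),
    "FMI": fowlkes_mallows_score(y_true, y_pred),
}

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

All three metrics compare the predicted partition with the true case labels and are invariant to how cluster IDs are numbered, which is why they suit a task where each cluster should collect the reports written about one patient.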

Results: Tumor size was accurately identified in 99 of 135 reports (73.3%), with errors in 36 reports (26.7%), primarily due to missing or incorrect size information. Tumor location and laterality were identified more accurately, in 112 of 135 reports (83%); the remaining 23 reports (17%) contained errors, mainly due to empty values or incorrect data. Clustering on the test data yielded a normalized mutual information of 0.6414, an adjusted mutual information of 0.5598, and a Fowlkes-Mallows index of 0.5354. The proposed method demonstrated superior performance across all evaluation metrics compared to previous methods.
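
The report counts and percentages above can be checked with a few lines of arithmetic. The sketch assumes each of the 9 radiologists read every one of the 15 cases, which is consistent with the stated totals of 135 reports overall and 63 test reports; the helper `pct` is a hypothetical name.

```python
RADIOLOGISTS = 9
TOTAL_CASES = 15
TEST_CASES = 7

# Assumes every radiologist reported on every case.
TOTAL_REPORTS = RADIOLOGISTS * TOTAL_CASES  # reports used for extraction accuracy
TEST_REPORTS = RADIOLOGISTS * TEST_CASES    # reports used for test clustering


def pct(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


size_correct = pct(99, TOTAL_REPORTS)
size_errors = pct(36, TOTAL_REPORTS)
location_correct = pct(112, TOTAL_REPORTS)
location_errors = pct(23, TOTAL_REPORTS)

print(TOTAL_REPORTS, TEST_REPORTS)
print(size_correct, size_errors, location_correct, location_errors)
```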

Conclusions: The unsupervised LLM approach surpassed the existing supervised methods in clustering Japanese radiology reports. These findings suggest that LLMs hold promise for extracting information from radiology reports and integrating it into disease-specific knowledge structures.
