Large language model-based multi-source integration pipeline for automated diagnostic classification and zero-shot prognoses for brain tumor

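Meta-Radiology, vol. 3, no. 2, Article 100150. Pub Date: 2025-04-29. DOI: 10.1016/j.metrad.2025.100150. URL: https://www.sciencedirect.com/science/article/pii/S2950162825000189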

Zhuoqi Ma, Lulu Bi, Paige Collins, Owen Leary, Maliha Imami, Zhusi Zhong, Shaolei Lu, Grayson Baird, Nikos Tapinos, Ugur Cetintemel, Harrison Bai, Jerrold Boxerman, Zhicheng Jiao



Abstract

Purpose

This study uses large language models (LLMs) to integrate information from multi-source medical reports, with the aim of improving the accuracy of automated diagnostic classification and prognosis for brain tumors.

Materials and methods

Brain MRI reports from a cohort of 426 brain tumor patients were manually labeled for tumor presence and stability. Pathology reports from the same cohort were incorporated as an additional information source. A pre-trained LLM was used to extract features from the multi-source reports, and a multilayer perceptron (MLP) was trained on these features for the classification tasks. Model performance was evaluated on the test set using Micro F1 scores and areas under the receiver operating characteristic curve (AUROCs). The model’s zero-shot prognostic capability was validated on an independent cohort of 33 glioblastoma patients.
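As an illustration of the evaluation metric named above, the following is a minimal sketch of a micro-averaged F1 score with a percentile-bootstrap confidence interval. It is not the authors' code: the function names, the label strings, and the choice of a percentile bootstrap are assumptions made for this example, since the abstract does not state how its intervals were computed.

```python
import random


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp += 1  # correct prediction: a true positive for that class
        else:
            fp += 1  # a false positive for the predicted class
            fn += 1  # and a false negative for the true class
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def bootstrap_ci(y_true, y_pred, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap interval for micro F1 (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Resample report indices with replacement and rescore.
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(micro_f1([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    scores.sort()
    lo = scores[int((alpha / 2) * n_resamples)]
    hi = scores[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return lo, hi


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    y_true = ["present", "absent", "present", "uncertain"]
    y_pred = ["present", "present", "present", "absent"]
    print("micro F1:", micro_f1(y_true, y_pred))
    print("95% CI:", bootstrap_ci(y_true, y_pred))
```

For single-label multiclass tasks, every misclassification counts once as a false positive and once as a false negative, so micro F1 equals plain accuracy.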

Results

The model reached a Micro F1 score of 0.849 (95% CI: 0.814, 0.880) for tumor presence classification and 0.929 (95% CI: 0.904, 0.954) for tumor stability classification. Compared with using radiology reports alone, the model improved Micro F1 by 10.4% for tumor presence and 5.6% for stability classification. Log-rank tests showed a significant difference between the high- and low-risk patient groups stratified by the model-predicted “Tumor Stability” label (p-value = 0.017), supporting the prognostic value of the model-generated labels.
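For readers unfamiliar with the log-rank test used for the risk-group comparison, here is a minimal two-group sketch in plain Python. It is an illustration under stated assumptions, not the authors' analysis: the function name `logrank_test` is hypothetical, and the survival times are made-up toy values rather than data from the study.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value).

    times_*  : follow-up time for each patient
    events_* : 1 if the event (e.g. death) was observed, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    obs_minus_exp = 0.0  # observed minus expected events in group A
    variance = 0.0       # hypergeometric variance, summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk in A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk in B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Toy survival times in months; all events observed. Illustrative only.
    high_risk = [2, 3, 4, 5, 6]
    low_risk = [10, 12, 14, 16, 18]
    chi2, p = logrank_test(high_risk, [1] * 5, low_risk, [1] * 5)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In the setting described above, the two groups would be formed by the model-predicted stability label, and a small p-value would indicate that their survival curves differ.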

Conclusion

This study developed a multi-source integration model based on LLMs for automated diagnostic classification and zero-shot prognosis of brain tumors. Integrating multi-source reports improved classification accuracy compared with single-source reports, and the predicted tumor stability labels showed prognostic value for survival. These findings support the potential of LLMs in brain tumor research for precision diagnostics and prognosis.
