Role of large language models for etiological classification of brain stroke based on MRI brain reports: a feasibility study

Impact factor: 2.0; Zone 4 (Medicine); JCR Q2: Radiology, Nuclear Medicine & Medical Imaging

Magnetic Resonance Imaging; Pub Date: 2025-10-06; DOI: 10.1016/j.mri.2025.110538

Jorge Escartín, Pilar López-Úbeda, Teodoro Martín-Noguerol, Antonio Luna

Magnetic Resonance Imaging, volume 124, Article 110538. Journal Article. https://www.sciencedirect.com/science/article/pii/S0730725X2500222X

Citations: 0

Abstract

Purpose

Ischemic stroke, a leading cause of global disability and mortality, demands precise etiological classification for effective management. The variability in the use of existing stroke classification systems, along with the challenges in manual etiological labeling from brain MRI radiological reports, calls for an innovative approach. This study aims to develop and evaluate a Natural Language Processing (NLP) algorithm using transformer-based models for the extraction and classification of ischemic stroke types from MRI reports, enhancing diagnostic efficiency and stroke management.

Methods

We built a dataset of 635 brain MRI reports, annotated for four distinct ischemic stroke types. All were clinically consistent with focal neurologic impairment due to stroke. We evaluated two pre-trained BERT models (Bert clinical and Beto) and two pre-trained RoBERTa models (Roberta clinical trials and Roberta biomedical), focusing on their ability to classify stroke subtypes accurately.

Results

The Roberta biomedical model was the most effective, reaching an accuracy of 76.7% with statistically significant results. It also achieved the highest precision, recall, and F1 scores across all stroke types, indicating robust performance in stroke subtype classification.
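The metrics named above (accuracy, and per-type precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the labels "A" to "D" are hypothetical placeholders for the four stroke types, which the abstract does not name, and the toy predictions are invented purely to show the arithmetic.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted type matches the annotation."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Precision, recall, and F1 for each class, treated one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # class p was predicted wrongly
            fn[t] += 1          # class t was missed
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical toy data: placeholder labels, not results from the study.
y_true = ["A", "A", "B", "B", "C", "C", "D", "D"]
y_pred = ["A", "B", "B", "B", "C", "D", "D", "D"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
for label, scores in per_class_metrics(y_true, y_pred).items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Reporting the per-type scores alongside overall accuracy matters when the stroke types are unevenly represented, since a model can reach a high accuracy while performing poorly on a rare type.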

Conclusion

The study highlights the potential of NLP algorithms to automate stroke classification from MRI reports, which could substantially aid diagnosis and streamline stroke management.


Source journal

Magnetic Resonance Imaging (Medicine: Nuclear Medicine)

CiteScore: 4.70

Self-citation rate: 4.00%

Articles published: 194

Review time: 83 days

About the journal: Magnetic Resonance Imaging (MRI) is the first international multidisciplinary journal encompassing physical, life, and clinical science investigations as they relate to the development and use of magnetic resonance imaging. MRI is dedicated to basic research, technological innovation, and applications, providing a single forum for communication among radiologists, physicists, chemists, biochemists, biologists, engineers, internists, pathologists, physiologists, computer scientists, and mathematicians.