Jorge Escartín, Pilar López-Úbeda, Teodoro Martín-Noguerol, Antonio Luna
Title: Role of large language models for etiological classification of brain stroke based on MRI brain reports: a feasibility study
Journal: Magnetic Resonance Imaging, volume 124, Article 110538
DOI: 10.1016/j.mri.2025.110538
Publication date: 2025-10-06
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S0730725X2500222X
Citations: 0
Abstract
Purpose
Ischemic stroke, a leading cause of global disability and mortality, demands precise etiological classification for effective management. The variability in the use of existing stroke classification systems, along with the challenges in manual etiological labeling from brain MRI radiological reports, calls for an innovative approach. This study aims to develop and evaluate a Natural Language Processing (NLP) algorithm using transformer-based models for the extraction and classification of ischemic stroke types from MRI reports, with the goal of improving diagnostic efficiency and stroke management.
Methods
We built a dataset comprising 635 brain MRI reports, annotated for four distinct ischemic stroke types. All were clinically consistent with focal neurologic impairment due to stroke. We evaluated two pre-trained BERT models (Bert clinical and Beto) and two pre-trained RoBERTa models (Roberta clinical trials and Roberta biomedical), focusing on their ability to accurately classify stroke subtypes.
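As an illustration only, a labeled report set like the one described above is typically divided into training and test portions before models are compared. The abstract does not describe how the data were split, so the sketch below is an assumption, not the authors' procedure: the function name `stratified_split`, the 20% test fraction, and the placeholder label names are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each label keeps roughly the same share.

    Hypothetical helper: the split strategy and the 0.2 fraction are
    assumptions, not details taken from the study.
    """
    # Group report indices by their annotated label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder labels standing in for the four annotated stroke types.
example_labels = ["type_a"] * 10 + ["type_b"] * 20 + ["type_c"] * 5 + ["type_d"] * 15
train, test = stratified_split(example_labels)
```

Splitting per label rather than over the whole set keeps rarer subtypes represented in the test portion, which matters when per-subtype scores are reported.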
Results
The Roberta biomedical model was the most effective, reaching an accuracy of 76.7% with statistically significant results. It also achieved the highest precision, recall, and F1 scores across all stroke types, indicating its robustness in stroke subtype classification.
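For readers unfamiliar with the metrics named above, the following minimal sketch shows how accuracy and per-class precision, recall, and F1 are conventionally computed from true and predicted labels. It is not the authors' evaluation code; the function name `per_class_metrics` and the toy labels are hypothetical, and the numbers it produces are unrelated to the study's results.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return (accuracy, report) where report maps label -> precision/recall/f1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        # Undefined ratios (no predictions or no true cases) are set to 0.0.
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, report


# Toy example with four placeholder classes (not study data).
toy_true = ["a", "a", "b", "b", "c", "d"]
toy_pred = ["a", "b", "b", "b", "c", "c"]
toy_accuracy, toy_report = per_class_metrics(toy_true, toy_pred)
```

Reporting the scores per class, as the abstract does, shows whether a single overall accuracy figure hides weak performance on a less common subtype.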
Conclusion
The study highlights the potential of NLP algorithms in automating stroke classification from MRI reports, which could significantly aid in diagnostic processes and streamline stroke management strategies.
About the journal:
Magnetic Resonance Imaging (MRI) is the first international multidisciplinary journal encompassing physical, life, and clinical science investigations as they relate to the development and use of magnetic resonance imaging. MRI is dedicated to basic research, technological innovation, and applications, providing a single forum for communication among radiologists, physicists, chemists, biochemists, biologists, engineers, internists, pathologists, physiologists, computer scientists, and mathematicians.