GPT4LFS (generative pre-trained transformer 4 omni for lumbar foramina stenosis): enhancing lumbar foraminal stenosis image classification through large multimodal models.

IF 4.9 | Zone 1, Medicine | Q1, Clinical Neurology
Elzat Elham-Yilizati Yilihamu, Fan-Shuo Zeng, Jun Shang, Jin-Tao Yang, Hai Zhong, Shi-Qing Feng
Spine Journal, published 2025-03-27. DOI: 10.1016/j.spinee.2025.03.011

Abstract

Background context: Lumbar foraminal stenosis (LFS) is a common spinal condition that requires accurate assessment. Current magnetic resonance imaging (MRI) reporting processes are often inefficient, and while deep learning has potential for improvement, challenges in generalization and interpretability limit its diagnostic effectiveness compared to physician expertise.

Purpose: The present study aimed to leverage a multimodal large language model to improve the accuracy and efficiency of LFS image classification, enabling rapid and precise automated diagnosis and reducing the dependence on manually annotated data.

Study design/setting: Retrospective study conducted from April 2017 to March 2023.

Patient sample: Sagittal T1-weighted MRI data for the lumbar spine were collected from 1,200 patients across three medical centers. A total of 810 patient cases were included in the final analysis, with data collected from seven different MRI devices.

Outcome measures: Automated classification of LFS using the multimodal large language model. Accuracy, sensitivity, specificity, and Cohen's Kappa coefficient were calculated.
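The four outcome measures can be computed from paired reference and predicted labels. The following is a minimal sketch, not the authors' evaluation code: it assumes a binary labeling (1 = stenosis, 0 = no stenosis) for sensitivity and specificity, which the abstract does not specify, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for a binary labeling (assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: product of the marginal label frequencies, summed over labels.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n**2
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only; these are not data from the study.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print(binary_metrics(reference, predicted))
print(round(cohen_kappa(reference, predicted), 3))
```

The Kappa function works for any number of classes, whereas sensitivity and specificity in a multi-grade setting would need a per-class (one-vs-rest) definition and an averaging rule. Kappa is undefined when chance agreement equals 1, for example when both label lists contain a single class.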

Methods: GPT4LFS, a multimodal fusion framework, was developed to integrate imaging data with natural language descriptions and thereby capture the complex features of LFS. A pre-trained ConvNeXt served as the image processing module, extracting high-dimensional imaging features. In parallel, descriptive medical texts generated by the multimodal large language model GPT-4o were encoded with RoBERTa, and the resulting text features were used to strengthen the model's contextual understanding. The Mamba architecture was used in the feature fusion stage to integrate the imaging and textual features for the classification task. Finally, the stability of the model's detection results was assessed with classification metrics: accuracy, sensitivity, specificity, and the Kappa coefficient.
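The data flow described in the Methods can be sketched as a two-branch pipeline. This is a structural illustration only, not the authors' implementation: every function below is a hypothetical toy stand-in for the named component (ConvNeXt, GPT-4o, RoBERTa, Mamba), the fusion step is a plain linear recurrence rather than Mamba, and the feature sizes, threshold, and class labels are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Vector = List[float]


@dataclass
class FusionClassifier:
    """Two-branch pipeline: image features plus features of a generated description."""
    encode_image: Callable[[Sequence[float]], Vector]  # role played by ConvNeXt
    describe_image: Callable[[Sequence[float]], str]   # role played by GPT-4o
    encode_text: Callable[[str], Vector]               # role played by RoBERTa
    fuse: Callable[[Vector, Vector], Vector]           # role played by the Mamba fusion stage
    classify: Callable[[Vector], int]                  # final classification head

    def predict(self, image: Sequence[float]) -> int:
        image_features = self.encode_image(image)
        description = self.describe_image(image)
        text_features = self.encode_text(description)
        fused = self.fuse(image_features, text_features)
        return self.classify(fused)


# Hypothetical toy stand-ins so the sketch runs without any model weights.
def toy_encode_image(pixels: Sequence[float]) -> Vector:
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]


def toy_describe_image(pixels: Sequence[float]) -> str:
    mean = sum(pixels) / len(pixels)
    return "low mean intensity" if mean < 0.5 else "high mean intensity"


def toy_encode_text(text: str) -> Vector:
    return [1.0 if "low" in text else 0.0, len(text.split()) / 10.0]


def toy_fuse(image_features: Vector, text_features: Vector, decay: float = 0.5) -> Vector:
    # Linear recurrence over the sequence [image, text]: h_t = decay * h_{t-1} + x_t.
    # This only gestures at sequence-style fusion; it is NOT the Mamba architecture.
    state = [0.0] * len(image_features)
    for step in (image_features, text_features):
        state = [decay * h + x for h, x in zip(state, step)]
    return state


def toy_classify(fused: Vector) -> int:
    return 1 if fused[0] > 0.8 else 0  # arbitrary threshold, for illustration only


model = FusionClassifier(toy_encode_image, toy_describe_image,
                         toy_encode_text, toy_fuse, toy_classify)
print(model.predict([0.2, 0.3, 0.1, 0.4]))
print(model.predict([0.8, 0.9, 0.7, 0.6]))
```

Keeping each component behind a callable makes the branches independently replaceable, so the toy stand-ins could in principle be swapped for real encoders without changing `predict`.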

Results: The training set comprised 6,299 images from 635 patients, the internal test set comprised 820 images from 82 patients, and the external test set comprised 930 images from 93 patients. On the internal test set, the GPT4LFS model achieved an overall accuracy of 93.7%, a sensitivity of 95.8%, and a specificity of 94.5% (Kappa = 0.89, 95% confidence interval (CI): 0.84-0.96, p<.001). On the external test set, the overall accuracy was 92.2%, with a sensitivity of 92.2% and a specificity of 97.4% (Kappa = 0.88, 95% CI: 0.84-0.89, p<.001). The model showed excellent consistency on both the internal and external test sets. After the article is published, we will make the full code publicly available on GitHub.

Conclusions: In LFS image classification, the GPT4LFS model demonstrated accuracy and a capacity for feature description commensurate with those of professional clinicians.

Source journal
Spine Journal (Medicine: Clinical Neurology)
CiteScore: 8.20
Self-citation rate: 6.70%
Articles published: 680
Review time: 13.1 weeks
Journal description: The Spine Journal, the official journal of the North American Spine Society, is an international and multidisciplinary journal that publishes original, peer-reviewed articles on research and treatment related to the spine and spine care, including basic science and clinical investigations. It is a condition of publication that manuscripts submitted to The Spine Journal have not been published, and will not be simultaneously submitted or published elsewhere. The Spine Journal also publishes major reviews of specific topics by acknowledged authorities, technical notes, teaching editorials, and other special features. Letters to the Editor-in-Chief are encouraged.