Enhancing Magnetic Resonance Imaging (MRI) Report Comprehension in Spinal Trauma: Readability Analysis of AI-Generated Explanations for Thoracolumbar Fractures.

JMIR AI · Pub Date: 2025-07-01 · DOI: 10.2196/69654
David C Sing, Kishan S Shah, Michael Pompliano, Paul H Yi, Calogero Velluto, Ali Bagheri, Robert K Eastlack, Stephen R Stephan, Gregory M Mundis
{"title":"增强对脊柱创伤磁共振成像(MRI)报告的理解:人工智能对胸腰椎骨折解释的可读性分析。","authors":"David C Sing, Kishan S Shah, Michael Pompliano, Paul H Yi, Calogero Velluto, Ali Bagheri, Robert K Eastlack, Stephen R Stephan, Gregory M Mundis","doi":"10.2196/69654","DOIUrl":null,"url":null,"abstract":"<p><strong>Background: </strong>Magnetic resonance imaging (MRI) reports are challenging for patients to interpret and may subject patients to unnecessary anxiety. The advent of advanced artificial intelligence (AI) large language models (LLMs), such as GPT-4o, hold promise for translating complex medical information into layman terms.</p><p><strong>Objective: </strong>This paper aims to evaluate the accuracy, helpfulness, and readability of GPT-4o in explaining MRI reports of patients with thoracolumbar fractures.</p><p><strong>Methods: </strong>MRI reports of 20 patients presenting with thoracic or lumbar vertebral body fractures were obtained. GPT-4o was prompted to explain the MRI report in layman's terms. The generated explanations were then presented to 7 board-certified spine surgeons for evaluation on the reports' helpfulness and accuracy. The MRI report text and GPT-4o explanations were then analyzed to grade the readability of the texts using the Flesch Readability Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL) Scale.</p><p><strong>Results: </strong>The layman explanations provided by GPT-4o were found to be helpful by all surgeons in 17 cases, with 6 of 7 surgeons finding the information helpful in the remaining 3 cases. ChatGPT-generated layman reports were rated as \"accurate\" by all 7 surgeons in 11/20 cases (55%). In an additional 5/20 cases (25%), 6 out of 7 surgeons agreed on their accuracy. In the remaining 4/20 cases (20%), accuracy ratings varied, with 4 or 5 surgeons considering them accurate. Review of surgeon feedback on inaccuracies revealed that the radiology reports were often insufficiently detailed. The mean FRES score of the MRI reports was significantly lower than the GPT-4o explanations (32.15, SD 15.89 vs 53.9, SD 7.86; P<.001). The mean FKGL score of the MRI reports trended higher compared to the GPT-4o explanations (11th-12th grade vs 10th-11th grade level; P=.11).</p><p><strong>Conclusions: </strong>Overall helpfulness and readability ratings for AI-generated summaries of MRI reports were high, with few inaccuracies recorded. This study demonstrates the potential of GPT-4o to serve as a valuable tool for enhancing patient comprehension of MRI report findings.</p>","PeriodicalId":73551,"journal":{"name":"JMIR AI","volume":"4 ","pages":"e69654"},"PeriodicalIF":0.0000,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12231343/pdf/","citationCount":"0","resultStr":"{\"title\":\"Enhancing Magnetic Resonance Imaging (MRI) Report Comprehension in Spinal Trauma: Readability Analysis of AI-Generated Explanations for Thoracolumbar Fractures.\",\"authors\":\"David C Sing, Kishan S Shah, Michael Pompliano, Paul H Yi, Calogero Velluto, Ali Bagheri, Robert K Eastlack, Stephen R Stephan, Gregory M Mundis\",\"doi\":\"10.2196/69654\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Background: </strong>Magnetic resonance imaging (MRI) reports are challenging for patients to interpret and may subject patients to unnecessary anxiety. 
The advent of advanced artificial intelligence (AI) large language models (LLMs), such as GPT-4o, hold promise for translating complex medical information into layman terms.</p><p><strong>Objective: </strong>This paper aims to evaluate the accuracy, helpfulness, and readability of GPT-4o in explaining MRI reports of patients with thoracolumbar fractures.</p><p><strong>Methods: </strong>MRI reports of 20 patients presenting with thoracic or lumbar vertebral body fractures were obtained. GPT-4o was prompted to explain the MRI report in layman's terms. The generated explanations were then presented to 7 board-certified spine surgeons for evaluation on the reports' helpfulness and accuracy. The MRI report text and GPT-4o explanations were then analyzed to grade the readability of the texts using the Flesch Readability Ease Score (FRES) and Flesch-Kincaid Grade Level (FKGL) Scale.</p><p><strong>Results: </strong>The layman explanations provided by GPT-4o were found to be helpful by all surgeons in 17 cases, with 6 of 7 surgeons finding the information helpful in the remaining 3 cases. ChatGPT-generated layman reports were rated as \\\"accurate\\\" by all 7 surgeons in 11/20 cases (55%). In an additional 5/20 cases (25%), 6 out of 7 surgeons agreed on their accuracy. In the remaining 4/20 cases (20%), accuracy ratings varied, with 4 or 5 surgeons considering them accurate. Review of surgeon feedback on inaccuracies revealed that the radiology reports were often insufficiently detailed. The mean FRES score of the MRI reports was significantly lower than the GPT-4o explanations (32.15, SD 15.89 vs 53.9, SD 7.86; P<.001). The mean FKGL score of the MRI reports trended higher compared to the GPT-4o explanations (11th-12th grade vs 10th-11th grade level; P=.11).</p><p><strong>Conclusions: </strong>Overall helpfulness and readability ratings for AI-generated summaries of MRI reports were high, with few inaccuracies recorded. This study demonstrates the potential of GPT-4o to serve as a valuable tool for enhancing patient comprehension of MRI report findings.</p>\",\"PeriodicalId\":73551,\"journal\":{\"name\":\"JMIR AI\",\"volume\":\"4 \",\"pages\":\"e69654\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12231343/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JMIR AI\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2196/69654\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JMIR AI","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2196/69654","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract


Background: Magnetic resonance imaging (MRI) reports are challenging for patients to interpret and may subject patients to unnecessary anxiety. The advent of advanced artificial intelligence (AI) large language models (LLMs), such as GPT-4o, holds promise for translating complex medical information into layman's terms.

Objective: This paper aims to evaluate the accuracy, helpfulness, and readability of GPT-4o in explaining MRI reports of patients with thoracolumbar fractures.

Methods: MRI reports of 20 patients presenting with thoracic or lumbar vertebral body fractures were obtained. GPT-4o was prompted to explain each MRI report in layman's terms. The generated explanations were then presented to 7 board-certified spine surgeons, who rated the explanations' helpfulness and accuracy. The readability of the original MRI report text and the GPT-4o explanations was then graded using the Flesch Reading Ease Score (FRES) and the Flesch-Kincaid Grade Level (FKGL) scale.
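To make the prompting step concrete, a minimal sketch of how an MRI report could be sent to GPT-4o through the OpenAI Python SDK is shown below. The abstract does not disclose the study's actual prompt or generation settings, so the system/user wording, temperature, and function name here are illustrative assumptions rather than the authors' method, and real reports would need de-identification before being sent to any external API.

```python
# Illustrative sketch only: the study's exact prompt and settings are not published
# in the abstract, so the wording below is an assumption, not the authors' prompt.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def explain_mri_report(report_text: str) -> str:
    """Ask GPT-4o for a layman's-terms explanation of a (de-identified) MRI report."""
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=0.2,  # assumed setting; favors consistent, conservative wording
        messages=[
            {
                "role": "system",
                "content": (
                    "You explain spine MRI reports to patients in plain,"
                    " non-alarming language, without adding new diagnoses."
                ),
            },
            {
                "role": "user",
                "content": f"Explain this MRI report in layman's terms:\n\n{report_text}",
            },
        ],
    )
    return response.choices[0].message.content
```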

Results: The layman explanations provided by GPT-4o were rated helpful by all 7 surgeons in 17 of 20 cases; in the remaining 3 cases, 6 of 7 surgeons found them helpful. The GPT-4o-generated layman explanations were rated "accurate" by all 7 surgeons in 11/20 cases (55%). In an additional 5/20 cases (25%), 6 of 7 surgeons agreed on their accuracy. In the remaining 4/20 cases (20%), accuracy ratings varied, with 4 or 5 surgeons considering them accurate. Review of surgeon feedback on inaccuracies revealed that the underlying radiology reports were often insufficiently detailed. The mean FRES of the MRI reports was significantly lower than that of the GPT-4o explanations (32.15, SD 15.89 vs 53.9, SD 7.86; P<.001). The mean FKGL of the MRI reports trended higher than that of the GPT-4o explanations (11th-12th grade vs 10th-11th grade level; P=.11).
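For context on the score comparison above, both indices follow standard published formulas: FRES = 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word), where higher values mean easier reading, and FKGL = 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59, an approximate US school grade level. The sketch below uses a naive syllable heuristic purely for illustration; the abstract does not state which implementation the authors used (dedicated packages such as textstat compute the same scores).

```python
# Rough illustration of the FRES and FKGL formulas; the syllable counter is a
# crude vowel-group heuristic, not the tooling used in the study.
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (minimum of one)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1  # drop a typical silent final 'e'
    return max(n, 1)


def readability(text: str) -> tuple[float, float]:
    """Return (FRES, FKGL) for a block of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / max(len(sentences), 1)
    syllables_per_word = syllables / max(len(words), 1)
    fres = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return fres, fkgl


# Hypothetical examples: dense report phrasing vs. a plain-language rewrite
print(readability("Acute compression deformity of the L1 vertebral body with 30% height loss."))
print(readability("The top bone of your lower back has a new break and is slightly squashed."))
```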

Conclusions: Overall helpfulness and readability ratings for AI-generated summaries of MRI reports were high, with few inaccuracies recorded. This study demonstrates the potential of GPT-4o to serve as a valuable tool for enhancing patient comprehension of MRI report findings.
