Assessing the ability of large language models to simplify lumbar spine imaging reports into patient-facing text: a pilot study of GPT-4.

IF 2.2 | Zone 3, Medicine | JCR Q2, ORTHOPEDICS
Rushmin Khazanchi, Austin R Chen, Parth Desai, Daniel Herrera, Jacob R Staub, Matthew A Follett, Mykhaylo Krushelnytskyy, Hanna Kemeny, Wellington K Hsu, Alpesh A Patel, Srikanth N Divi
Skeletal Radiology. Published 2025-09-09. DOI: https://doi.org/10.1007/s00256-025-05027-9
Citations: 0

Abstract


Objective: To assess the ability of large language models (LLMs) to accurately simplify lumbar spine magnetic resonance imaging (MRI) reports.

Materials and methods: Patients who underwent lumbar decompression and/or fusion surgery in 2022 at one tertiary academic medical center were identified using appropriate CPT codes. We then selected all patients with a preoperative ICD diagnosis of lumbar spondylolisthesis and extracted the text of each patient's latest preoperative spine MRI radiology report. The GPT-4 API was run on the deidentified reports with a prompt to produce simplified translations, which were then evaluated for accuracy and readability. An enhanced GPT prompt was constructed using high-scoring reports and evaluated on low-scoring reports.

Results: Across the 93 included reports, GPT reduced the average reading level from 11.47 to 8.50 (p < 0.001). Most translations had no accuracy issues, but 34% omitted at least one clinically relevant piece of information and 6% contained a clinically significant inaccuracy. An enhanced prompt built from high-scoring reports maintained the reading level while significantly improving the omission rate (p < 0.0001). However, even with the enhanced prompt, GPT made several errors regarding the location of stenosis, the description of prior spine surgery, and the description of other spine pathologies.

Conclusion: GPT-4 effectively simplifies the reading level of lumbar spine MRI reports. The model tends to omit key information in its translations, which can be mitigated with enhanced prompting. Further validation in the domain of spine radiology needs to be performed to facilitate clinical integration.
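
The reading-level comparison reported above can be illustrated with a minimal sketch. The abstract does not say which readability formula the authors used; the Flesch-Kincaid grade level below is an assumption chosen for illustration, the syllable counter is a rough heuristic, and both sample sentences are hypothetical text written for this example, not taken from the study.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and drop a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level (assumed metric, not confirmed by the source)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


# Hypothetical report sentence and a hypothetical patient-facing rewrite.
original = (
    "Grade 1 anterolisthesis of L4 on L5 with severe bilateral "
    "neural foraminal stenosis."
)
simplified = (
    "One bone in your lower back has slipped forward a little. "
    "The openings where nerves exit are very narrow on both sides."
)

original_grade = flesch_kincaid_grade(original)
simplified_grade = flesch_kincaid_grade(simplified)
print(f"original: {original_grade:.2f}, simplified: {simplified_grade:.2f}")
```

A lower grade for the rewrite indicates easier text, but it says nothing about the omissions and inaccuracies the study also measured; those required separate review of each translation.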

Source journal
Skeletal Radiology (Medicine: Nuclear Medicine)
CiteScore: 4.40
Self-citation rate: 9.50%
Articles published: 253
Review time: 3-8 weeks
Journal description: Skeletal Radiology provides a forum for the dissemination of current knowledge and information dealing with disorders of the musculoskeletal system including the spine. While emphasizing the radiological aspects of the many varied skeletal abnormalities, the journal also adopts an interdisciplinary approach, reflecting the membership of the International Skeletal Society. Thus, the anatomical, pathological, physiological, clinical, metabolic and epidemiological aspects of the many entities affecting the skeleton receive appropriate consideration. This is the Journal of the International Skeletal Society and the Official Journal of the Society of Skeletal Radiology and the Australasian Musculoskeletal Imaging Group.