James J. Butler, Michael C. Harrington, Yixuan Tong, Andrew J. Rosenbaum, Alan P. Samsonov, Raymond J. Walls, John G. Kennedy
Title: From jargon to clarity: Improving the readability of foot and ankle radiology reports with an artificial intelligence large language model
Journal: Foot and Ankle Surgery
Publication date: 2024-02-05
DOI: 10.1016/j.fas.2024.01.008
URL: https://www.sciencedirect.com/science/article/pii/S1268773124000262
Citations: 0
Abstract
Background
The purpose of this study was to evaluate the efficacy of an Artificial Intelligence Large Language Model (AI-LLM) at improving the readability of foot and ankle orthopedic radiology reports.
Methods
The radiology reports from 100 foot or ankle X-Rays, 100 computed tomography (CT) scans and 100 magnetic resonance imaging (MRI) scans were randomly sampled from the institution’s database. The following prompt was entered into the AI-LLM: “Explain this radiology report to a patient in layman's terms in the second person: [Report Text]”. The mean report length, Flesch reading ease score (FRES) and Flesch-Kincaid reading level (FKRL) were evaluated for both the original radiology report and the AI-LLM-generated report. The accuracy of the information contained within the AI-LLM-generated report was assessed via a 5-point Likert scale. Additionally, any “hallucinations” in the AI-LLM-generated report were recorded.
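The two readability measures named above follow standard published formulas, which can be sketched as below. This is a minimal illustration, not the study's scoring pipeline: the abstract does not say which tool was used, the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic, so values will differ somewhat from dedicated readability software. The sketch assumes non-empty English text.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_counts(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(n_sent: int, n_words: int, n_syll: int) -> float:
    """Standard Flesch reading ease formula: higher means easier to read."""
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


def flesch_kincaid_grade(n_sent: int, n_words: int, n_syll: int) -> float:
    """Standard Flesch-Kincaid grade level formula: approximate US school grade."""
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


if __name__ == "__main__":
    # Hypothetical patient-facing sentence, not taken from the study data.
    sample = "You have a small break. It will heal."
    counts = text_counts(sample)
    print("FRES:", round(flesch_reading_ease(*counts), 1))
    print("FKRL:", round(flesch_kincaid_grade(*counts), 1))
```

Both formulas depend only on average sentence length and average syllables per word, which is why shorter sentences and plainer vocabulary raise the FRES and lower the FKRL.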
Results
There was a statistically significant improvement in mean FRES for the AI-LLM-generated X-Ray reports (33.8 ± 6.8 to 72.7 ± 5.4), CT reports (27.8 ± 4.6 to 67.5 ± 4.9) and MRI reports (20.3 ± 7.2 to 66.9 ± 3.9), all p < 0.001. There was also a statistically significant improvement in mean FKRL for the AI-LLM-generated X-Ray reports (12.2 ± 1.1 to 8.5 ± 0.4), CT reports (15.4 ± 2.0 to 8.4 ± 0.6) and MRI reports (14.1 ± 1.6 to 8.5 ± 0.5), all p < 0.001. The AI-LLM-generated X-Ray reports had a higher mean FRES than the AI-LLM-generated CT and MRI reports (p < 0.001). The mean Likert scores for the AI-LLM-generated X-Ray, CT and MRI reports were 4.0 ± 0.3, 3.9 ± 0.4 and 3.9 ± 0.4, respectively. The hallucination rates in the AI-LLM-generated X-Ray, CT and MRI reports were 4%, 7% and 6%, respectively.
Conclusion
The AI-LLM was an efficacious tool for improving the readability of foot and ankle radiology reports across multiple imaging modalities. Superior FRES and Likert scores were observed in the AI-LLM-generated X-Ray reports compared to the AI-LLM-generated CT and MRI reports. This study demonstrates the potential of AI-LLMs as a new patient-centric approach for enhancing patients' understanding of their foot and ankle radiology reports. Jel Classifications: IV
About the journal:
Foot and Ankle Surgery is essential reading for everyone interested in the foot and ankle and its disorders. The approach is broad and includes all aspects of the subject from basic science to clinical management. Problems of both children and adults are included, as are trauma and chronic disease. Foot and Ankle Surgery is the official journal of the European Foot and Ankle Society.
The aims of this journal are to promote the art and science of ankle and foot surgery, to publish peer-reviewed research articles, to provide regular reviews by acknowledged experts on common problems, and to provide a forum for discussion with letters to the Editors. Reviews of books are also published. Papers are invited for possible publication in Foot and Ankle Surgery on the understanding that the material has not been published elsewhere or accepted for publication in another journal and does not infringe prior copyright.