Using ChatGPT to Improve Readability of Interventional Radiology Procedure Descriptions.

IF 2.8, Zone 3 (Medicine), Q2 CARDIAC & CARDIOVASCULAR SYSTEMS

CardioVascular and Interventional Radiology. Pub Date: 2024-08-01. Epub Date: 2024-07-09. DOI: 10.1007/s00270-024-03803-z

Hossam A Zaki, Michelle Mai, Hazem Abdel-Megid, Sabrina Q R Liew, Simon Kidanemariam, Abdifatah S Omar, Urvi Tiwari, Jad Hamze, Sun Ho Ahn, Aaron W P Maxwell


Citations: 0

Abstract



Purpose: This project examines ChatGPT's potential to enhance the readability of patient educational materials about interventional radiology (IR) procedures.

Methods and materials: The descriptions of IR procedures from the Cardiovascular and Interventional Radiological Society of Europe (CIRSE) were used as the original text. Readability scores were calculated using three metrics: Flesch Reading Ease (FRE), Gunning Fog (GF), and the Automated Readability Index (ARI) using an online calculator ( https://readabilityformulas.com ). FRE is scored on a scale of 0-100, where 100 indicates easy-to-read texts, and GF and ARI represent the grade level required to comprehend the text. The DISCERN instrument measured credibility and reliability. ChatGPT was prompted to simplify the texts to a fifth-grade reading level, with subsequent recalculation of readability and DISCERN scores for comparison. Statistical significance was determined using a Wilcoxon Signed-Rank Test. Articles were subsequently organized by subgroups and analyzed.
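The three readability metrics named above have standard published formulas, which can be sketched as follows. This is a minimal illustration, not the online calculator's implementation: the tokenization, the vowel-group syllable heuristic, and the function names (`count_syllables`, `score_text`, `fre_band`) are assumptions made here, and the Gunning Fog count simply treats any word of three or more estimated syllables as complex. The FRE band labels follow the conventional Flesch scale, which matches the "Difficult" and "Fairly Easy" labels used in the abstract.

```python
import re


def count_syllables(word):
    """Estimate syllables by counting vowel groups (a rough heuristic)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("make") unless the word ends in "le" ("table").
    if count > 1 and word.endswith("e") and not word.endswith("le"):
        count -= 1
    return max(count, 1)


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def gunning_fog(words, sentences, complex_words):
    """Gunning Fog grade level; complex_words = words with 3+ syllables."""
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))


def automated_readability_index(characters, words, sentences):
    """ARI grade level from characters per word and words per sentence."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


def fre_band(score):
    """Map an FRE score to the conventional Flesch difficulty label."""
    bands = [
        (90, "Very Easy"),
        (80, "Easy"),
        (70, "Fairly Easy"),
        (60, "Standard"),
        (50, "Fairly Difficult"),
        (30, "Difficult"),
    ]
    for cutoff, label in bands:
        if score >= cutoff:
            return label
    return "Very Confusing"


def score_text(text):
    """Compute all three metrics for a text using simple regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    characters = sum(ch.isalnum() for w in words for ch in w)
    n_words, n_sentences = len(words), len(sentences)
    fre = flesch_reading_ease(n_words, n_sentences, sum(syllables))
    return {
        "FRE": fre,
        "FRE band": fre_band(fre),
        "GF": gunning_fog(n_words, n_sentences, sum(s >= 3 for s in syllables)),
        "ARI": automated_readability_index(characters, n_words, n_sentences),
    }
```

Calling `score_text` on an original description and on its simplified version gives the before and after values for one procedure. Scores from this sketch will not match the online calculator exactly, because syllable and sentence detection differ between tools.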

Results: 73 interventional radiology procedures from CIRSE were analyzed. The original FRE score was 47.2 (Difficult), improved to 78.4 (Fairly Easy) by ChatGPT. GF and ARI scores dropped from 14.4 and 11.2 to 7.8 and 5.8, respectively, after simplification, showing significant improvement (p < 0.001). However, the average DISCERN score decreased from 3.73 to 2.99 (p < 0.001) post-ChatGPT simplification.
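The significance values above come from a paired comparison of each procedure's score before and after simplification. A minimal sketch of a two-sided Wilcoxon signed-rank test is given below, assuming the normal approximation with no tie or continuity correction; the abstract does not state which software or variant the authors used, and in practice a statistics library such as `scipy.stats.wilcoxon` would be the usual choice. The per-procedure scores are not given in the abstract, so the numbers in the demo are invented placeholders, not study data.

```python
import math


def wilcoxon_signed_rank(before, after):
    """Return (w_plus, w_minus, z, p) for paired samples.

    Uses the normal approximation without tie or continuity correction,
    so the p-value is only approximate for small samples.
    """
    if len(before) != len(after):
        raise ValueError("before and after must have the same length")
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    # Under the null hypothesis, min(W+, W-) is approximately normal.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return w_plus, w_minus, z, p


if __name__ == "__main__":
    # Invented placeholder FRE scores for six texts (NOT data from the study).
    fre_before = [40.0, 45.0, 50.0, 55.0, 60.0, 42.0]
    fre_after = [70.0, 80.0, 75.0, 85.0, 78.0, 81.0]
    print(wilcoxon_signed_rank(fre_before, fre_after))
```

With 73 paired texts, as in the study, the normal approximation is reasonable; for very small samples an exact test is preferable.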

Conclusion: This study shows ChatGPT's ability to make interventional radiology descriptions more readable but highlights its struggle to maintain the original's reliability, suggesting the need for human review and prompt engineering to enhance outcomes.

Level of evidence: Level 6.


Source journal

CardioVascular and Interventional Radiology (category: Medicine, Nuclear Medicine)

CiteScore: 5.50

Self-citation rate: 13.80%

Articles published: 306

Review time: 3-8 weeks

About the journal: CardioVascular and Interventional Radiology (CVIR) is the official journal of the Cardiovascular and Interventional Radiological Society of Europe, and is also the official organ of a number of additional distinguished national and international interventional radiological societies. CVIR publishes double-blind peer-reviewed original research, including clinical and laboratory investigations, technical notes, case reports, works in progress, and letters to the editor, as well as review articles, pictorial essays, editorials, and special invited submissions in the field of vascular and interventional radiology. Besides communicating the latest research results in this field, CVIR also aims to support continuous medical education. Articles are accepted for publication on the understanding that they, or their substantive contents, have not been and will not be submitted to any other publication.