Luigi Angelo Vaira, Jerome R. Lechien, Antonino Maniaci, Giuseppe Tanda, Vincenzo Abbate, Fabiana Allevi, Antonio Arena, Giada Anna Beltramini, Michela Bergonzani, Alessandro Remigio Bolzoni, Salvatore Crimi, Andrea Frosolini, Guido Gabriele, Fabio Maglitto, Miguel Mayo-Yáñez, Ludovica Orrù, Marzia Petrocelli, Resi Pucci, Alberto Maria Saibene, Stefania Troise, Giacomo De Riu
Journal of Cranio-Maxillofacial Surgery, Volume 53, Issue 1 (January 2025), Pages 18–23. DOI: 10.1016/j.jcms.2024.10.002.
Evaluating AI-generated informed consent documents in oral surgery: A comparative study of ChatGPT-4, Bard Gemini Advanced, and human-written consents
This study evaluates the quality and readability of informed consent documents for common oral surgery procedures generated by the AI platforms ChatGPT-4 and Bard Gemini Advanced, compared with documents written by a first-year oral surgery resident. The evaluation, conducted by 18 experienced oral and maxillofacial surgeons, assessed the consents for accuracy, completeness, readability, and overall quality.
ChatGPT-4 consents consistently outperformed both the Bard and the human-written consents. ChatGPT-4 had a median accuracy score of 4 [IQR 4–4], compared with 3 [IQR 3–4] for Bard and 4 [IQR 3–4] for the human-written consents. Completeness scores were higher for ChatGPT-4 (4 [IQR 4–5]) than for Bard (3 [IQR 3–4]) and the human-written consents (4 [IQR 3–4]). Readability was also superior for ChatGPT-4, with a median score of 4 [IQR 4–5], versus 4 [IQR 4–4] for Bard and 4 [IQR 3–4] for the human-written consents. The Gunning Fog Index (lower scores indicate easier reading) was 17.2 [IQR 16.5–18.2] for ChatGPT-4, better than Bard's 23.1 [IQR 20.5–24.7] and the human-written consents' 20 [IQR 19.2–20.9].
Overall, ChatGPT-4's consents received the highest quality ratings, underscoring AI's potential in enhancing patient communication and the informed consent process. The study suggests AI can reduce misinformation risks and improve patient understanding, but continuous evaluation, oversight, and patient feedback integration are crucial to ensure the effectiveness and appropriateness of AI-generated content in clinical practice.
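For context, the Gunning Fog Index reported above is computed from average sentence length and the share of "complex" words (three or more syllables). Below is a minimal sketch of the standard formula, 0.4 × (average words per sentence + percentage of complex words), using a rough vowel-group syllable heuristic; this is an illustrative approximation, not the tool the study authors used.

```python
import re

def gunning_fog(text: str) -> float:
    """Estimate the Gunning Fog Index of an English text.

    Formula: 0.4 * (avg words per sentence + 100 * complex-word fraction),
    where a word is 'complex' if it has three or more syllables.
    Syllables are approximated by counting vowel groups (a common heuristic).
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word: str) -> int:
        # Count runs of vowels as syllables; at least one per word.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    complex_words = [w for w in words if syllables(w) >= 3]
    avg_sentence_len = len(words) / len(sentences)
    pct_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_len + pct_complex)
```

A score near 17 (ChatGPT-4's median) corresponds roughly to college-level reading, which is why even the best-scoring consents may still need simplification for general patient audiences.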
About the journal:
The Journal of Cranio-Maxillofacial Surgery publishes articles covering all aspects of surgery of the head, face and jaw. Specific topics covered recently have included:
• Distraction osteogenesis
• Synthetic bone substitutes
• Fibroblast growth factors
• Fetal wound healing
• Skull base surgery
• Computer-assisted surgery
• Vascularized bone grafts