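Performance of Artificial Intelligence Chatbots in Answering Clinical Questions on Japanese Practical Guidelines for Implant-based Breast Reconstruction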

Impact Factor: 2.0 | Zone 3, Medicine | JCR Q2, Surgery
Makoto Shiraishi, Yoshihiro Sowa, Koichi Tomita, Yasunobu Terao, Toshihiko Satake, Mayu Muto, Yuhei Morita, Shino Higai, Yoshihiro Toyohara, Yasue Kurokawa, Ataru Sunaga, Mutsumi Okazaki
Aesthetic Plastic Surgery. Journal Article. Published 2024-11-26. DOI: 10.1007/s00266-024-04515-y (https://doi.org/10.1007/s00266-024-04515-y)
Citations: 0

Abstract

Performance of Artificial Intelligence Chatbots in Answering Clinical Questions on Japanese Practical Guidelines for Implant-based Breast Reconstruction.

Background: Artificial intelligence (AI) chatbots, including ChatGPT-4 (GPT-4) and Grok-1 (Grok), have been shown to be potentially useful in several medical fields, but have not been examined in plastic and aesthetic surgery. The aim of this study is to evaluate the responses of these AI chatbots to clinical questions (CQs) related to the guidelines for implant-based breast reconstruction (IBBR) published by the Japan Society of Plastic and Reconstructive Surgery (JSPRS) in 2021.

Methods: CQs in the JSPRS guidelines were used as question sources. Responses from two AI chatbots, GPT-4 and Grok, were evaluated for accuracy, informativeness, and readability by five Japanese Board-certified breast reconstruction specialists and five Japanese clinical fellows of plastic surgery.

Results: GPT-4 significantly outperformed Grok in accuracy (p < 0.001), informativeness (p < 0.001), and readability (p < 0.001) when evaluated by plastic surgery fellows. Compared to the original guidelines, Grok scored significantly lower in all three areas (all p < 0.001). Plastic surgery fellows rated the accuracy of GPT-4 significantly higher than breast reconstruction specialists did (p = 0.012), whereas the two groups' scores for Grok did not differ significantly.

Conclusions: The study suggests that GPT-4 has the potential to assist in interpreting and applying clinical guidelines for IBBR; importantly, however, there is still a risk that AI chatbots can misinform. Further studies are needed to understand the broader role of current and future AI chatbots in breast reconstruction surgery.

Level of Evidence IV: This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine Ratings, please refer to the Table of Contents or the online Instructions to Authors at www.springer.com/00266.

Source journal
CiteScore: 4.40
Self-citation rate: 25.00%
Articles published: 479
Review time: 3 months
Journal description: Aesthetic Plastic Surgery is a publication of the International Society of Aesthetic Plastic Surgery and the official journal of the European Association of Societies of Aesthetic Plastic Surgery (EASAPS), Società Italiana di Chirurgia Plastica Ricostruttiva ed Estetica (SICPRE), Vereinigung der Deutschen Aesthetisch Plastischen Chirurgen (VDAPC), the Romanian Aesthetic Surgery Society (RASS), Asociación Española de Cirugía Estética Plástica (AECEP), La Sociedad Argentina de Cirugía Plástica, Estética y Reparadora (SACPER), the Rhinoplasty Society of Europe (RSE), the Iranian Society of Plastic and Aesthetic Surgeons (ISPAS), the Singapore Association of Plastic Surgeons (SAPS), the Australasian Society of Aesthetic Plastic Surgeons (ASAPS), the Egyptian Society of Plastic and Reconstructive Surgeons (ESPRS), and the Sociedad Chilena de Cirugía Plástica, Reconstructiva y Estética (SCCP). Aesthetic Plastic Surgery provides a forum for original articles advancing the art of aesthetic plastic surgery. Many describe surgical craftsmanship; others deal with complications in surgical procedures and methods by which to treat or avoid them. Coverage includes "second thoughts" on established techniques, which might be abandoned, modified, or improved. Also included are case histories; improvements in surgical instruments, pharmaceuticals, and operating room equipment; and discussions of problems such as the role of psychosocial factors in the doctor-patient and the patient-public interrelationships. Aesthetic Plastic Surgery is covered in Current Contents/Clinical Medicine, SciSearch, Research Alert, Index Medicus-Medline, and Excerpta Medica/Embase.