Evaluating Google Gemini's AI-generated responses to questions after breast reconstruction

Impact Factor: 2.0 | CAS Tier 3 (Medicine) | JCR Q2 (Surgery)
Aaron N. Hendizadeh, Nicholas Schmitz, Annie Fritsch, Lauren Martin, Jaime Pardo Palau, Christodoulos Kaoutzanis, Mamtha Raj, David E. Kurlander, Darren Nin, George Kokosis
{"title":"评估谷歌Gemini的人工智能对乳房重建后问题的回答","authors":"Aaron N. Hendizadeh ,&nbsp;Nicholas Schmitz ,&nbsp;Annie Fritsch ,&nbsp;Lauren Martin ,&nbsp;Jaime Pardo Palau ,&nbsp;Christodoulos Kaoutzanis ,&nbsp;Mamtha Raj ,&nbsp;David E. Kurlander ,&nbsp;Darren Nin ,&nbsp;George Kokosis","doi":"10.1016/j.bjps.2025.04.021","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><div>The purpose of this study was to evaluate Google Gemini’s responses to common post-operative questions pertaining to breast reconstruction surgery.</div></div><div><h3>Methods</h3><div>Google Gemini AI was prompted with 14 common post-operative questions related to breast reconstruction surgery. Four experienced breast reconstructive surgeons and four advanced practice providers (APPs) evaluated the responses for accuracy, completeness, relevance, and overall quality on a 4-point Likert scale. Median scores were calculated and utilized as the final score. Responses were further categorized as accurate vs inaccurate, complete vs incomplete, relevant vs irrelevant, and high vs low quality. Readability was evaluated using the Fleisch-Kincaid reading scale.</div></div><div><h3>Results</h3><div>Attending surgeons classified 12/14 responses (86%) as accurate, 12/14 (86%) as complete, 13/14 (93%) as relevant, and 12/14 (86%) as high quality. APPs rated 11/14 responses (79%) as accurate, 12/14 (86%) as complete, 14 /14 (100%) as relevant, and 10/14 (71%) as high quality. APPs assigned lower median scores for overall quality than physicians (p=0.003). The mean Flesch-Kincaid readability score was 52.3.</div></div><div><h3>Conclusion</h3><div>Google Gemini provided relevant and complete responses to common post-operative questions pertaining to breast reconstruction. The “fairly difficult” readability score may pose challenges for certain patient populations. Differences in scores between physicians and APPs for “overall quality” may be due to higher levels of experience among physicians, allowing them to evaluate answers in a broader context. While Google Gemini demonstrates potential as a tool for patient education, patients should be advised to consult with clinicians for the most reliable and personalized medical advice.</div></div>","PeriodicalId":50084,"journal":{"name":"Journal of Plastic Reconstructive and Aesthetic Surgery","volume":"105 ","pages":"Pages 185-188"},"PeriodicalIF":2.0000,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating Google Gemini’s AI generated responses to questions after breast reconstruction\",\"authors\":\"Aaron N. Hendizadeh ,&nbsp;Nicholas Schmitz ,&nbsp;Annie Fritsch ,&nbsp;Lauren Martin ,&nbsp;Jaime Pardo Palau ,&nbsp;Christodoulos Kaoutzanis ,&nbsp;Mamtha Raj ,&nbsp;David E. Kurlander ,&nbsp;Darren Nin ,&nbsp;George Kokosis\",\"doi\":\"10.1016/j.bjps.2025.04.021\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background</h3><div>The purpose of this study was to evaluate Google Gemini’s responses to common post-operative questions pertaining to breast reconstruction surgery.</div></div><div><h3>Methods</h3><div>Google Gemini AI was prompted with 14 common post-operative questions related to breast reconstruction surgery. Four experienced breast reconstructive surgeons and four advanced practice providers (APPs) evaluated the responses for accuracy, completeness, relevance, and overall quality on a 4-point Likert scale. 
Median scores were calculated and utilized as the final score. Responses were further categorized as accurate vs inaccurate, complete vs incomplete, relevant vs irrelevant, and high vs low quality. Readability was evaluated using the Fleisch-Kincaid reading scale.</div></div><div><h3>Results</h3><div>Attending surgeons classified 12/14 responses (86%) as accurate, 12/14 (86%) as complete, 13/14 (93%) as relevant, and 12/14 (86%) as high quality. APPs rated 11/14 responses (79%) as accurate, 12/14 (86%) as complete, 14 /14 (100%) as relevant, and 10/14 (71%) as high quality. APPs assigned lower median scores for overall quality than physicians (p=0.003). The mean Flesch-Kincaid readability score was 52.3.</div></div><div><h3>Conclusion</h3><div>Google Gemini provided relevant and complete responses to common post-operative questions pertaining to breast reconstruction. The “fairly difficult” readability score may pose challenges for certain patient populations. Differences in scores between physicians and APPs for “overall quality” may be due to higher levels of experience among physicians, allowing them to evaluate answers in a broader context. While Google Gemini demonstrates potential as a tool for patient education, patients should be advised to consult with clinicians for the most reliable and personalized medical advice.</div></div>\",\"PeriodicalId\":50084,\"journal\":{\"name\":\"Journal of Plastic Reconstructive and Aesthetic Surgery\",\"volume\":\"105 \",\"pages\":\"Pages 185-188\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2025-04-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Plastic Reconstructive and Aesthetic Surgery\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1748681525002645\",\"RegionNum\":3,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"SURGERY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Plastic Reconstructive and Aesthetic Surgery","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1748681525002645","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"SURGERY","Score":null,"Total":0}
Citations: 0

Abstract


Background

The purpose of this study was to evaluate Google Gemini’s responses to common post-operative questions pertaining to breast reconstruction surgery.

Methods

Google Gemini AI was prompted with 14 common post-operative questions related to breast reconstruction surgery. Four experienced breast reconstructive surgeons and four advanced practice providers (APPs) evaluated the responses for accuracy, completeness, relevance, and overall quality on a 4-point Likert scale. Median scores were calculated and used as the final score. Responses were further categorized as accurate vs inaccurate, complete vs incomplete, relevant vs irrelevant, and high vs low quality. Readability was evaluated using the Flesch-Kincaid reading scale.
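The mean readability reported below is 52.3, and the Conclusion labels it "fairly difficult", which corresponds to the 0-100 Flesch Reading Ease scale. The following is a minimal Python sketch of that calculation; the regex-based syllable counter and the sample response text are illustrative assumptions, not the authors' tooling.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels (minimum 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Scores of 50-60 are conventionally described as 'fairly difficult'."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))

# Hypothetical post-operative answer, used only to demonstrate the scoring.
response = ("Mild swelling and bruising are common after breast reconstruction. "
            "Contact your surgical team if you notice fever, spreading redness, or drainage.")
print(round(flesch_reading_ease(response), 1))
```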

Results

Attending surgeons classified 12/14 responses (86%) as accurate, 12/14 (86%) as complete, 13/14 (93%) as relevant, and 12/14 (86%) as high quality. APPs rated 11/14 responses (79%) as accurate, 12/14 (86%) as complete, 14/14 (100%) as relevant, and 10/14 (71%) as high quality. APPs assigned lower median scores for overall quality than physicians (p=0.003). The mean Flesch-Kincaid readability score was 52.3.
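The abstract does not name the statistical test behind p=0.003. For ordinal Likert ratings compared across two rater groups, a rank-based test such as the Mann-Whitney U test is one plausible choice; the sketch below uses entirely invented scores purely to illustrate the mechanics, not to reproduce the study's data.

```python
from scipy.stats import mannwhitneyu

# Hypothetical 4-point Likert ratings of "overall quality" for the 14 responses.
# These numbers are invented for illustration; the paper's raw ratings and its
# choice of test are not reported in the abstract.
physician_scores = [4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 4, 4, 4, 3]
app_scores = [3, 3, 2, 3, 4, 3, 3, 2, 3, 3, 3, 4, 3, 2]

stat, p_value = mannwhitneyu(physician_scores, app_scores, alternative="two-sided")
print(f"Mann-Whitney U = {stat:.1f}, p = {p_value:.3f}")
```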

Conclusion

Google Gemini provided relevant and complete responses to common post-operative questions pertaining to breast reconstruction. The “fairly difficult” readability score may pose challenges for certain patient populations. Differences in scores between physicians and APPs for “overall quality” may be due to higher levels of experience among physicians, allowing them to evaluate answers in a broader context. While Google Gemini demonstrates potential as a tool for patient education, patients should be advised to consult with clinicians for the most reliable and personalized medical advice.
Source journal
CiteScore: 3.10
Self-citation rate: 11.10%
Articles published: 578
Review time: 3.5 months
Journal description: JPRAS, An International Journal of Surgical Reconstruction, is one of the world's leading international journals, covering all reconstructive and aesthetic aspects of plastic surgery. The journal presents the latest surgical procedures, with audit and outcome studies of new and established techniques in plastic surgery, including: cleft lip and palate and other head and neck surgery, hand surgery, lower limb trauma, burns, skin cancer, breast surgery, and aesthetic surgery.