Evaluation of ChatGPT-4o, Claude 3.5 Sonnet, and Google Gemini 2.0 Flash as Patient Education Resources for Upper Blepharoplasty Patients.
Authors: Suleyman Demir, İsmail Cem Türkeş
Journal: Journal of Craniofacial Surgery (impact factor 1.0; JCR Q3, Surgery)
Publication type: Journal Article
Publication date: 2025-07-07
DOI: https://doi.org/10.1097/SCS.0000000000011608
Purpose: This study aimed to evaluate the effectiveness, accuracy, and readability of the leading large language models (LLMs) from 3 different companies, ChatGPT-4o, Claude 3.5 Sonnet, and Google Gemini 2.0 Flash, as patient education resources for upper blepharoplasty.
Methods: Twenty frequently asked questions about upper blepharoplasty were posed to the 3 LLMs. Two ophthalmologists recorded the responses to the questions and independently evaluated the accuracy of the LLMs using a 5-point Likert scale with scores ranging from 1 to 5. The readability of the analyzed texts was assessed using the SMOG index and the Coleman-Liau index.
Results: All models demonstrated high accuracy, with mean Likert scores exceeding 4.5. No statistically significant difference in Likert scores was observed among the 3 models (P=0.097). Claude 3.5 Sonnet generated the most complex responses (Coleman-Liau index: 17.34; SMOG index: 23.82 points), whereas Google Gemini 2.0 Flash produced the most comprehensible texts (Coleman-Liau index: 13.27; SMOG index: 15.04 points).
Conclusion: Large language models hold great promise as tools to educate patients about upper blepharoplasty. Future research should focus on simplifying language models without compromising accuracy, keeping models up-to-date, and minimizing bias to improve patient care and safety.
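The two readability measures named in the Methods can be illustrated with a minimal Python sketch. It uses the standard published formulas for the Coleman-Liau index and the SMOG grade. The abstract does not say which tool the authors used, so the function names, the sentence and word splitting rules, and the vowel-group syllable heuristic below are all assumptions made for illustration. Scores from this sketch will not reproduce the values reported in the Results.

```python
import math
import re


def split_sentences(text):
    # Assumption: a sentence ends at any run of '.', '!' or '?'.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    # Assumption: a word is a run of letters, optionally with one apostrophe part.
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def count_syllables(word):
    # Crude heuristic (not a dictionary lookup): count groups of consecutive
    # vowels, and drop a trailing silent 'e' unless the word ends in 'le'.
    w = word.lower()
    count = len(re.findall(r"[aeiouy]+", w))
    if w.endswith("e") and not w.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def coleman_liau(text):
    # CLI = 0.0588 * L - 0.296 * S - 15.8, where L is letters per 100 words
    # and S is sentences per 100 words.
    words = split_words(text)
    sentences = split_sentences(text)
    letters = sum(ch.isalpha() for w in words for ch in w)
    L = letters / len(words) * 100
    S = len(sentences) / len(words) * 100
    return 0.0588 * L - 0.296 * S - 15.8


def smog(text):
    # SMOG grade = 1.0430 * sqrt(polysyllables * 30 / sentences) + 3.1291,
    # where a polysyllable is a word with three or more syllables.
    sentences = split_sentences(text)
    polysyllables = sum(1 for w in split_words(text) if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291


if __name__ == "__main__":
    sample = "Blepharoplasty is surgery. Recovery usually takes several weeks."
    print(f"Coleman-Liau: {coleman_liau(sample):.2f}")
    print(f"SMOG: {smog(sample):.2f}")
```

Both indices approximate the years of schooling a reader needs to follow a text, so higher values mean harder text. SMOG was designed for samples of about 30 sentences, so results on very short answers should be read with caution.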
About the journal:
The Journal of Craniofacial Surgery serves as a forum of communication for all those involved in craniofacial surgery, maxillofacial surgery and pediatric plastic surgery. Coverage ranges from practical aspects of craniofacial surgery to the basic science that underlies surgical practice. The journal publishes original articles, scientific reviews, editorials and invited commentary, abstracts and selected articles from international journals, and occasional international bibliographies in craniofacial surgery.