Revolutionizing Patient Education: Artificial Intelligence Versus Experts in Ocular Dyskinesia Responses.

IF 1.3 | Medicine, Zone 4 | JCR Q3, Ophthalmology

Ophthalmic Plastic and Reconstructive Surgery | Pub Date: 2025-08-27 | DOI: 10.1097/IOP.0000000000003046

Daniel Bahir, Morris Hartstein, Cat Burkat, Daniel Ezra, Allan E Wulc, Ofira Zloto, John Holds, Shirin Hamed Azzam

Citations: 0

Abstract

Purpose: Ocular dyskinesia, including dystonic blepharospasm and hemifacial spasm, significantly impacts patient quality of life. This study evaluates the effectiveness of advanced artificial intelligence models (ChatGPT-3.5, GPT-4o, Gemini, and Gemini Advanced) compared with expert ophthalmologists in providing accurate, reliable, and patient-focused answers to common ocular dyskinesia-related questions.

Methods: A panel of oculoplastic surgeons developed 13 clinically relevant questions addressing symptoms, treatments, and posttreatment care for ocular dyskinesia. Anonymized responses from 4 artificial intelligence models (ChatGPT-3.5, GPT-4o, Gemini, and Gemini Advanced) and experts were evaluated by a panel of 7 international oculoplastic surgeons for correctness and reliability using a 7-point Likert scale. Statistical analyses were performed to identify differences among groups.

Results: ChatGPT-3.5 emerged as the top-performing model, achieving the highest correctness (mean score: 5.80) and reliability score (5.68), surpassing both GPT-4o (5.58/5.38) and the expert panel (5.56/5.31). GPT-4o closely mirrored expert performance, while Gemini and Gemini Advanced consistently lagged, reflecting lower correctness (4.67 and 5.03, respectively) and reliability scores. Statistical analysis confirmed significant differences across groups (p < 0.001).
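The reported correctness means can be compared side by side with a short script. This is only an illustrative sketch: the numbers are the mean correctness scores quoted in the Results, while the function names `gaps_vs_reference` and `ranked` are hypothetical helpers introduced here. The abstract reports only an overall difference across groups (p < 0.001), so the pairwise gaps printed below are descriptive and say nothing about which individual differences are statistically significant.

```python
# Mean correctness scores (7-point Likert scale) as quoted in the Results.
correctness = {
    "ChatGPT-3.5": 5.80,
    "GPT-4o": 5.58,
    "Expert panel": 5.56,
    "Gemini Advanced": 5.03,
    "Gemini": 4.67,
}


def gaps_vs_reference(scores, reference="Expert panel"):
    """Return each group's mean minus the reference group's mean (2 decimals)."""
    ref = scores[reference]
    return {name: round(value - ref, 2) for name, value in scores.items()}


def ranked(scores):
    """Return group names ordered from highest to lowest mean score."""
    return sorted(scores, key=scores.get, reverse=True)


gaps = gaps_vs_reference(correctness)
for name in ranked(correctness):
    # Descriptive comparison only; no significance is implied for any pair.
    print(f"{name:16s} {correctness[name]:.2f} ({gaps[name]:+.2f} vs experts)")
```

On these figures GPT-4o sits within 0.02 points of the expert panel, which matches the abstract's description of it as closely mirroring expert performance.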

Conclusions: ChatGPT-3.5 demonstrates exceptional potential in transforming patient education regarding ocular dyskinesia, delivering highly accurate and patient-accessible responses. While GPT-4o and experts offer strong, clinically sound insights, the Gemini models require refinement to meet higher benchmarks. These findings underscore the potential role of artificial intelligence in complementing human expertise, paving the way for innovative and collaborative approaches to patient care and education.

Source journal

Ophthalmic Plastic and Reconstructive Surgery (Medicine: Surgery)

CiteScore: 2.50

Self-citation rate: 10.00%

Articles published: 322

Review time: 3-8 weeks

About the journal: Ophthalmic Plastic and Reconstructive Surgery features original articles and reviews on topics such as ptosis, eyelid reconstruction, orbital diagnosis and surgery, lacrimal problems, and eyelid malposition. Update reports on diagnostic techniques, surgical equipment and instrumentation, and medical therapies are included, as well as detailed analyses of recent research findings and their clinical applications.