Daniel Bahir, Morris Hartstein, Cat Burkat, Daniel Ezra, Allan E Wulc, Ofira Zloto, John Holds, Shirin Hamed Azzam
Journal: Ophthalmic Plastic and Reconstructive Surgery (impact factor 1.3; JCR Q3, Ophthalmology)
Published: 2025-08-27
Publication type: Journal Article
DOI: 10.1097/IOP.0000000000003046 (https://doi.org/10.1097/IOP.0000000000003046)
Revolutionizing Patient Education: Artificial Intelligence Versus Experts in Ocular Dyskinesia Responses.
Purpose: Ocular dyskinesia, including dystonic blepharospasm and hemifacial spasm, significantly impacts patient quality of life. This study evaluates the effectiveness of advanced artificial intelligence models (ChatGPT-3.5, GPT-4o, Gemini, and Gemini Advanced) compared with expert ophthalmologists in providing accurate, reliable, and patient-focused answers to common ocular dyskinesia-related questions.
Methods: A panel of oculoplastic surgeons developed 13 clinically relevant questions addressing symptoms, treatments, and posttreatment care for ocular dyskinesia. Anonymized responses from 4 artificial intelligence models (ChatGPT-3.5, GPT-4o, Gemini, and Gemini Advanced) and experts were evaluated by a panel of 7 international oculoplastic surgeons for correctness and reliability using a 7-point Likert scale. Statistical analyses were performed to identify differences among groups.
Results: ChatGPT-3.5 emerged as the top-performing model, achieving the highest correctness (mean score: 5.80) and reliability score (5.68), surpassing both GPT-4o (5.58/5.38) and the expert panel (5.56/5.31). GPT-4o closely mirrored expert performance, while Gemini and Gemini Advanced consistently lagged, reflecting lower correctness (4.67 and 5.03, respectively) and reliability scores. Statistical analysis confirmed significant differences across groups (p < 0.001).
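As a reading aid, the mean scores reported above can be tabulated and ranked. This is a minimal sketch, not the authors' analysis. The names `reported_means`, `rank_by`, and `gap_to_experts` are illustrative. Reliability means for the two Gemini models are not given in the abstract and are left as `None`.

```python
# Mean scores on the 7-point Likert scale, copied from the Results paragraph.
# None marks a value the abstract does not report (not a zero score).
reported_means = {
    "ChatGPT-3.5": {"correctness": 5.80, "reliability": 5.68},
    "GPT-4o": {"correctness": 5.58, "reliability": 5.38},
    "Experts": {"correctness": 5.56, "reliability": 5.31},
    "Gemini Advanced": {"correctness": 5.03, "reliability": None},
    "Gemini": {"correctness": 4.67, "reliability": None},
}


def rank_by(metric, scores=reported_means):
    """Return the groups that have a reported value for `metric`, highest first."""
    reported = {g: s[metric] for g, s in scores.items() if s[metric] is not None}
    return sorted(reported, key=reported.get, reverse=True)


def gap_to_experts(metric, scores=reported_means):
    """Mean score of each group minus the expert mean, rounded to 2 decimals."""
    baseline = scores["Experts"][metric]
    return {
        g: round(s[metric] - baseline, 2)
        for g, s in scores.items()
        if g != "Experts" and s[metric] is not None
    }


if __name__ == "__main__":
    print(rank_by("correctness"))
    print(gap_to_experts("correctness"))
```

These gaps are plain differences of means. The abstract reports only an overall significance test across groups (p < 0.001), so no pairwise difference should be read as significant on the basis of this sketch alone.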
Conclusions: ChatGPT-3.5 demonstrates exceptional potential in transforming patient education regarding ocular dyskinesia, delivering highly accurate and patient-accessible responses. While GPT-4o and experts offer strong, clinically sound insights, the Gemini models require refinement to meet higher benchmarks. These findings underscore the potential role of artificial intelligence in complementing human expertise, paving the way for innovative and collaborative approaches to patient care and education.
About the journal:
Ophthalmic Plastic and Reconstructive Surgery features original articles and reviews on topics such as ptosis, eyelid reconstruction, orbital diagnosis and surgery, lacrimal problems, and eyelid malposition. Update reports on diagnostic techniques, surgical equipment and instrumentation, and medical therapies are included, as well as detailed analyses of recent research findings and their clinical applications.