Conversational Guide for Cataract Surgery Complications: A Comparative Study of Surgeons versus Large Language Model-Based Chatbot Generated Instructions for Patient Interaction.
{"title":"Conversational Guide for Cataract Surgery Complications: A Comparative Study of Surgeons versus Large Language Model-Based Chatbot Generated Instructions for Patient Interaction.","authors":"Sathishkumar Sundaramoorthy, Vineet Ratra, Vijay Shankar, Ramesh Dorairajan, Quresh Maskati, T Nirmal Fredrick, Aashna Ratra, Dhanashree Ratra","doi":"10.1080/09286586.2025.2484772","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>It is difficult to explain the complications of surgery to patients. Care has to be taken to convey the facts clearly and objectively while expressing concern for their wellbeing. This study compared responses from surgeons with responses from a large language model (LLM)-based chatbot.</p><p><strong>Methods: </strong>We presented 10 common scenarios of cataract surgery complications to seven senior surgeons and a chatbot. The responses were graded by two independent graders for comprehension, readability, and complexity of language using previously validated indices. The responses were analyzed for accuracy and completeness. Honesty and empathy were graded for both groups. Scores were averaged and tabulated.</p><p><strong>Results: </strong>The readability scores for the surgeons (10.64) were significantly less complex than the chatbot (12.54) (<i>p</i> < 0.001). The responses from the surgeons were shorter, whereas the chatbot tended to give more detailed answers. The average accuracy and completeness score of chatbot-generated conversations was 2.36 (0.55), which was similar to the surgeons' score of 2.58 (0.36) (<i>p</i> = 0.164). The responses from the chatbot were more generalized, lacking specific alternative measures. While empathy scores were higher for surgeons (1.81 vs. 1.20, <i>p</i> = 0.041), honesty scores showed no significant difference.</p><p><strong>Conclusions: </strong>The LLM-based chatbot gave a detailed description of the complication but was less specific about the alternative measures. The surgeons had a more in-depth understanding of the situation. The chatbot showed complete honesty but scored less for empathy. With more training using complex real-world scenarios and specialized ophthalmologic data, the chatbots could be used to assist the surgeons in counselling patients for postoperative complications.</p>","PeriodicalId":19607,"journal":{"name":"Ophthalmic epidemiology","volume":" ","pages":"1-8"},"PeriodicalIF":1.7000,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ophthalmic epidemiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1080/09286586.2025.2484772","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: It is difficult to explain the complications of surgery to patients. Care has to be taken to convey the facts clearly and objectively while expressing concern for their wellbeing. This study compared responses from surgeons with responses from a large language model (LLM)-based chatbot.
Methods: We presented 10 common scenarios of cataract surgery complications to seven senior surgeons and a chatbot. The responses were graded by two independent graders for comprehension, readability, and complexity of language using previously validated indices. The responses were analyzed for accuracy and completeness. Honesty and empathy were graded for both groups. Scores were averaged and tabulated.
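The abstract does not name the validated indices used; a common grade-level measure of language complexity is the Flesch-Kincaid Grade Level, and the scores reported below (10.64 vs. 12.54) are consistent with such a scale. Purely as an illustrative sketch under that assumption (not the authors' actual instrument), a minimal Python implementation might look like this:

```python
# Minimal sketch of a grade-level readability index (Flesch-Kincaid Grade
# Level). ASSUMPTION: the study's abstract does not name its indices, so
# this is illustrative only, not the authors' actual grading instrument.
import re

def count_syllables(word: str) -> int:
    """Naive vowel-group syllable counter; adequate for a rough estimate."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """FKGL = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

# Example: a score near 10-12 corresponds to a US grade 10-12 reading
# level, the range reported for both groups in this study.
print(round(flesch_kincaid_grade(
    "The lens capsule can tear during surgery. We place the new lens "
    "in a different position to keep your vision safe."), 2))
```

Lower scores on such an index indicate simpler language, which is why the surgeons' lower score in the Results below corresponds to less complex responses.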
Results: The readability scores indicated that the surgeons' responses (10.64) were significantly less complex than the chatbot's (12.54) (p < 0.001). The responses from the surgeons were shorter, whereas the chatbot tended to give more detailed answers. The average accuracy and completeness score for chatbot-generated conversations was 2.36 (0.55), similar to the surgeons' score of 2.58 (0.36) (p = 0.164). The responses from the chatbot were more generalized, lacking specific alternative measures. While empathy scores were higher for the surgeons (1.81 vs. 1.20, p = 0.041), honesty scores showed no significant difference.
Conclusions: The LLM-based chatbot gave a detailed description of each complication but was less specific about alternative measures. The surgeons showed a more in-depth understanding of the situation. The chatbot was completely honest but scored lower for empathy. With further training on complex real-world scenarios and specialized ophthalmologic data, chatbots could assist surgeons in counselling patients about postoperative complications.
Journal Introduction:
Ophthalmic Epidemiology is dedicated to the publication of original research into eye and vision health in the fields of epidemiology, public health and the prevention of blindness. Ophthalmic Epidemiology publishes editorials, original research reports, systematic reviews and meta-analyses, brief communications and letters to the editor on all subjects related to ophthalmic epidemiology. A broad range of topics is suitable, such as: evaluating the risk of ocular diseases, general and specific study designs, screening program implementation and evaluation, eye health care access, delivery and outcomes, therapeutic efficacy or effectiveness, disease prognosis and quality of life, cost-benefit analysis, biostatistical theory and risk factor analysis. We are looking to expand our engagement with reports of international interest, including those regarding problems affecting developing countries, although reports from all over the world are potentially suitable. Clinical case reports, small case series (insufficient for a cohort analysis) and animal research reports are not appropriate for this journal.