{"title":"评估ChatGPT-4在解决角膜溃疡问题中的新作用:人工智能驱动的见解。","authors":"Bharat Gurnani, Kirandeep Kaur, Prasanth Gireesh, Logesh Balakrishnan, Chitaranjan Mishra","doi":"10.1177/11206721251337290","DOIUrl":null,"url":null,"abstract":"<p><p>PurposeChatGPT-4, a natural language processing-based AI model, is increasingly being applied in healthcare, facilitating education, research, and clinical decision-making support. This study explores ChatGPT-4's capability to deliver accurate and detailed information on corneal ulcers, assessing its application in medical education and clinical decision-making.MethodsThe study engaged ChatGPT-4 with 12 structured questions across different categories related to corneal ulcers. For each inquiry, five unique ChatGPT-4 sessions were initiated, ensuring that the output was not affected by previous queries. A panel of five ophthalmology experts including optometry teaching and research staff assessed the responses using a Likert scale (1-5) (1: very poor; 2: poor; 3: acceptable; 4: good; 5: very good) for quality and accuracy. Median scores were calculated, and inter-rater reliability was assessed to gauge consistency among evaluators.ResultsThe evaluation of ChatGPT-4's responses to corneal ulcer-related questions revealed varied performance across categories. Median scores were consistently high (4.0) for risk factors, etiology, symptoms, treatment, complications, and prognosis, with narrow IQRs (3.0-4.0), reflecting strong agreement. However, classification and investigations scored slightly lower (median 3.0). Signs of corneal ulcers had a median of 2.0, showing significant variability. Of 300 responses, 45% were rated 'good,' 41.7% 'acceptable,' 10% 'poor,' and only 3.3% 'very good,' highlighting areas for improvement. Notably, Evaluator 2 gave 35 'good' ratings, while Evaluators 1 and 3 assigned 10 'poor' ratings each. Inter-evaluator variability, along with gaps in diagnostic precision, underscores the need for refining AI responses. 
Continuous feedback and targeted adjustments could boost ChatGPT-4's utility in delivering high-quality ophthalmic education.ConclusionChatGPT-4 shows promising utility in providing educational content on corneal ulcers. Despite the variance in evaluator ratings, the numerical analysis suggests that with further refinement, ChatGPT-4 could be a valuable tool in ophthalmological education and clinical support.</p>","PeriodicalId":12000,"journal":{"name":"European Journal of Ophthalmology","volume":" ","pages":"1531-1541"},"PeriodicalIF":1.4000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating the novel role of ChatGPT-4 in addressing corneal ulcer queries: An AI-powered insight.\",\"authors\":\"Bharat Gurnani, Kirandeep Kaur, Prasanth Gireesh, Logesh Balakrishnan, Chitaranjan Mishra\",\"doi\":\"10.1177/11206721251337290\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>PurposeChatGPT-4, a natural language processing-based AI model, is increasingly being applied in healthcare, facilitating education, research, and clinical decision-making support. This study explores ChatGPT-4's capability to deliver accurate and detailed information on corneal ulcers, assessing its application in medical education and clinical decision-making.MethodsThe study engaged ChatGPT-4 with 12 structured questions across different categories related to corneal ulcers. For each inquiry, five unique ChatGPT-4 sessions were initiated, ensuring that the output was not affected by previous queries. A panel of five ophthalmology experts including optometry teaching and research staff assessed the responses using a Likert scale (1-5) (1: very poor; 2: poor; 3: acceptable; 4: good; 5: very good) for quality and accuracy. 
Median scores were calculated, and inter-rater reliability was assessed to gauge consistency among evaluators.ResultsThe evaluation of ChatGPT-4's responses to corneal ulcer-related questions revealed varied performance across categories. Median scores were consistently high (4.0) for risk factors, etiology, symptoms, treatment, complications, and prognosis, with narrow IQRs (3.0-4.0), reflecting strong agreement. However, classification and investigations scored slightly lower (median 3.0). Signs of corneal ulcers had a median of 2.0, showing significant variability. Of 300 responses, 45% were rated 'good,' 41.7% 'acceptable,' 10% 'poor,' and only 3.3% 'very good,' highlighting areas for improvement. Notably, Evaluator 2 gave 35 'good' ratings, while Evaluators 1 and 3 assigned 10 'poor' ratings each. Inter-evaluator variability, along with gaps in diagnostic precision, underscores the need for refining AI responses. Continuous feedback and targeted adjustments could boost ChatGPT-4's utility in delivering high-quality ophthalmic education.ConclusionChatGPT-4 shows promising utility in providing educational content on corneal ulcers. 
Despite the variance in evaluator ratings, the numerical analysis suggests that with further refinement, ChatGPT-4 could be a valuable tool in ophthalmological education and clinical support.</p>\",\"PeriodicalId\":12000,\"journal\":{\"name\":\"European Journal of Ophthalmology\",\"volume\":\" \",\"pages\":\"1531-1541\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2025-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"European Journal of Ophthalmology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1177/11206721251337290\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/4/28 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q3\",\"JCRName\":\"OPHTHALMOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Ophthalmology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1177/11206721251337290","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/4/28 0:00:00","PubModel":"Epub","JCR":"Q3","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
Evaluating the novel role of ChatGPT-4 in addressing corneal ulcer queries: An AI-powered insight.
Purpose: ChatGPT-4, an AI model based on natural language processing, is increasingly applied in healthcare to support education, research, and clinical decision-making. This study explores ChatGPT-4's capability to deliver accurate and detailed information on corneal ulcers and assesses its application in medical education and clinical decision-making.
Methods: ChatGPT-4 was queried with 12 structured questions spanning different categories related to corneal ulcers. For each question, five separate ChatGPT-4 sessions were initiated so that no output was influenced by previous queries. A panel of five ophthalmology experts, including optometry teaching and research staff, rated each response for quality and accuracy on a Likert scale of 1-5 (1: very poor; 2: poor; 3: acceptable; 4: good; 5: very good). Median scores were calculated, and inter-rater reliability was assessed to gauge consistency among evaluators.
Results: Performance varied across categories. Median scores were consistently high (4.0) for risk factors, etiology, symptoms, treatment, complications, and prognosis, with narrow interquartile ranges (IQRs, 3.0-4.0), reflecting strong agreement. Classification and investigations scored slightly lower (median 3.0), and signs of corneal ulcers had a median of 2.0 with substantial variability. Of the 300 ratings (12 questions × 5 sessions × 5 evaluators), 45% were 'good,' 41.7% 'acceptable,' 10% 'poor,' and only 3.3% 'very good,' highlighting areas for improvement. Notably, Evaluator 2 gave 35 'good' ratings, while Evaluators 1 and 3 each assigned 10 'poor' ratings. Inter-evaluator variability, together with gaps in diagnostic precision, underscores the need to refine AI responses; continuous feedback and targeted adjustments could boost ChatGPT-4's utility in delivering high-quality ophthalmic education.
Conclusion: ChatGPT-4 shows promising utility in providing educational content on corneal ulcers. Despite the variance in evaluator ratings, the numerical analysis suggests that, with further refinement, ChatGPT-4 could become a valuable tool in ophthalmological education and clinical support.
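The analysis described in the methods (per-category medians and IQRs of Likert ratings, plus an inter-rater reliability check) can be sketched in Python. The ratings and category scores below are invented for illustration; the study's raw data are not included in this record, and the abstract does not name the reliability statistic used, so Fleiss' kappa is shown here only as one plausible choice.

```python
from statistics import median, quantiles

def summarize(scores):
    """Median and inclusive-quartile IQR bounds for a list of Likert scores (1-5)."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    return median(scores), q1, q3

def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories matrix of rating counts.

    counts[i][j] = number of raters who assigned subject i to category j;
    every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    total = n_subjects * n_raters
    # Overall proportion of ratings falling in each category.
    p = [sum(row[j] for row in counts) / total for j in range(n_categories)]
    # Per-subject observed agreement among raters.
    P = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
         for row in counts]
    P_bar = sum(P) / n_subjects          # mean observed agreement
    P_e = sum(pj * pj for pj in p)       # chance agreement
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical ratings: five evaluators' scores for three of the categories.
ratings = {
    "risk factors": [3, 4, 4, 4, 5],
    "classification": [2, 3, 3, 3, 4],
    "signs": [1, 2, 2, 2, 3],
}

for category, scores in ratings.items():
    med, q1, q3 = summarize(scores)
    print(f"{category}: median={med}, IQR=({q1}-{q3})")

# Two subjects rated by five raters across two categories; perfect
# agreement on both subjects yields kappa = 1.0.
print(fleiss_kappa([[5, 0], [0, 5]]))
```

With real data, the same per-category loop would reproduce the kind of summary the abstract reports (e.g. a high median with a narrow IQR signalling strong evaluator agreement).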
Journal introduction:
The European Journal of Ophthalmology was founded in 1991 and is published in print bi-monthly. It publishes only peer-reviewed original research reporting clinical observations and laboratory investigations with clinical relevance, focusing on new diagnostic and surgical techniques, instrument and therapy updates, results of clinical trials, and research findings.