Evaluating the Accuracy and Readability of ChatGPT-4o's Responses to Patient-Based Questions about Keratoconus

Impact Factor 1.7 · CAS Tier 4 (Medicine) · JCR Q3 (Ophthalmology)
Ali Safa Balci, Semih Çakmak
{"title":"评估chatgpt - 40对基于患者的圆锥角膜问题的回答的准确性和可读性。","authors":"Ali Safa Balci, Semih Çakmak","doi":"10.1080/09286586.2025.2484760","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>This study aimed to evaluate the accuracy and readability of responses generated by ChatGPT-4o, an advanced large language model, to frequently asked patient-centered questions about keratoconus.</p><p><strong>Methods: </strong>A cross-sectional, observational study was conducted using ChatGPT-4o to answer 30 potential questions that could be asked by patients with keratoconus. The accuracy of the responses was evaluated by two board-certified ophthalmologists and scored on a scale of 1 to 5. Readability was assessed using the Simple Measure of Gobbledygook (SMOG), Flesch-Kincaid Grade Level (FKGL), and Flesch Reading Ease (FRE) scores. Descriptive, treatment-related, and follow-up-related questions were analyzed, and statistical comparisons between these categories were performed.</p><p><strong>Results: </strong>The mean accuracy score for the responses was 4.48 ± 0.57 on a 5-point Likert scale. The interrater reliability, with an intraclass correlation coefficient of 0.769, indicated a strong level of agreement. Readability scores revealed a SMOG score of 15.49 ± 1.74, an FKGL score of 14.95 ± 1.95, and an FRE score of 27.41 ± 9.71, indicating that a high level of education is required to comprehend the responses. There was no significant difference in accuracy among the different question categories (<i>p</i> = 0.161), but readability varied significantly, with treatment-related questions being the easiest to understand.</p><p><strong>Conclusion: </strong>ChatGPT-4o provides highly accurate responses to patient-centered questions about keratoconus, though the complexity of its language may limit accessibility for the general population. Further development is needed to enhance the readability of AI-generated medical content.</p>","PeriodicalId":19607,"journal":{"name":"Ophthalmic epidemiology","volume":" ","pages":"1-6"},"PeriodicalIF":1.7000,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating the Accuracy and Readability of ChatGPT-4o's Responses to Patient-Based Questions about Keratoconus.\",\"authors\":\"Ali Safa Balci, Semih Çakmak\",\"doi\":\"10.1080/09286586.2025.2484760\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>This study aimed to evaluate the accuracy and readability of responses generated by ChatGPT-4o, an advanced large language model, to frequently asked patient-centered questions about keratoconus.</p><p><strong>Methods: </strong>A cross-sectional, observational study was conducted using ChatGPT-4o to answer 30 potential questions that could be asked by patients with keratoconus. The accuracy of the responses was evaluated by two board-certified ophthalmologists and scored on a scale of 1 to 5. Readability was assessed using the Simple Measure of Gobbledygook (SMOG), Flesch-Kincaid Grade Level (FKGL), and Flesch Reading Ease (FRE) scores. Descriptive, treatment-related, and follow-up-related questions were analyzed, and statistical comparisons between these categories were performed.</p><p><strong>Results: </strong>The mean accuracy score for the responses was 4.48 ± 0.57 on a 5-point Likert scale. The interrater reliability, with an intraclass correlation coefficient of 0.769, indicated a strong level of agreement. 
Readability scores revealed a SMOG score of 15.49 ± 1.74, an FKGL score of 14.95 ± 1.95, and an FRE score of 27.41 ± 9.71, indicating that a high level of education is required to comprehend the responses. There was no significant difference in accuracy among the different question categories (<i>p</i> = 0.161), but readability varied significantly, with treatment-related questions being the easiest to understand.</p><p><strong>Conclusion: </strong>ChatGPT-4o provides highly accurate responses to patient-centered questions about keratoconus, though the complexity of its language may limit accessibility for the general population. Further development is needed to enhance the readability of AI-generated medical content.</p>\",\"PeriodicalId\":19607,\"journal\":{\"name\":\"Ophthalmic epidemiology\",\"volume\":\" \",\"pages\":\"1-6\"},\"PeriodicalIF\":1.7000,\"publicationDate\":\"2025-03-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Ophthalmic epidemiology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1080/09286586.2025.2484760\",\"RegionNum\":4,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"OPHTHALMOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ophthalmic epidemiology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1080/09286586.2025.2484760","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
Citations: 0

Abstract


Purpose: This study aimed to evaluate the accuracy and readability of responses generated by ChatGPT-4o, an advanced large language model, to frequently asked patient-centered questions about keratoconus.

Methods: A cross-sectional, observational study was conducted using ChatGPT-4o to answer 30 potential questions that could be asked by patients with keratoconus. The accuracy of the responses was evaluated by two board-certified ophthalmologists and scored on a scale of 1 to 5. Readability was assessed using the Simple Measure of Gobbledygook (SMOG), Flesch-Kincaid Grade Level (FKGL), and Flesch Reading Ease (FRE) scores. Descriptive, treatment-related, and follow-up-related questions were analyzed, and statistical comparisons between these categories were performed.
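The abstract does not say how these three indices were computed. As context, each has a standard published formula (e.g., FRE = 206.835 − 1.015 × words-per-sentence − 84.6 × syllables-per-word), and a minimal Python sketch using the open-source textstat package (an assumed tool, not necessarily the authors' pipeline) reproduces all three:

```python
# Minimal sketch of the three readability indices named above, using the
# open-source `textstat` package (pip install textstat). The paper does not
# state which implementation the authors used, so this is illustrative only.
import textstat

sample_response = (
    "Keratoconus is a progressive eye condition in which the cornea thins "
    "and bulges into a cone-like shape, causing distorted vision."
)

print("SMOG:", textstat.smog_index(sample_response))           # grade level; higher = harder
print("FKGL:", textstat.flesch_kincaid_grade(sample_response)) # U.S. school grade level
print("FRE:", textstat.flesch_reading_ease(sample_response))   # 0-100 scale; lower = harder
```

On these scales, the reported FKGL of roughly 15 and FRE of roughly 27 correspond to college-level, "very difficult" text, well above the sixth-to-eighth-grade level commonly recommended for patient-education materials.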

Results: The mean accuracy score for the responses was 4.48 ± 0.57 on a 5-point Likert scale. The interrater reliability, with an intraclass correlation coefficient of 0.769, indicated a strong level of agreement. Readability scores revealed a SMOG score of 15.49 ± 1.74, an FKGL score of 14.95 ± 1.95, and an FRE score of 27.41 ± 9.71, indicating that a high level of education is required to comprehend the responses. There was no significant difference in accuracy among the different question categories (p = 0.161), but readability varied significantly, with treatment-related questions being the easiest to understand.
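The abstract does not specify which ICC model was used. For readers who want to reproduce this kind of two-rater agreement analysis, a sketch using the pingouin library with placeholder scores (illustrative data, not the study's ratings) looks like this:

```python
# Sketch of a two-rater ICC computation with the `pingouin` library
# (pip install pingouin pandas). The scores below are placeholders, not
# the study's data; the paper reports ICC = 0.769 on its real ratings.
import pandas as pd
import pingouin as pg

n = 30  # 30 keratoconus questions, each scored 1-5 by two ophthalmologists
ratings = pd.DataFrame({
    "question": list(range(n)) * 2,
    "rater":    ["A"] * n + ["B"] * n,
    "score":    [5, 4, 5, 4, 5, 3] * 5 + [5, 4, 4, 4, 5, 4] * 5,
})

icc = pg.intraclass_corr(data=ratings, targets="question",
                         raters="rater", ratings="score")
print(icc[["Type", "Description", "ICC"]])  # ICC2/ICC2k are the usual two-way random-effects choices
```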

Conclusion: ChatGPT-4o provides highly accurate responses to patient-centered questions about keratoconus, though the complexity of its language may limit accessibility for the general population. Further development is needed to enhance the readability of AI-generated medical content.

Source journal: Ophthalmic Epidemiology (Medicine – Ophthalmology)
CiteScore: 3.70
Self-citation rate: 5.60%
Articles published: 61
Review time: 6-12 weeks
Journal description: Ophthalmic Epidemiology is dedicated to the publication of original research into eye and vision health in the fields of epidemiology, public health and the prevention of blindness. Ophthalmic Epidemiology publishes editorials, original research reports, systematic reviews and meta-analysis articles, brief communications and letters to the editor on all subjects related to ophthalmic epidemiology. A broad range of topics is suitable, such as: evaluating the risk of ocular diseases, general and specific study designs, screening program implementation and evaluation, eye health care access, delivery and outcomes, therapeutic efficacy or effectiveness, disease prognosis and quality of life, cost-benefit analysis, biostatistical theory and risk factor analysis. We are looking to expand our engagement with reports of international interest, including those regarding problems affecting developing countries, although reports from all over the world are potentially suitable. Clinical case reports, small case series (insufficient for a cohort analysis) and animal research reports are not appropriate for this journal.