Title: Reliability of AI-generated responses on frequently-posed questions by patients with chronic kidney disease
Authors: Emi Furukawa, Tsuyoshi Okuhara, Hiroko Okada, Yuriko Nishiie, Takahiro Kiuchi
DOI: 10.1177/14604582251381996
Journal: Health Informatics Journal, vol. 31, no. 3, article 14604582251381996
Published: 2025-07-01 (Epub 2025-09-23)
Impact factor: 2.3 (JCR Q2, Health Care Sciences & Services)
Citations: 0
Abstract
Background: AI tools are becoming primary information sources for patients with chronic kidney disease (CKD). However, because AI sometimes generates false or inaccurate information, the reliability of that information must be assessed.
Methods: This study assessed AI-generated responses to frequently asked questions on CKD. We entered Japanese prompts containing top CKD-related keywords into ChatGPT, Copilot, and Gemini. The Quality Analysis of Medical Artificial Intelligence (QAMAI) tool was used to evaluate the reliability of the information.
Results: We included 207 AI responses from 23 prompts. The AI tools generated reliable information, with a median QAMAI score of 23 (interquartile range: 7) out of 30. However, information accuracy and resource availability varied (median (IQR): ChatGPT vs. Copilot vs. Gemini = 18 (2) vs. 25 (3) vs. 24 (5), p < 0.01). Among the AI tools, ChatGPT provided the least accurate information and did not provide any resources.
Conclusion: The quality of AI responses on CKD was generally acceptable. While most of the information provided was reliable and comprehensive, some information lacked accuracy and references.
Journal introduction:
Health Informatics Journal is an international peer-reviewed journal. All papers submitted to Health Informatics Journal are subject to peer review by members of a carefully appointed editorial board. The journal operates a conventional single-blind reviewing policy in which the reviewer’s name is always concealed from the submitting author.