Peter Stapleton, Jordan Santucci, Monica Thet, Nathan Lawrentschuk, Lachlan Dodds, Thomas Cundy, Niranjan Sathianathen
{"title":"Quality of information on hypospadias from artificial intelligence chatbots: How safe is AI for patient and family information?","authors":"Peter Stapleton, Jordan Santucci, Monica Thet, Nathan Lawrentschuk, Lachlan Dodds, Thomas Cundy, Niranjan Sathianathen","doi":"10.1016/j.jpurol.2025.08.029","DOIUrl":null,"url":null,"abstract":"<p><strong>Introduction: </strong>Hypospadias is the most prevalent congenital anomaly of the penis, with an estimated incidence of 0.4-8.2 cases per 1000 live births (1). However, most of the parents and families of those with hypospadias experience anxiety and uncertainty regarding the information about hypospadias (2, 3). Leading to many families conduct their own independent internet search for information to better understand a diagnosis. The reliability and quality of this information for patients and families has not previously been formally assessed. The objective of this study is to assess the ability of AI chatbots to provide accurate and readable information to patients and families on hypospadias.</p><p><strong>Methods: </strong>AI chatbot inputs were sourced from google trends and healthcare organisations. Google trends was used to identify the top 10 google search terms relating to 'Hypospadias' based on search volume. Royal Children Hospital in Melbourne (RCH) and the Urology Care Foundation American Urology Association - Hypospadias (AUA) headers were used as healthcare related hypospadias inputs4 different AI chatbot programs ChatGPT version 4.0, Perplexity, Chat Sonic, and Bing AI. Three urology consultants blinded to the AI chatbots assessed responses for accuracy and safety and a further two trained investigators, blinded to AI chatbot type and each other's evaluation scores, assessed AI chatbot responses using various evaluation instruments including PEMAT, DISCERN, misinfomration and Flesch-Kincaid readability formula as well as word count and citation.</p><p><strong>Results: </strong>As demonstrated in the 4 AI chatbots assessed contained high quality health consumer information median DISCERN 4 (IQR 3-5). The degree of misinformation was low overall and across all AI chatbot responses, with a median of 1 (IQR 1-1). The PEMAT Understandability scores was high overall with a median of 91.7 % (IQR 80-92.3). However, all AIs performed poorly in the actionability of their responses with an overall median of 40 % (20-80). The median word count per AI chatbot response was 213 (IQR 141-273).</p><p><strong>Conclusion: </strong>AI chatbots provided understandable, high level and accurate health information relating to hypospadias. However, the information was delivered at a reading level which may limit its use in a paediatric or general public setting, and only one chatbot gave clearly actionable interventions or direction. 
Overall, AI chatbots are a clinically safe and appropriate adjunct to face to face consultation for healthcare information delivery and will likely take a more prominent domain as technology advances.</p>","PeriodicalId":16747,"journal":{"name":"Journal of Pediatric Urology","volume":" ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Pediatric Urology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1016/j.jpurol.2025.08.029","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"PEDIATRICS","Score":null,"Total":0}
Abstract
Introduction: Hypospadias is the most prevalent congenital anomaly of the penis, with an estimated incidence of 0.4-8.2 cases per 1000 live births (1). However, most parents and families of children with hypospadias experience anxiety and uncertainty regarding information about the condition (2, 3), leading many families to conduct their own independent internet searches to better understand the diagnosis. The reliability and quality of this information for patients and families has not previously been formally assessed. The objective of this study is to assess the ability of AI chatbots to provide accurate and readable information on hypospadias to patients and families.
Methods: AI chatbot inputs were sourced from Google Trends and healthcare organisations. Google Trends was used to identify the top 10 Google search terms relating to 'hypospadias' by search volume. Headers from the Royal Children's Hospital, Melbourne (RCH) and the Urology Care Foundation of the American Urological Association (AUA) hypospadias resources were used as healthcare-related inputs. These inputs were submitted to four different AI chatbot programs: ChatGPT version 4.0, Perplexity, Chat Sonic, and Bing AI. Three urology consultants, blinded to the AI chatbots, assessed responses for accuracy and safety, and a further two trained investigators, blinded to AI chatbot type and to each other's evaluation scores, assessed the responses using several evaluation instruments, including PEMAT, DISCERN, a misinformation score, and the Flesch-Kincaid readability formula, as well as word count and citations.
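The methods name the Flesch-Kincaid readability formula without detailing it. The short Python sketch below shows the standard Flesch-Kincaid Grade Level calculation applied to an illustrative sentence; the syllable counter is a crude vowel-group heuristic and the sample text is invented for illustration, so neither reflects how the study's scores were actually computed.

import re

def count_syllables(word):
    # Rough heuristic: count groups of consecutive vowels; dedicated readability tools use dictionaries.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    # Flesch-Kincaid Grade Level = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / len(sentences)) + 11.8 * (syllables / len(words)) - 15.59

# Illustrative text only (not a chatbot response from the study)
sample = ("Hypospadias is a condition where the urethral opening sits on the "
          "underside of the penis. Surgery is usually performed in infancy.")
print(round(flesch_kincaid_grade(sample), 1))

Higher grade-level outputs correspond to text that requires more years of schooling to read comfortably, which is the sense in which a response can be "delivered at a reading level" that limits its usefulness for families.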
Results: The four AI chatbots assessed provided high-quality health consumer information, with a median DISCERN score of 4 (IQR 3-5). The degree of misinformation was low overall and across all AI chatbot responses, with a median score of 1 (IQR 1-1). The PEMAT Understandability score was high overall, with a median of 91.7 % (IQR 80-92.3). However, all chatbots performed poorly on the actionability of their responses, with an overall median of 40 % (IQR 20-80). The median word count per AI chatbot response was 213 (IQR 141-273).
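As a rough illustration of how the percentages and median (IQR) summaries above could be derived, the sketch below scores a PEMAT domain as the percentage of applicable items rated "Agree" and summarises a set of per-response scores; the item ratings and score values are hypothetical placeholders, not data from the study.

from statistics import median, quantiles

def pemat_score(ratings):
    # ratings: 1 = Agree, 0 = Disagree, None = Not applicable
    applicable = [r for r in ratings if r is not None]
    return 100 * sum(applicable) / len(applicable)

# Hypothetical item ratings for one chatbot response (placeholders only)
understandability_items = [1, 1, 1, 0, 1, None, 1, 1, 1, 0, 1, 1]
print(round(pemat_score(understandability_items), 1))

# Hypothetical per-response scores summarised as median (IQR)
scores = [91.7, 80.0, 92.3, 40.0, 85.0, 90.0]
q1, _, q3 = quantiles(scores, n=4)
print(f"median {median(scores):.1f} (IQR {q1:.1f}-{q3:.1f})")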
Conclusion: AI chatbots provided understandable, high-quality and accurate health information relating to hypospadias. However, the information was delivered at a reading level that may limit its use in a paediatric or general public setting, and only one chatbot gave clearly actionable interventions or direction. Overall, AI chatbots are a clinically safe and appropriate adjunct to face-to-face consultation for healthcare information delivery and will likely take on a more prominent role as technology advances.
Journal Introduction:
The Journal of Pediatric Urology publishes submitted research and clinical articles relating to Pediatric Urology which have been accepted after adequate peer review.
It publishes regular articles that have been submitted after invitation, that cover the curriculum of Pediatric Urology, and that enable trainee surgeons to attain theoretical competence in the sub-specialty.
It publishes regular reviews of pediatric urological articles appearing in other journals.
It publishes invited review articles by recognised experts on modern or controversial aspects of the sub-specialty.
It enables any affiliated society to advertise society events or information in the journal without charge and will publish abstracts of papers to be read at society meetings.