Evaluation of ChatGPT's performance in providing treatment recommendations for pediatric diseases

Pediatric Discovery, Pub Date: 2023-11-20, DOI: 10.1002/pdi3.42

Qiuhong Wei, Yanqin Wang, Zhengxiong Yao, Ying Cui, Bo Wei, Tingyu Li, Ximing Xu


Citations: 0

Abstract

With the advance of artificial intelligence technology, large language models such as ChatGPT are drawing substantial interest in the healthcare field. A growing body of research has evaluated ChatGPT's performance in various medical departments, yet its potential in pediatrics remains under‐studied. In this study, we presented ChatGPT with a total of 4160 clinical consultation questions in both English and Chinese, covering 104 pediatric conditions, and repeated each question independently 10 times to assess the accuracy of its responses in pediatric disease treatment recommendations. ChatGPT achieved an overall accuracy of 82.2% (95% CI: 81.0%–83.4%), with superior performance in addressing common diseases (84.4%, 95% CI: 83.2%–85.7%), offering general treatment advice (83.5%, 95% CI: 81.9%–85.1%), and responding in English (93.0%, 95% CI: 91.9%–94.1%). However, it was prone to errors in disease definitions, medications, and surgical treatment. In conclusion, while ChatGPT shows promise in pediatric treatment recommendations with notable accuracy, cautious optimism is warranted regarding the potential application of large language models in enhancing patient care.
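The abstract does not state how the confidence intervals were computed. As a rough sketch only, assuming a normal-approximation (Wald) interval and assuming that all 4160 questions form the denominator for overall accuracy, the reported overall interval can be approximately reproduced. The function name `wald_interval` is illustrative and is not taken from the paper.

```python
import math


def wald_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) confidence interval for a proportion.

    p_hat: observed proportion (e.g. accuracy), n: number of trials,
    z: critical value (1.96 corresponds to a 95% interval).
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Assumption: n = 4160 is the denominator behind the 82.2% overall accuracy.
low, high = wald_interval(0.822, 4160)
print(f"{low:.1%} to {high:.1%}")  # prints "81.0% to 83.4%"
```

Under these assumptions the result agrees with the reported 81.0%–83.4%. The authors may have used a different interval method or a different denominator, so this is a consistency check rather than a reconstruction of their analysis.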

