Leveraging large language models to inform paediatric chronic condition care: a cross-sectional study
Syed Furrukh Jamil, Nada N Alshathri, Seham S Alsalamah, Nura A Almansour, Faris S Alsalamah, Tahir K Hameed, Jubran T Alqanatish
BMJ Paediatrics Open, volume 9, issue 1. Published 2025-08-14. DOI: 10.1136/bmjpo-2025-003742 (https://doi.org/10.1136/bmjpo-2025-003742)
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12352204/pdf/
Abstract
This study assessed how ChatGPT 3.5, ChatGPT 4.0 and Google Gemini perform in providing educational content about coeliac disease and type 1 diabetes mellitus. We analysed 76 frequently asked questions for accuracy, comprehensiveness, readability and consistency. The models delivered highly accurate and comprehensive responses across the board. While ChatGPT 4.0 offered the most readable content, all models struggled with overall readability. Each model maintained consistent performance throughout testing. These results indicate that large language models show promise as supplementary tools for patient education in chronic paediatric conditions, though improvements in readability are needed to enhance accessibility.