Assessing the quality and readability of online patient information: ENT UK patient information e-leaflets vs responses by a Generative Artificial Intelligence.
Eamon Shamil, Tsz Ki Ko, Ka Siu Fan, James Schuster-Bruce, Mustafa Jaafar, Sadie Khwaja, Nicholas Eynon-Lewis, Alwyn Ray D'Souza, Peter Andrews
DOI: 10.1055/a-2413-3675 (https://doi.org/10.1055/a-2413-3675). Published: 2024-09-11.
Abstract
BACKGROUND
The evolution of artificial intelligence has introduced new ways to disseminate health information, including natural language processing models like ChatGPT. However, the quality and readability of such digitally-generated information remain understudied. This study is the first to compare the quality and readability of digitally-generated health information against leaflets produced by professionals.
METHODOLOGY
Five ENT UK patient information leaflets and their corresponding ChatGPT responses were extracted from the Internet. Assessors with varying degrees of medical knowledge evaluated the content using the Ensuring Quality Information for Patients (EQIP) tool and readability tools including the Flesch-Kincaid Grade Level (FKGL). Statistical analysis was performed to identify differences between leaflets, assessors, and sources of information.
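As an illustration of the readability measure named above, the following is a minimal sketch of the standard FKGL formula, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The abstract does not state which software the authors used, so the function names `count_syllables` and `fkgl` are hypothetical, and the vowel-group syllable counter is a rough assumption that will differ from dictionary-based tools.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables by counting vowel groups (heuristic assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing silent 'e' as non-syllabic (e.g. "nose"),
    # but keep it for "-le" endings (e.g. "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level of a plain-text passage."""
    # Sentences are split on runs of ., ! or ? (a simplification).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


if __name__ == "__main__":
    simple = "The cat sat. The dog ran."
    dense = "Tonsillectomy necessitates postoperative observation."
    print(round(fkgl(simple), 2), round(fkgl(dense), 2))
```

A higher FKGL indicates that more years of schooling are needed to read the text comfortably, so longer sentences and longer words both raise the score.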
RESULTS
ENT UK leaflets were of moderate quality, scoring a median EQIP of 23. Statistically significant differences in overall EQIP score were identified between ENT UK leaflets, but ChatGPT responses were of uniform quality. Non-specialist doctors gave the highest EQIP scores, while medical students gave the lowest. The mean readability of ENT UK leaflets was higher than that of ChatGPT responses. The information metrics of ENT UK leaflets were moderate and varied between topics. Equivalent ChatGPT information provided comparable content quality, but with reduced readability.
CONCLUSIONS
ChatGPT patient information and professionally-produced leaflets had comparable content, but LLM content required a higher reading age. With the increasing use of online health resources, this study highlights the need for a balanced approach that optimises both the quality and readability of patient education materials.