Adrian H Y Siu, Damien P Gibson, Chris Chiu, Allan Kwok, Matt Irwin, Adam Christie, Cherry E Koh, Anil Keshava, Mifanwy Reece, Michael Suen, Matthew J F X Rickard
{"title":"ChatGPT as a patient education tool in colorectal cancer-An in-depth assessment of efficacy, quality and readability.","authors":"Adrian H Y Siu, Damien P Gibson, Chris Chiu, Allan Kwok, Matt Irwin, Adam Christie, Cherry E Koh, Anil Keshava, Mifanwy Reece, Michael Suen, Matthew J F X Rickard","doi":"10.1111/codi.17267","DOIUrl":null,"url":null,"abstract":"<p><strong>Aim: </strong>Artificial intelligence (AI) chatbots such as Chat Generative Pretrained Transformer-4 (ChatGPT-4) have made significant strides in generating human-like responses. Trained on an extensive corpus of medical literature, ChatGPT-4 has the potential to augment patient education materials. These chatbots may be beneficial to populations considering a diagnosis of colorectal cancer (CRC). However, the accuracy and quality of patient education materials are crucial for informed decision-making. Given workforce demands impacting holistic care, AI chatbots can bridge gaps in CRC information, reaching wider demographics and crossing language barriers. However, rigorous evaluation is essential to ensure accuracy, quality and readability. Therefore, this study aims to evaluate the efficacy, quality and readability of answers generated by ChatGPT-4 on CRC, utilizing patient-style question prompts.</p><p><strong>Method: </strong>To evaluate ChatGPT-4, eight CRC-related questions were derived using peer-reviewed literature and Google Trends. Eight colorectal surgeons evaluated AI responses for accuracy, safety, appropriateness, actionability and effectiveness. Quality was assessed using validated tools: the Patient Education Materials Assessment Tool (PEMAT-AI), modified DISCERN (DISCERN-AI) and Global Quality Score (GQS). 
A number of readability assessments were measured including Flesch Reading Ease (FRE) and the Gunning Fog Index (GFI).</p><p><strong>Results: </strong>The responses were generally accurate (median 4.00), safe (4.25), appropriate (4.00), actionable (4.00) and effective (4.00). Quality assessments rated PEMAT-AI as 'very good' (71.43), DISCERN-AI as 'fair' (12.00) and GQS as 'high' (4.00). Readability scores indicated difficulty (FRE 47.00, GFI 12.40), suggesting a higher educational level was required.</p><p><strong>Conclusion: </strong>This study concludes that ChatGPT-4 is capable of providing safe but nonspecific medical information, suggesting its potential as a patient education aid. However, enhancements in readability through contextual prompting and fine-tuning techniques are required before considering implementation into clinical practice.</p>","PeriodicalId":10512,"journal":{"name":"Colorectal Disease","volume":" ","pages":"e17267"},"PeriodicalIF":2.9000,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Colorectal Disease","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1111/codi.17267","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/12/17 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"GASTROENTEROLOGY & HEPATOLOGY","Score":null,"Total":0}
Abstract
Aim: Artificial intelligence (AI) chatbots such as Chat Generative Pretrained Transformer-4 (ChatGPT-4) have made significant strides in generating human-like responses. Trained on an extensive corpus that includes medical literature, ChatGPT-4 has the potential to augment patient education materials, and such chatbots may benefit patients facing a diagnosis of colorectal cancer (CRC). The accuracy and quality of patient education materials are, however, crucial for informed decision-making. With workforce demands limiting the time available for holistic care, AI chatbots can bridge gaps in CRC information, reaching wider demographics and crossing language barriers, but rigorous evaluation is essential to ensure accuracy, quality and readability. This study therefore aims to evaluate the efficacy, quality and readability of answers generated by ChatGPT-4 on CRC, using patient-style question prompts.
Method: To evaluate ChatGPT-4, eight CRC-related questions were derived from peer-reviewed literature and Google Trends. Eight colorectal surgeons rated the AI responses for accuracy, safety, appropriateness, actionability and effectiveness. Quality was assessed using validated tools: the Patient Education Materials Assessment Tool (PEMAT-AI), modified DISCERN (DISCERN-AI) and the Global Quality Score (GQS). Readability was measured with several instruments, including the Flesch Reading Ease (FRE) and the Gunning Fog Index (GFI).
Results: The responses were generally accurate (median 4.00), safe (4.25), appropriate (4.00), actionable (4.00) and effective (4.00). Quality assessments rated the responses 'very good' on PEMAT-AI (71.43), 'fair' on DISCERN-AI (12.00) and 'high' on the GQS (4.00). Readability scores indicated difficult text (FRE 47.00, GFI 12.40), suggesting that a relatively high level of education is needed to understand the responses.
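The FRE and GFI values reported above follow the standard published formulas: FRE = 206.835 - 1.015 × (words/sentences) - 84.6 × (syllables/words), and GFI = 0.4 × [(words/sentences) + 100 × (complex words/words)], where a complex word has three or more syllables. A minimal sketch of both metrics follows; the vowel-group syllable counter is a naive assumption for illustration, whereas validated readability tools use dictionary-based syllabification.

```python
import re

def count_syllables(word: str) -> int:
    # Naive heuristic (assumption): one syllable per run of consecutive vowels.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    # FRE = 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) \
                   - 84.6 * (syllables / len(words))

def gunning_fog(text: str) -> float:
    # GFI = 0.4 * [(words/sentences) + 100*(complex words/words)]
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))
```

On the standard scales, an FRE of 30-50 reads as 'difficult' and the GFI approximates the years of schooling needed, so the reported FRE of 47.00 and GFI of 12.40 both point past a typical high-school reading level.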
Conclusion: This study concludes that ChatGPT-4 can provide safe but nonspecific medical information, suggesting its potential as a patient education aid. However, readability must be improved through contextual prompting and fine-tuning before implementation in clinical practice can be considered.
About the journal:
Diseases of the colon and rectum are common and offer a number of exciting challenges. Clinical, diagnostic and basic science research is expanding rapidly. There is increasing demand from purchasers of health care and patients for clinicians to keep abreast of the latest research and developments, and to translate these into routine practice. Technological advances in diagnosis, surgical technique, new pharmaceuticals, molecular genetics and other basic sciences have transformed many aspects of how these diseases are managed. Such progress will accelerate.
Colorectal Disease offers a real benefit to subscribers and authors. It is first and foremost a vehicle for publishing original research relating to the demanding, rapidly expanding field of colorectal diseases.
Essential for surgeons, pathologists, oncologists, gastroenterologists and health professionals caring for patients with a disease of the lower GI tract, Colorectal Disease furthers education and inter-professional development by including regular review articles and discussions of current controversies.
Note that the journal does not usually accept paediatric surgical papers.