Min Hyeong Jo, Min-Jun Kim, Heung-Kwon Oh, Mi Jeong Choi, Hye-Rim Shin, Tae-Gyun Lee, Hong-Min Ahn, Duck-Woo Kim, Sung-Bum Kang
DOI: 10.1007/s00384-024-04670-3 (https://doi.org/10.1007/s00384-024-04670-3). Published 2024-06-20. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11189990/pdf/
Communicative competence of generative artificial intelligence in responding to patient queries about colorectal cancer surgery.
Purpose: To examine the ability of generative artificial intelligence (GAI) to answer patients' questions regarding colorectal cancer (CRC).
Methods: Ten clinically relevant questions about CRC were selected from top-rated hospitals' websites and patient surveys and presented to three GAI tools (Chatbot Generative Pre-Trained Transformer [GPT-4], Google Bard, and CLOVA X). Their responses were compared with answers from the CRC information book. Responses were evaluated by two groups of five evaluators each: healthcare professionals (HCPs) and patients. Each response was scored on a 1-5 Likert scale for each of four evaluation criteria (maximum score, 20 points/question).
Results: In an analysis including only HCPs, the information book scored 11.8 ± 1.2, GPT-4 scored 13.5 ± 1.1, Google Bard scored 11.5 ± 0.7, and CLOVA X scored 12.2 ± 1.4 (P = 0.001). The score of GPT-4 was significantly higher than those of the information book (P = 0.020) and Google Bard (P = 0.001). In an analysis including only patients, the information book scored 14.1 ± 1.4, GPT-4 scored 15.2 ± 1.8, Google Bard scored 15.5 ± 1.8, and CLOVA X scored 14.4 ± 1.8, without significant differences (P = 0.234). When both groups of evaluators were included, the information book scored 13.0 ± 0.9, GPT-4 scored 14.4 ± 1.2, Google Bard scored 13.5 ± 1.0, and CLOVA X scored 13.3 ± 1.5 (P = 0.070).
Conclusion: The three GAIs demonstrated communicative competence similar to or better than that of the information book when answering questions related to CRC surgery in Korean. If high-quality medical information provided by GAI is properly supervised by HCPs and published as an information book, it could help patients obtain accurate information and make informed decisions.
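The scoring arithmetic described in Methods (four criteria, each rated 1-5, for a maximum of 20 points per question) can be illustrated with a minimal Python sketch. The function names and the example ratings are hypothetical and are not taken from the study. The abstract also does not state how per-question totals were aggregated across questions and evaluators, so the mean and sample standard deviation below are only an assumed way of producing a "mean ± SD" summary.

```python
from statistics import mean, stdev

CRITERIA_PER_QUESTION = 4        # four evaluation criteria (from Methods)
LIKERT_MIN, LIKERT_MAX = 1, 5    # 1-5 Likert scale (from Methods)


def question_score(ratings):
    """Sum one evaluator's four Likert ratings for one question (range 4-20)."""
    if len(ratings) != CRITERIA_PER_QUESTION:
        raise ValueError("expected exactly four criterion ratings")
    if any(not (LIKERT_MIN <= r <= LIKERT_MAX) for r in ratings):
        raise ValueError("each rating must be between 1 and 5")
    return sum(ratings)


def summarize(scores):
    """Return (mean, sample SD); this aggregation is an assumption."""
    return mean(scores), stdev(scores)


# Hypothetical ratings for three questions; NOT data from the study.
example_ratings = [
    [3, 4, 3, 2],
    [4, 4, 3, 3],
    [5, 3, 4, 4],
]
totals = [question_score(r) for r in example_ratings]
avg, sd = summarize(totals)
print(totals, avg, sd)
```

With these made-up ratings the per-question totals are 12, 14, and 16, which shows how a tool's score can fall anywhere between 4 and 20 points per question.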