Performance of Three Conversational Artificial Intelligence Agents in Defining End-of-Life Care Terms

Sonal Admane, Min Ji Kim, Akhila Reddy, Michael Tang, Yuchieh Kathryn Chang, Kao-Swi Karina Shih, Maxine De La Cruz, Sammuel Jumary Cepeda, Eduardo Bruera, David Hui

Journal of Palliative Medicine. Published March 26, 2025. DOI: 10.1089/jpm.2024.0526

Citations: 0
Abstract
Background: Conversational artificial intelligence agents, or chatbots, are a transformational technology that remains understudied in end-of-life care.

Methods: OpenAI's ChatGPT, Google's Bard, and Microsoft's Bing were asked to define "terminally ill," "end of life," "transitions of care," and "actively dying," and to provide three references for each. Outputs were scored by six physicians on a scale of 0-10 for accuracy, comprehensiveness, and credibility. Flesch-Kincaid Grade Level and Flesch Reading Ease (FRE) were used to calculate readability.

Results: Mean (standard deviation) scores for accuracy were 9 (1.9) for ChatGPT, 7.5 (2.4) for Bard, and 8.3 (2.4) for Bing. Comprehensiveness scores averaged 8.5 (1.7) for ChatGPT, 7.3 (2.1) for Bard, and 6.5 (2.3) for Bing. Credibility was low, with a mean score of 3 (1.8). The mean FRE score was 41.7, and the mean grade level was 14.1, indicating low readability.

Conclusion: Chatbot outputs had important deficiencies that necessitated clinician oversight to prevent misinformation.
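The two readability measures cited in the Results section are standard closed-form formulas over word, sentence, and syllable counts. As a minimal sketch (the function and variable names are illustrative, not from the study; published analyses typically use validated tools for syllable counting), the scores can be computed as:

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher = easier. Scores below ~50 are
    considered difficult; the study's mean of 41.7 falls in that band."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed
    to understand the text (the study's mean of 14.1 ~= college level)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Example with hypothetical counts: 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(100, 5, 150)     # 206.835 - 20.3 - 126.9 = 59.635
fkgl = flesch_kincaid_grade(100, 5, 150)   # 7.8 + 17.7 - 15.59 = 9.91
print(round(fre, 3), round(fkgl, 2))
```

Both formulas depend only on average sentence length (words/sentences) and average word length in syllables (syllables/words), which is why dense clinical prose with long sentences and polysyllabic terminology scores as low readability.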
About the Journal
Journal of Palliative Medicine is the premier peer-reviewed journal covering medical, psychosocial, policy, and legal issues in end-of-life care and the relief of suffering for patients with intractable pain. The Journal presents essential information for professionals in hospice and palliative medicine, focusing on improving quality of life for patients and their families and on the latest developments in drug and non-drug treatments.
The companion biweekly eNewsletter, Briefings in Palliative Medicine, delivers the latest breaking news and information to keep clinicians and health care providers continuously updated.