Assessing prognosis in depression: comparing perspectives of AI models, mental health professionals and the general public.

Impact factor 4.3 · Zone 3 (Medicine) · JCR Q1, Primary Health Care

Family Medicine and Community Health. Pub Date: 2024-01-09. DOI: 10.1136/fmch-2023-002583

Zohar Elyoseph, Inbar Levkovich, Shiri Shinan-Altman

Volume 12, Suppl 1. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10806564/pdf/

Abstract

Background: Artificial intelligence (AI) has rapidly permeated various sectors, including healthcare, highlighting its potential to facilitate mental health assessments. This study explores the underexplored domain of AI's role in evaluating prognosis and long-term outcomes in depressive disorders, offering insights into how AI large language models (LLMs) compare with human perspectives.

Methods: Using case vignettes, we conducted a comparative analysis involving different LLMs (ChatGPT-3.5, ChatGPT-4, Claude and Bard), mental health professionals (general practitioners, psychiatrists, clinical psychologists and mental health nurses), and the general public, as reported previously. We evaluated the LLMs' ability to generate a prognosis, anticipated outcomes with and without professional intervention, and envisioned long-term positive and negative consequences for individuals with depression.

Results: In most of the examined cases, the four LLMs consistently identified depression as the primary diagnosis and recommended a combined treatment of psychotherapy and antidepressant medication. ChatGPT-3.5 gave a significantly more pessimistic prognosis than the other LLMs, the professionals and the public. ChatGPT-4, Claude and Bard aligned closely with the perspectives of mental health professionals and the general public, all of whom anticipated either no improvement or a worsening without professional help. Regarding long-term outcomes, ChatGPT-3.5, Claude and Bard consistently projected significantly fewer negative long-term consequences of treatment than ChatGPT-4.

Conclusions: This study underscores the potential of AI to complement the expertise of mental health professionals and promote a collaborative paradigm in mental healthcare. The observation that three of the four LLMs closely mirrored the anticipations of mental health experts in scenarios involving treatment underscores the technology's prospective value in offering professional clinical forecasts. The pessimistic outlook presented by ChatGPT-3.5 is concerning, as it could diminish patients' drive to initiate or continue depression therapy. In summary, although LLMs show potential in enhancing healthcare services, their utilisation requires thorough verification and seamless integration with human judgement and skills.

Source journal: Family Medicine and Community Health (Primary Health Care)

CiteScore: 9.70

Self-citation rate: 0.00%

Review time: 19 weeks

About the journal: Family Medicine and Community Health (FMCH) is a peer-reviewed, open-access journal focusing on the topics of family medicine, general practice and community health. FMCH strives to be a leading international journal that promotes ‘Health Care for All’ through disseminating novel knowledge and best practices in primary care, family medicine, and community health. FMCH publishes original research, review, methodology, commentary, reflection, and case-study from the lens of population health. FMCH’s Asian Focus section features reports of family medicine development in the Asia-Pacific region. FMCH aims to be an exemplary forum for the timely communication of medical knowledge and skills with the goal of promoting improved health care through the practice of family and community-based medicine globally. FMCH aims to serve a diverse audience including researchers, educators, policymakers and leaders of family medicine and community health. We also aim to provide content relevant for researchers working on population health, epidemiology, public policy, disease control and management, preventative medicine and disease burden. FMCH does not impose any article processing charges (APC) or submission charges.