Large Language Models for Mental Health Applications: Systematic Review
Zhijun Guo, Alvina Lai, Johan H Thygesen, Joseph Farrington, Thomas Keen, Kezhi Li
JMIR Mental Health 2024;11:e57400. Published October 18, 2024. DOI: 10.2196/57400
Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11530718/pdf/
Trial registration: PROSPERO CRD42024508617; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=508617
Abstract
Background: Large language models (LLMs) are advanced artificial neural networks trained on extensive datasets to accurately understand and generate natural language. While they have received much attention and demonstrated potential in digital health, their application in mental health, particularly in clinical settings, has generated considerable debate.
Objective: This systematic review aims to critically assess the use of LLMs in mental health, specifically focusing on their applicability and efficacy in early screening, digital interventions, and clinical settings. By systematically collating and assessing the evidence from current studies, our work analyzes models, methodologies, data sources, and outcomes, thereby highlighting the potential of LLMs in mental health, the challenges they present, and the prospects for their clinical use.
Methods: Adhering to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, this review searched 5 open-access databases: MEDLINE (accessed via PubMed), IEEE Xplore, Scopus, JMIR, and ACM Digital Library. The keywords used were (mental health OR mental illness OR mental disorder OR psychiatry) AND (large language models). This study included articles published between January 1, 2017, and April 30, 2024, and excluded articles published in languages other than English.
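For illustration only, the MEDLINE (PubMed) arm of this search strategy could be reproduced programmatically with the public NCBI E-utilities esearch endpoint, using the keyword query and date window stated above. This sketch is not part of the review's methods, and the remaining databases (IEEE Xplore, Scopus, JMIR, and ACM Digital Library) have their own search interfaces; the retmax value is an arbitrary choice for the example.

```python
# Minimal sketch: run the review's keyword query against PubMed via NCBI E-utilities.
import requests

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

params = {
    "db": "pubmed",
    "term": ("(mental health OR mental illness OR mental disorder OR psychiatry) "
             "AND (large language models)"),
    "datetype": "pdat",       # filter on publication date
    "mindate": "2017/01/01",  # January 1, 2017
    "maxdate": "2024/04/30",  # April 30, 2024
    "retmax": 200,            # maximum number of PMIDs to return (illustrative)
    "retmode": "json",
}

resp = requests.get(ESEARCH_URL, params=params, timeout=30)
resp.raise_for_status()
result = resp.json()["esearchresult"]

print("Total hits:", result["count"])
print("First PMIDs:", result["idlist"][:10])
```

The returned PMIDs would then be screened against the inclusion and exclusion criteria (publication dates and English language) before full-text review.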
Results: In total, 40 articles were evaluated: 15 (38%) on detecting mental health conditions and suicidal ideation through text analysis, 7 (18%) on the use of LLMs as mental health conversational agents, and 18 (45%) on other applications and evaluations of LLMs in mental health. LLMs are effective at detecting mental health issues and at providing accessible, destigmatized eHealth services. However, assessments also indicate that the current risks of clinical use may outweigh the benefits. These risks include inconsistencies in generated text, hallucinations, and the absence of a comprehensive, benchmarked ethical framework.
Conclusions: This systematic review examines the clinical applications of LLMs in mental health, highlighting their potential and inherent risks. The study identifies several issues: the lack of multilingual datasets annotated by experts, concerns regarding the accuracy and reliability of generated content, challenges in interpretability due to the "black box" nature of LLMs, and ongoing ethical dilemmas. These ethical concerns include the absence of a clear, benchmarked ethical framework; data privacy issues; and the potential for overreliance on LLMs by both physicians and patients, which could compromise traditional medical practices. As a result, LLMs should not be considered substitutes for professional mental health services. However, the rapid development of LLMs underscores their potential as valuable clinical aids, emphasizing the need for continued research and development in this area.
About the Journal
JMIR Mental Health (JMH, ISSN 2368-7959) is a PubMed-indexed, peer-reviewed sister journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175).
JMIR Mental Health focusses on digital health and Internet interventions, technologies and electronic innovations (software and hardware) for mental health, addictions, online counselling and behaviour change. This includes formative evaluation and system descriptions, theoretical papers, review papers, viewpoint/vision papers, and rigorous evaluations.