Empathic Conversational Agent Platform Designs and Their Evaluation in the Context of Mental Health: Systematic Review

IF 4.8 | Zone 2 (Medicine) | Q1 PSYCHIATRY

JMIR Mental Health | Pub Date: 2024-09-09 | DOI: 10.2196/58974

Ruvini Sanjeewa, Ravi Iyer, Pragalathan Apputhurai, Nilmini Wickramasinghe, Denny Meyer


Citation count: 0

Abstract

Background: The demand for mental health (MH) services in the community continues to exceed supply. At the same time, technological developments make the use of artificial intelligence–empowered conversational agents (CAs) a real possibility to help fill this gap.

Objective: The objective of this review was to identify existing empathic CA design architectures within the MH care sector and to assess their technical performance in detecting and responding to user emotions in terms of classification accuracy. In addition, the approaches used to evaluate empathic CAs within the MH care sector in terms of their acceptability to users were considered. Finally, this review aimed to identify limitations and future directions for empathic CAs in MH care.

Methods: A systematic literature search was conducted across 6 academic databases to identify journal articles and conference proceedings using search terms covering 3 topics: “conversational agents,” “mental health,” and “empathy.” Only studies discussing CA interventions for the MH care domain were eligible for this review, with both textual and vocal characteristics considered as possible data inputs. Quality was assessed using appropriate risk of bias and quality tools.

Results: A total of 19 articles met all inclusion criteria. Most (12/19, 63%) of these empathic CA designs in MH care were machine learning (ML) based, with 26% (5/19) hybrid engines and 11% (2/19) rule-based systems. Among the ML-based CAs, 47% (9/19) used neural networks, with transformer-based architectures being well represented (7/19, 37%). The remaining 16% (3/19) of the ML models were unspecified. Technical assessments of these CAs focused on response accuracies and their ability to recognize, predict, and classify user emotions. While single-engine CAs demonstrated good accuracy, the hybrid engines achieved higher accuracy and provided more nuanced responses. Of the 19 studies, human evaluations were conducted in 16 (84%), with only 5 (26%) focusing directly on the CA’s empathic features. All these papers used self-reports for measuring empathy, including single or multiple (scale) ratings or qualitative feedback from in-depth interviews. Only 1 (5%) paper included evaluations by both CA users and experts, adding more value to the process.

Conclusions: The integration of CA design and its evaluation is crucial to produce empathic CAs. Future studies should focus on using a clear definition of empathy and standardized scales for empathy measurement, ideally including expert assessment. In addition, the diversity in measures used for technical assessment and evaluation poses a challenge for comparing CA performances, which future research should also address. However, CAs with good technical and empathic performance are already available to users of MH care services, showing promise for new applications, such as helpline services.




Source journal

JMIR Mental Health (Medicine: Psychiatry and Mental Health)

CiteScore: 10.80
Self-citation rate: 3.80%
Articles published: 104
Review time: 16 weeks

Journal description: JMIR Mental Health (JMH, ISSN 2368-7959) is a PubMed-indexed, peer-reviewed sister journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175). JMIR Mental Health focusses on digital health and Internet interventions, technologies and electronic innovations (software and hardware) for mental health, addictions, online counselling and behaviour change. This includes formative evaluation and system descriptions, theoretical papers, review papers, viewpoint/vision papers, and rigorous evaluations.