Introducing CounseLLMe: A dataset of simulated mental health dialogues for comparing LLMs like Haiku, LLaMAntino and ChatGPT against humans

Edoardo Sebastiano De Duro, Riccardo Improta, Massimo Stella
Journal: Emerging trends in drugs, addictions, and health, Volume 5, Article 100170. Published 2025-01-31. DOI: 10.1016/j.etdah.2025.100170. Citations: 0.

Abstract

We introduce CounseLLMe as a multilingual, multi-model dataset of 400 simulated mental health counselling dialogues between two state-of-the-art Large Language Models (LLMs). These conversations - of 20 quips each - were generated either in English (using OpenAI’s GPT-3.5 and Claude-3’s Haiku) or Italian (with Claude-3’s Haiku and LLaMAntino), with prompts tuned with the help of a professional in psychotherapy. We investigate the resulting conversations through comparison against human mental health conversations on the same topic of depression. To compare linguistic features, knowledge structure and emotional content between LLMs and humans, we employed textual forma mentis networks, i.e. cognitive networks where nodes represent concepts and links indicate syntactic or semantic relationships between concepts in the dialogues’ quips. We find that the emotional structure of LLM-LLM English conversations matches that of humans in terms of patient-therapist trust exchanges, i.e. 1 in 5 LLM-LLM quips contains trust across 10 conversational turns, versus the 24% rate found in humans. ChatGPT and Haiku’s simulated English patients can also reproduce human feelings of conflict and pessimism. However, human patients display non-negligible levels of anger/frustration that are missing in LLMs. Italian LLMs’ conversations are worse at reproducing human patterns. All LLM-LLM conversations reproduced human syntactic patterns of increased absolutist pronoun usage in patients and second-person, trust-inducing pronoun usage in therapists. Our results indicate that LLMs can realistically reproduce several aspects of human patient-therapist conversations, and we thus release CounseLLMe as a public dataset for novel data-informed opportunities in mental health and machine psychology.
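The core idea of a textual forma mentis network - concepts as nodes, syntactic/semantic relations as links - can be illustrated with a deliberately minimal sketch. This is not the authors' pipeline (which relies on actual syntactic parsing of the quips); here a simple token co-occurrence window stands in for syntactic links, and the function and variable names are hypothetical:

```python
from collections import defaultdict

def build_cooccurrence_network(quips, window=3):
    """Toy stand-in for a forma mentis network: nodes are lowercased
    word tokens, and an undirected link joins any two words that
    appear within `window` tokens of each other in the same quip.
    A real TFMN would derive links from a syntactic dependency parse."""
    adjacency = defaultdict(set)  # word -> set of linked words
    for quip in quips:
        tokens = quip.lower().split()
        for i, word in enumerate(tokens):
            # link `word` to the next (window - 1) tokens in the quip
            for other in tokens[i + 1 : i + window]:
                if other != word:
                    adjacency[word].add(other)
                    adjacency[other].add(word)
    return adjacency

# Two invented example quips, one patient-like and one therapist-like
quips = [
    "i feel hopeless and alone",
    "you can trust this process",
]
net = build_cooccurrence_network(quips)
```

Once such a network is built, per-quip emotional content can be estimated by checking which nodes (e.g. "trust", "hopeless") belong to a given emotion lexicon, which is the kind of node-level annotation the comparison of trust rates between LLM and human dialogues rests on.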
Source journal

Emerging trends in drugs, addictions, and health
Subject areas: Pharmacology, Psychiatry and Mental Health, Forensic Medicine, Drug Discovery, Toxicology and Pharmaceutics (General)
CiteScore: 2.40 · Self-citation rate: 0.00% · Articles published: 0