Stav Yosef, Moreah Zisquit, Ben Cohen, Anat Brunstein Klomek, Kfir Bar, Doron Friedman
{"title":"微调llm对数字化患者评估的自动化治疗质量的影响。","authors":"Stav Yosef, Moreah Zisquit, Ben Cohen, Anat Brunstein Klomek, Kfir Bar, Doron Friedman","doi":"10.1038/s44184-025-00159-1","DOIUrl":null,"url":null,"abstract":"<p><p>The use of generative large language models (LLMs) in mental health applications is gaining traction, with some proposals even suggesting LLM-based automated therapists. In this study, we assess the impact of fine-tuning therapist LLMs to improve the quality of therapy sessions, addressing a critical question in LLM-based mental health research. Specifically, we demonstrate that fine-tuning with datasets focused on specific therapeutic techniques significantly enhances the performance of LLM therapists. To facilitate this assessment, we introduce a novel evaluation system based on digital patients, powered by LLMs, which engage in text-based therapy sessions and provide session evaluations through questionnaires designed for human patients. This method addresses the inadequacies of traditional text-similarity metrics, which are insufficient for assessing the quality of therapeutic interactions. This study centers on motivational interviewing (MI), a structured and goal-oriented therapeutic approach. However, our digital therapists and patients can be adapted to work in other forms of therapy. We believe that our digital therapists offer a standardized method for assessing automated therapists and showcasing the potential of LLMs in mental health care.</p>","PeriodicalId":74321,"journal":{"name":"Npj mental health research","volume":"4 1","pages":"43"},"PeriodicalIF":0.0000,"publicationDate":"2025-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12433451/pdf/","citationCount":"0","resultStr":"{\"title\":\"The impact of fine-tuning LLMs on the quality of automated therapy assessed by digital patients.\",\"authors\":\"Stav Yosef, Moreah Zisquit, Ben Cohen, Anat Brunstein Klomek, Kfir Bar, Doron Friedman\",\"doi\":\"10.1038/s44184-025-00159-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The use of generative large language models (LLMs) in mental health applications is gaining traction, with some proposals even suggesting LLM-based automated therapists. In this study, we assess the impact of fine-tuning therapist LLMs to improve the quality of therapy sessions, addressing a critical question in LLM-based mental health research. Specifically, we demonstrate that fine-tuning with datasets focused on specific therapeutic techniques significantly enhances the performance of LLM therapists. To facilitate this assessment, we introduce a novel evaluation system based on digital patients, powered by LLMs, which engage in text-based therapy sessions and provide session evaluations through questionnaires designed for human patients. This method addresses the inadequacies of traditional text-similarity metrics, which are insufficient for assessing the quality of therapeutic interactions. This study centers on motivational interviewing (MI), a structured and goal-oriented therapeutic approach. However, our digital therapists and patients can be adapted to work in other forms of therapy. We believe that our digital therapists offer a standardized method for assessing automated therapists and showcasing the potential of LLMs in mental health care.</p>\",\"PeriodicalId\":74321,\"journal\":{\"name\":\"Npj mental health research\",\"volume\":\"4 1\",\"pages\":\"43\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12433451/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Npj mental health research\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1038/s44184-025-00159-1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Npj mental health research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1038/s44184-025-00159-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The impact of fine-tuning LLMs on the quality of automated therapy assessed by digital patients.
The use of generative large language models (LLMs) in mental health applications is gaining traction, with some proposals even suggesting LLM-based automated therapists. In this study, we assess the impact of fine-tuning therapist LLMs to improve the quality of therapy sessions, addressing a critical question in LLM-based mental health research. Specifically, we demonstrate that fine-tuning with datasets focused on specific therapeutic techniques significantly enhances the performance of LLM therapists. To facilitate this assessment, we introduce a novel evaluation system based on digital patients, powered by LLMs, which engage in text-based therapy sessions and provide session evaluations through questionnaires designed for human patients. This method addresses the inadequacies of traditional text-similarity metrics, which are insufficient for assessing the quality of therapeutic interactions. This study centers on motivational interviewing (MI), a structured and goal-oriented therapeutic approach. However, our digital therapists and patients can be adapted to work in other forms of therapy. We believe that our digital therapists offer a standardized method for assessing automated therapists and showcasing the potential of LLMs in mental health care.