Open-source small language models for personal medical assistant chatbots
Authors: Matteo Magnini, Gianluca Aguzzi, Sara Montagna
Journal: Intelligence-based medicine, vol. 11, Article 100197
Published: 2025-01-01
DOI: 10.1016/j.ibmed.2024.100197
URL: https://www.sciencedirect.com/science/article/pii/S2666521224000644
Citation count: 0
Abstract
Medical chatbots are becoming essential components of telemedicine applications, where they assist patients in the self-management of their conditions. This trend is driven largely by advances in natural language processing based on pre-trained language models (LMs). However, integrating LMs into clinical environments raises reliability and privacy concerns.
This study seeks to address these issues with a privacy-by-design architecture that relies on the fully local deployment of open-source LMs. Specifically, to mitigate the risk of information leakage, we evaluate the performance of open-source small language models (SLMs) that can be deployed on personal devices, such as smartphones or laptops, without stringent hardware requirements.
We assess the effectiveness of this solution using hypertension management as a case study. Models are evaluated across various tasks, including intent recognition and empathetic conversation, with Gemini Pro 1.5 as the benchmark. The results indicate that, for certain tasks such as intent recognition, Gemini outperforms the other models. However, when the "large language model (LLM) as a judge" approach is used for semantic evaluation of response correctness, several models align closely with the ground truth. In conclusion, this study highlights the potential of locally deployed SLMs as components of medical chatbots, while addressing critical concerns related to privacy and reliability.
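The "LLM as a judge" evaluation mentioned above can be illustrated with a minimal sketch. The abstract does not give the judge prompt, the scoring scale, or how the judge model is called, so everything below is an assumption: the prompt template, the binary CORRECT/INCORRECT verdict, and the names `build_judge_prompt`, `parse_verdict`, `judged_accuracy` and `toy_judge` are hypothetical. The judge is passed in as a plain callable so that any model, local or remote, could be plugged in; `toy_judge` is only a stand-in that lets the sketch run without a model.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical prompt template; the paper's actual judge prompt is not in the abstract.
JUDGE_TEMPLATE = (
    "You are grading a medical chatbot answer.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Candidate answer: {candidate}\n"
    "Reply with exactly one word: CORRECT or INCORRECT."
)


def build_judge_prompt(question: str, reference: str, candidate: str) -> str:
    """Fill the template with one (question, ground truth, model answer) triple."""
    return JUDGE_TEMPLATE.format(
        question=question, reference=reference, candidate=candidate
    )


def parse_verdict(reply: str) -> bool:
    """Map the judge's reply to a boolean.

    Only the first token is inspected, because "INCORRECT" contains
    "CORRECT" and a substring check would misclassify it.
    """
    tokens = reply.strip().upper().split()
    return bool(tokens) and tokens[0].strip(".,:;!") == "CORRECT"


def judged_accuracy(
    examples: Iterable[Tuple[str, str, str]],
    judge: Callable[[str], str],
) -> float:
    """Fraction of candidate answers that the judge marks as correct."""
    verdicts = [
        parse_verdict(judge(build_judge_prompt(q, ref, cand)))
        for q, ref, cand in examples
    ]
    return sum(verdicts) / len(verdicts) if verdicts else 0.0


def toy_judge(prompt: str) -> str:
    """Stand-in for a real judge model (assumption, for demonstration only).

    It marks a candidate correct when it contains the reference text.
    A real judge would be an LLM call that compares meaning, not strings.
    """
    fields = dict(
        line.split(": ", 1) for line in prompt.splitlines() if ": " in line
    )
    reference = fields["Reference answer"].lower()
    candidate = fields["Candidate answer"].lower()
    return "CORRECT" if reference in candidate else "INCORRECT"
```

To use a real judge, `toy_judge` would be replaced by a function that sends the prompt to the chosen model and returns its text reply. Constraining the reply to a single word keeps parsing simple, at the cost of discarding any graded notion of partial correctness.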