Roma Shusterman, Allison C. Waters, Shannon O’Neill, Marshall Bangs, Phan Luu, Don M. Tucker
Title: An active inference strategy for prompting reliable responses from large language models in medical practice
Journal: npj Digital Medicine (impact factor 12.4; JCR Q1, Health Care Sciences & Services)
Published: 2025-02-22
DOI: https://doi.org/10.1038/s41746-025-01516-2
Citation count: 0
Abstract
Continuing advances in Large Language Models (LLMs) are transforming medical knowledge access across education, training, and treatment. Early literature cautions against their non-determinism, potential for harmful responses, and lack of quality control. To address these issues, we propose a domain-specific, validated dataset for LLM training and an actor–critic prompting protocol grounded in active inference. A Therapist agent generates initial responses to patient queries, while a Supervisor agent refines them. In a blind validation study, experienced cognitive behavior therapy for insomnia (CBT-I) therapists evaluated responses to 100 patient queries. For each query, they were given either the LLM’s response or one of two therapist-crafted responses (one appropriate and one deliberately inappropriate) and asked to rate the quality and accuracy of each reply. The LLM often received higher ratings than the appropriate therapist-crafted responses, indicating effective alignment with expert standards. This structured approach lays the foundation for safely integrating advanced LLM technology into medical applications.
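The two-agent flow described in the abstract (a Therapist agent drafts, a Supervisor agent refines) can be illustrated with a minimal sketch. This is not the authors' implementation: the prompt wording, the function names, the `rounds` parameter, and the `LLM` callable type are all assumptions made for illustration, and `stub_llm` is a stand-in that makes the example runnable without any model.

```python
from typing import Callable

# Assumption: an "LLM" is any function that maps a prompt string to a reply string.
LLM = Callable[[str], str]


def therapist_draft(llm: LLM, query: str) -> str:
    """Actor step: draft an initial reply to the patient query (hypothetical prompt)."""
    prompt = f"ROLE: Therapist\nPatient query: {query}\nWrite a reply."
    return llm(prompt)


def supervisor_refine(llm: LLM, query: str, draft: str) -> str:
    """Critic step: review the draft and return a refined reply (hypothetical prompt)."""
    prompt = (
        "ROLE: Supervisor\n"
        f"Patient query: {query}\n"
        f"Draft reply: {draft}\n"
        "Correct any inaccurate or unsafe content and return the final reply."
    )
    return llm(prompt)


def respond(llm: LLM, query: str, rounds: int = 1) -> str:
    """Run one draft followed by `rounds` supervisor refinement passes."""
    reply = therapist_draft(llm, query)
    for _ in range(rounds):
        reply = supervisor_refine(llm, query, reply)
    return reply


def stub_llm(prompt: str) -> str:
    """Stand-in model for demonstration only; a real deployment would call an LLM."""
    if prompt.startswith("ROLE: Supervisor"):
        return "refined reply"
    return "draft reply"


if __name__ == "__main__":
    print(respond(stub_llm, "I wake up at 3 a.m. every night."))
```

Keeping the two roles as separate functions means the Supervisor always sees both the original query and the draft, so each refinement pass can be logged or audited independently of the draft that preceded it.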
About the journal:
npj Digital Medicine is an online open-access journal that focuses on publishing peer-reviewed research in the field of digital medicine. The journal covers various aspects of digital medicine, including the application and implementation of digital and mobile technologies in clinical settings, virtual healthcare, and the use of artificial intelligence and informatics.
The primary goal of the journal is to support innovation and the advancement of healthcare through the integration of new digital and mobile technologies. When determining if a manuscript is suitable for publication, the journal considers four important criteria: novelty, clinical relevance, scientific rigor, and digital innovation.