Youstina Demetry, Per Carlbring, Gerhard Andersson
{"title":"Cultural Relevance and Acceptability of Cognitive Behavioral Therapy Techniques Adapted by AI or a Human Psychologist: Experimental Study.","authors":"Youstina Demetry, Per Carlbring, Gerhard Andersson","doi":"10.2196/91056","DOIUrl":"https://doi.org/10.2196/91056","url":null,"abstract":"<p><strong>Background: </strong>Evidence-based psychological interventions are usually not accessed by marginalized groups such as refugees. Culturally adapted psychological interventions have reported larger effect sizes than nonadapted psychological interventions. However, the cultural adaptation of interventions is a lengthy process, entailing a challenge. One potential solution to overcome this challenge is the use of artificial intelligence (AI).</p><p><strong>Objective: </strong>The aim of this study was to investigate and compare the perceived cultural relevance and acceptability of 2 common cognitive behavioral therapy (CBT) techniques when translated and culturally adapted by AI versus a human psychologist.</p><p><strong>Methods: </strong>In a 2×2 factorial design, the text generator type (AI vs human psychologist) and the CBT technique (cognitive restructuring vs behavior modification) were compared. CBT technique texts translated and culturally adapted either by AI or by a human psychologist were blindly rated using the Cultural Relevance Questionnaire and the Theoretical Framework of Acceptability. Raters were Arabic-speaking refugees and immigrants, aged between 18 and 69 years, residing in Sweden, Denmark, and Germany. Raters were randomly allocated to 1 of 4 conditions. Each condition consisted of 2 stimuli. Two-factor between-subject design analyses were used to analyze the data.</p><p><strong>Results: </strong>A significant main effect of the text generator domain type (P=.02; η²=0.045) was found in the first rating, with texts adapted by the AI domain perceived as more culturally relevant than those adapted by the human domain. 
No significant main effect of the CBT technique was found in the first rating (P=.10; η²=0.022). There were no differences in the second rating. Regarding acceptability, no significant main effects of text generator domain type (P=.09; η²=0.024) or the CBT technique (P=.88; η²=0.001) were found in either of the ratings.</p><p><strong>Conclusions: </strong>CBT technique materials adapted by AI may be perceived as similarly culturally relevant as those adapted by a human psychologist. This finding implies the potential to accelerate the cultural adaptation of psychological interventions. However, AI still needs to be used with caution and in accordance with rigorous safety standards and robust frameworks.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e91056"},"PeriodicalIF":2.0,"publicationDate":"2026-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13138788/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147838172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
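The η² values reported in the abstract above are effect sizes for main effects in a two-factor between-subject design. As a minimal sketch only, assuming a balanced design and classical eta squared (SS_effect / SS_total; the abstract does not say whether classical or partial η² was reported), the calculation for one factor could look like this. The function name, cell labels, and ratings are hypothetical and are not taken from the study.

```python
from statistics import mean


def eta_squared_main_effect(cells):
    """Return eta squared for the first factor of a balanced two-factor design.

    `cells` maps (level_of_factor_a, level_of_factor_b) -> list of ratings.
    Assumption: every cell holds the same number of ratings (balanced design),
    so the main-effect sum of squares can be taken from the marginal means.
    """
    all_scores = [x for scores in cells.values() for x in scores]
    grand_mean = mean(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    ss_factor_a = 0.0
    for level in {a for a, _ in cells}:
        # Pool every rating observed under this level of factor A.
        pooled = [x for (a, _), scores in cells.items() if a == level for x in scores]
        ss_factor_a += len(pooled) * (mean(pooled) - grand_mean) ** 2

    return ss_factor_a / ss_total


# Hypothetical ratings: generator type (ai/human) x technique (cr/bm).
toy_cells = {
    ("ai", "cr"): [4, 6],
    ("ai", "bm"): [4, 6],
    ("human", "cr"): [2, 4],
    ("human", "bm"): [2, 4],
}
effect_size = eta_squared_main_effect(toy_cells)
```

With unbalanced cells the marginal-mean shortcut no longer holds, and a dedicated ANOVA routine would be the safer choice.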
Bridianne O'Dea, Roisin McNamara, Ivan Ck Ma, Amanda Gu, Fiona Jiang, Justin Milesi, Jeanne Ogilvie, Gywn D Evelyn
{"title":"Addressing the Psychological Needs of Adolescents During the Wait Time for Mental Health Treatment: Service Design Study.","authors":"Bridianne O'Dea, Roisin McNamara, Ivan Ck Ma, Amanda Gu, Fiona Jiang, Justin Milesi, Jeanne Ogilvie, Gywn D Evelyn","doi":"10.2196/87067","DOIUrl":"10.2196/87067","url":null,"abstract":"<p><strong>Background: </strong>Adolescents waiting for mental health treatment often experience significant unmet psychological needs, including severe psychological distress, increased use of maladaptive coping strategies, and feelings of abandonment. However, current wait time support offerings across the mental health sector are sparse and lack clear evidence of effectiveness.</p><p><strong>Objective: </strong>Using design thinking, this early report describes the development of a service blueprint for a new model of care (While We Wait) designed to address the psychological needs of adolescents during the wait time for mental health treatment in Australia through targeted support from general practitioners (GPs) and brief, self-directed digital interventions.</p><p><strong>Methods: </strong>In partnership with health service designers from Deloitte Digital Australia, we conducted a rapid 6-week health service design sprint. 
This industry-led methodology involved iterative weekly activities, including the development of service user personas and service experience principles, consultation sessions with 12 youth with lived experience experts (aged 18 to 20 years) and 15 GPs, insight synthesis, and service blueprint development.</p><p><strong>Results: </strong>The design sprint produced a service blueprint anchored in 5 service experience principles: \"I'm never alone,\" \"It's for me,\" \"I'm in control,\" \"It's easy to use,\" and \"It lifts me up.\" The proposed service model incorporated a five-stage service journey: (1) recognition (the adolescent acknowledges the need for support), (2) initial consultation and onboarding with the GP, (3) support and monitoring, (4) preparation for treatment, and (5) transition to specialist care and follow-up. Key adolescent service outcomes included uptake, acceptability, self-advocacy, mental health and well-being, perceived quality of care, and help seeking intentions and behaviors. For GPs, outcomes included uptake, feasibility, acceptability, and confidence in supporting adolescents during the wait time.</p><p><strong>Conclusions: </strong>This work demonstrates that a rapid, industry-led design thinking approach may help identify priorities for developing services that address adolescents' needs during the wait time for mental health treatment. The project also highlights the value of co-designing mental health services with lived experience experts and service providers. 
Together, these findings suggest that the wait time may represent an important opportunity for early therapeutic engagement rather than a passive delay before treatment.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":" ","pages":"e87067"},"PeriodicalIF":2.0,"publicationDate":"2026-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147638896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Emma Therése Eliasson, Sara Sutori, Francesca Mura, Victor Ortiz, Vincenzo Catrambone, Gergö Hadlaczky, Ivo Todorov, Antonio Luca Alfeo, Valentina Cardi, Mario G C A Cimino, Giovanna Mioni, Mariano Alcañiz Raya, Gaetano Valenza, Vladimir Carli, Claudio Gentili
{"title":"Curiosity in a Novel Virtual Reality Scenario and Its Association With Symptoms of Depression: Observational Pilot Investigation.","authors":"Emma Therése Eliasson, Sara Sutori, Francesca Mura, Victor Ortiz, Vincenzo Catrambone, Gergö Hadlaczky, Ivo Todorov, Antonio Luca Alfeo, Valentina Cardi, Mario G C A Cimino, Giovanna Mioni, Mariano Alcañiz Raya, Gaetano Valenza, Vladimir Carli, Claudio Gentili","doi":"10.2196/80120","DOIUrl":"https://doi.org/10.2196/80120","url":null,"abstract":"<p><strong>Background: </strong>Curiosity plays a fundamental role in human learning, development, and motivation, and emerging evidence suggests that reduced curiosity is linked to poorer mental health outcomes, including depressive symptoms (DS). However, to date, the majority of curiosity research relies on self-report assessments and thus risks biased reporting. Virtual reality (VR), a novel tool increasingly used within mental health research and treatment, might represent a potent tool for offering ecologically valid insights into curiosity-driven behaviors while circumventing issues related to self-report assessments, including demand characteristics and recall bias.</p><p><strong>Objective: </strong>The study aimed to enhance the assessment of curiosity by using a novel VR environment and to examine its relevance to DS. Specifically, we tested 2 hypotheses using a novel VR environment: first, that curiosity, as assessed through spontaneous exploratory interactions and behaviors in VR, positively correlates with self-reported curiosity, and second, that VR-based curiosity is inversely associated with DS.</p><p><strong>Methods: </strong>This exploratory study used an observational design that included 100 volunteers. All participants completed self-reported assessments of DS and curiosity before engaging in a novel VR scenario. 
Although progression in the virtual environment required solving cognitive tasks, these were embedded as structural elements rather than framed as the primary objective. Instead, participants' free explorations and interactions with objects formed the basis for the 4 curiosity metrics used in this study. After VR exposure, participants completed a questionnaire assessing cybersickness symptoms.</p><p><strong>Results: </strong>Hypothesis 1 was not supported, as only one curiosity metric, namely object interactions, was positively associated with one aspect of curiosity relating to motivation to seek new knowledge and experiences. Further, diminishing significance after correction for multiple testing warranted caution. Results relating to hypothesis 2 indicated partial support, in that object interaction was significantly associated with DS while controlling for age, sex, and cybersickness levels. Sensitivity analyses showed no associations between object interactions and self-reported anxiety and stress symptoms.</p><p><strong>Conclusions: </strong>VR may be a potent tool for assessing exploratory behaviors in a controlled, yet ecologically valid, environment that avoids issues related to self-report. However, whether such motivations translate to established curiosity constructs warrants further research. 
This study also provided preliminary insights into how assessing exploratory interactions in VR may be a promising avenue that could enhance the understanding of the etiology and assessment of DS, particularly its early stages.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e80120"},"PeriodicalIF":2.0,"publicationDate":"2026-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13138711/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147838223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lorena Macias-Navarro, Nalini Ranjit, Gregory W Bounds, Brendon A Providence, Yesmeena Shmaitelly, Naomi M Tice, Shreela V Sharma
{"title":"A Bilingual AI-Based Chatbot for Nutrition Education in a Food Is Medicine Intervention for High-Risk Pregnant Women: Design and Development Study.","authors":"Lorena Macias-Navarro, Nalini Ranjit, Gregory W Bounds, Brendon A Providence, Yesmeena Shmaitelly, Naomi M Tice, Shreela V Sharma","doi":"10.2196/85292","DOIUrl":"https://doi.org/10.2196/85292","url":null,"abstract":"<p><strong>Background: </strong>Conversational agents (artificial intelligence [AI]-based chatbots) offer a novel approach to health interventions by providing personalized, adaptive interactions that improve over time based on user engagement. In nutrition education, given the wide variation in knowledge, skills, and abilities across participants, AI-based chatbots have the potential to enhance accessibility, engagement, and behavior change. Food is Medicine (FIM) interventions, which aim to improve food security and diet quality among multicultural, at-risk populations, often face challenges related to sustained engagement and use.</p><p><strong>Objective: </strong>This paper describes the design, development, and iterative refinement of a bilingual AI-driven nutrition chatbot integrated into an FIM intervention for high-risk pregnant women receiving care at obstetric clinics in Houston, Texas.</p><p><strong>Methods: </strong>The chatbot was developed using an iterative process informed by behavioral theory, human-centered design (HCD), and plan-do-study-act (PDSA) quality improvement cycles. The conversational agent was embedded within an ongoing 2-arm randomized controlled trial (N=200) comparing standard FIM nutrition education to FIM plus AI-driven nutrition chatbot support. HCD activities took place prior to deployment and involved community advisory group members and implementation stakeholders. Postdeployment refinements were guided by 2 PDSA cycles and informal question-and-answer sessions conducted with intervention arm participants. 
Qualitative feedback was collected using structured scripts to identify facilitators of and barriers to chatbot engagement.</p><p><strong>Results: </strong>The chatbot was developed using the GPT-3.5 Turbo application programming interface. An initial prototype built in Python using Gradio enabled rapid testing but lacked flexibility for modifications. To improve scalability and logging capabilities, the system was rebuilt using PHP, HTML, JavaScript, and SQL. To further understand usage patterns, participants who interacted with the chatbot at least once or not at all (classified as low users; n=32) were engaged in question-and-answer sessions. Of these participants, all were female (32/32, 100%), 88% (28/32) identified as Hispanic or Latino, and 90% (29/32) preferred Spanish. Two PDSA cycles guided iterative refinements. Cycle 1 identified low initial engagement, whereas cycle 2 focused on improving content clarity and cultural relevance through physical reminder prompts. Qualitative findings identified key barriers to engagement, including high cooking self-efficacy with perceived lack of need for support, low technology self-efficacy, and low urgency due to competing priorities.</p><p><strong>Conclusions: </strong>Embedding a bilingual AI-driven nutrition chatbot within an FIM intervention was feasible and featured critical design and implementation considerations for engaging high-risk pregnant popula","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e85292"},"PeriodicalIF":2.0,"publicationDate":"2026-05-01","publicationTypes":"Journal 
Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134822/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Archana Krishnan, Pallavi Khurana, Alexandra R Stankus, Samy J Galvez, Jorge Sanchez, Frederick L Altice
{"title":"An Ecological Momentary Assessment Smartphone App for High-Risk HIV Populations: Development and Usability Study.","authors":"Archana Krishnan, Pallavi Khurana, Alexandra R Stankus, Samy J Galvez, Jorge Sanchez, Frederick L Altice","doi":"10.2196/85108","DOIUrl":"https://doi.org/10.2196/85108","url":null,"abstract":"<p><strong>Background: </strong>HIV incidence has continued to increase among men who have sex with men (MSM) in Peru, despite intervention efforts. Addressing stigma, risky behaviors, and low medication adherence is key to reducing incidence rates. Ecological momentary assessment (EMA) allows for collection of discrete, real-time data on stigmatized, risky behaviors while reducing recall bias.</p><p><strong>Objective: </strong>The aim of this study was to develop and assess the usability of an EMA smartphone app among MSM with HIV in Peru, which tracks daily health risk behaviors to determine ease of use, usefulness, and satisfaction with the app.</p><p><strong>Methods: </strong>A mixed-method 3-phase study was conducted with 10 MSM with HIV, which included a usability test, 10-day field testing, and a debriefing focus group. Quantitative survey data and user analytics allowed for assessments of acceptability and user compliance. Qualitative interview and focus group data were thematically analyzed for in-depth assessments of user satisfaction.</p><p><strong>Results: </strong>Acceptability of the EMA app was high, with a mean usability rating of 6.4 of 7.0 (SD 0.62), indicating high user satisfaction, ease of use, and usefulness. A 10-day field test demonstrated a high average compliance rate of 93% (93/100), which suggests high feasibility of the app for daily tracking of health risk behaviors among MSM with HIV. 
Interview and focus group findings indicated that the app was navigable, time-efficient, and holds promise for long-term use, particularly with the inclusion of daily reminders and incentives for prolonged use.</p><p><strong>Conclusions: </strong>EMA apps can provide valuable real-time data while protecting users' privacy. This formative work lays the foundation for future larger-scale EMAs of substance use and sexual risk behaviors among high-risk HIV populations, and for the development of just-in-time interventions to address stigma, improve medication adherence, and reduce risky behaviors.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e85108"},"PeriodicalIF":2.0,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134826/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chunyi Xian, Aihua Yan, Yaxian Wang, Eileen Yuk Ha Tsang, Lei Huang, David Jingjun Xu
{"title":"Applicable Scenarios, Desired Features, and Risks of AI Psychotherapists in Depression Treatment From the Patient's Perspective: Exploratory Qualitative Study.","authors":"Chunyi Xian, Aihua Yan, Yaxian Wang, Eileen Yuk Ha Tsang, Lei Huang, David Jingjun Xu","doi":"10.2196/85138","DOIUrl":"https://doi.org/10.2196/85138","url":null,"abstract":"<p><strong>Background: </strong>Depression is a pervasive global mental health issue, yet access to trained professionals remains severely limited. With the rapid advancement of artificial intelligence (AI), digital tools are increasingly seen as a viable way to address this shortage. However, questions remain about how digital platforms for mental health care can be effectively designed.</p><p><strong>Objective: </strong>This study aimed to investigate, from an end user's (patient's) perspective, the potential use scenarios, desired features, and perceived risks of AI psychotherapists in depression treatment, providing design guidelines for their development.</p><p><strong>Methods: </strong>A grounded theory approach was applied to analyze qualitative responses from 452 individuals recruited via Amazon Mechanical Turk. Data were collected through a scenario-based online survey on AI-assisted depression treatment administered between March 2023 and May 2023. Participants responded to 3 open-ended questions regarding the potential use of AI in treating depression, the characteristics expected from an AI psychotherapist, and the associated perceived risks, along with demographic, control, and contextual measures. The open-ended responses were inductively coded into themes, with intercoder reliability established (Cohen κ=0.80). 
In addition, variations in themes were further examined across participant profiles, including social stigma, current depression severity, trust in an AI psychotherapist, and privacy awareness.</p><p><strong>Results: </strong>Participants envisioned AI psychotherapists across 5 primary scenarios: diagnosis, treatment, consultation, self-management, and companionship. Key desired features include professionalism, warmth, precision care, empathy, remote services, active listener, personalization, flexible treatment options, patience, trustworthiness, and basic treatment alternative, while critical concerns include diagnostic inaccuracy, treatment errors, privacy breach, lack of human interaction, technical malfunctions, and lack of emotional engagement. Based on these findings, a general MoSCoW (must have, should have, could have, and won't have) prioritization framework was proposed to serve as a conceptual starting point for future AI system design and empirical validation in mental health care. Notably, feature prioritization varied across user profiles: individuals with higher stigma placed greater emphasis on privacy protection, those with more severe depression prioritized precision care and timely access, low-trust users de-emphasized remote services, and privacy-sensitive individuals showed reduced preference for features requiring extensive data disclosure. 
These patterns highlight the need for context-sensitive design.</p><p><strong>Conclusions: </strong>This study provides a patient-centered framework for designing AI psychotherapists and complements the existing literature by highlighting the importance of balancin","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e85138"},"PeriodicalIF":2.0,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134827/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
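The abstract above reports intercoder reliability as Cohen κ=0.80. As a rough illustration of what that statistic measures (observed agreement corrected for the agreement expected by chance), here is a minimal sketch for two coders. The function name and the toy labels are hypothetical; the study's actual coding data are not available here.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who labeled the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance agreement implied by each coder's label frequencies.
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item list")

    n = len(coder_a)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if p_expected == 1.0:
        # Both coders used a single identical label; agreement is trivially perfect.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical theme codes assigned to four open-ended responses.
labels_a = ["risk", "risk", "feature", "feature"]
labels_b = ["risk", "feature", "feature", "feature"]
kappa = cohen_kappa(labels_a, labels_b)
```

In this toy case the coders agree on 3 of 4 items, but half of that agreement is expected by chance, so kappa is well below the raw agreement rate.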
Craig J Bryan, Jarrod Hay, Noah Treangen, Lauren R Khazem
{"title":"OTX-202 Smartphone App to Reduce Suicidal Ideation Among High-Risk Transition-Age Youth: Open-Label, Single-Arm, Phase 1 Clinical Trial.","authors":"Craig J Bryan, Jarrod Hay, Noah Treangen, Lauren R Khazem","doi":"10.2196/89248","DOIUrl":"https://doi.org/10.2196/89248","url":null,"abstract":"<p><strong>Background: </strong>The transition from adolescence to adulthood (18 to 25 years) is associated with an increased risk of suicidal ideation and behaviors. Suicide-focused cognitive behavioral therapies (CBTs) have been shown to significantly reduce suicidal ideation and behaviors but are not widely available to high-risk individuals. Digital therapeutics could improve access to these treatments.</p><p><strong>Objective: </strong>This study aimed to evaluate the acceptability, safety, and potential efficacy of OTX-202 among transition-age youth (18 to 25 years) receiving mental health care outside an inpatient hospital setting.</p><p><strong>Methods: </strong>In this phase 1 single-arm clinical trial, 59 transition-age youth with recent suicidal ideation or suicide attempts used OTX-202, a smartphone app designed to deliver suicide-focused CBT, concurrently with usual outpatient mental health care. After baseline, eligible patients completed 12 weekly assessments of suicidal ideation, depression, and anxiety.</p><p><strong>Results: </strong>From baseline to week 12, participants reported statistically significant, large reductions in suicidal ideation (mean difference -5.1, 95% CI -6.5 to -3.7; d=0.95). In total, 3 (5.1%; 95% CI 0%-11.2%) participants reported suicide attempts. Reductions in suicidal ideation and suicide attempt rates were consistent with results from previously published randomized clinical trials of suicide-focused CBTs. Participants rated OTX-202 in the 97th percentile of usability and completed a mean of 9.0 (SD 3.5) of 12 app modules, supporting the app's acceptability. 
There were no patient deaths, device-related events, or severe adverse events, supporting the app's safety.</p><p><strong>Conclusions: </strong>Results support the safety, acceptability, and potential efficacy of OTX-202 for reducing suicide risk among transition-age youth.</p><p><strong>Trial registration: </strong>ClinicalTrials.gov NCT06008132; https://clinicaltrials.gov/study/NCT06008132.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e89248"},"PeriodicalIF":2.0,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Saeyoun Choi, Donghyun Kim, Ji-Hwan Jeon, Minji Kim, Dong Hun Lee, DaeHwan Ahn, Eu Sun Lee, Yoon Ji Kim, Hyun Youk
{"title":"Korean Medical Consultation With Open-Weight Large Language Models: Pilot Comparative Evaluation of Retrieval-Augmented Generation With Metadata Filtering.","authors":"Saeyoun Choi, Donghyun Kim, Ji-Hwan Jeon, Minji Kim, Dong Hun Lee, DaeHwan Ahn, Eu Sun Lee, Yoon Ji Kim, Hyun Youk","doi":"10.2196/72604","DOIUrl":"https://doi.org/10.2196/72604","url":null,"abstract":"<p><strong>Background: </strong>This study develops an open-source large language model-based chatbot tailored for Korean health consultations. The chatbot was implemented using the retrieval-augmented generation (RAG) technique alongside metadata filtering to enhance its performance.</p><p><strong>Objective: </strong>This study aims to analyze and compare the performance of a RAG-based chatbot with other leading language models in the context of Korean health consultations.</p><p><strong>Methods: </strong>A 10.4 GB Korean medical document corpus (487,277 segments) was constructed from official websites of major Korean hospitals, public health sources, and medical textbooks. This study quantitatively compared 5 open-source large language models (Qwen3:4B, Mistral:7B, Llama-3.1:8B, Gpt-Oss:20B, and Gemma3:27B) in 3 configurations: baseline (model only), RAG-only, and RAG with metadata filtering. The RAG system used a specialized Korean embedding model (upskyy/bge-m3-korean) and an Elasticsearch store. Performance was assessed by an emergency medicine specialist using a validation set of 226 questions across 7 common diseases and scoring responses based on accuracy, safety, and helpfulness.</p><p><strong>Results: </strong>The application of RAG alone failed to yield statistically significant performance improvements and, in some cases (Llama 3.1: 8B and Gemma 3: 27B), resulted in decreased scores. However, the combination of RAG with metadata filtering yielded statistically significant (P<.05) performance increases in most models. 
Notably, the average score for Mistral:7B increased from 3.79, SD 0.08, to 4.10, SD 0.10, and Gpt-Oss:20B increased from 4.43, SD 0.05, to 4.51, SD 0.04, with the latter achieving the highest safety score (4.61, SD 0.03). The Gemma3:27B model, which possessed a high baseline performance (4.42, SD 0.03), was an exception, exhibiting no significant improvement (P=.14) even with filtering.</p><p><strong>Conclusions: </strong>The effectiveness of RAG for specialized domains such as Korean medical consultation is highly dependent on a metadata filtering process that controls the quality of retrieved information; simple information augmentation is insufficient. Furthermore, the benefit of RAG is limited when a model's intrinsic knowledge (eg, Gemma3:27B) already meets or exceeds the quality of the external knowledge base. This finding indicates that performance enhancement strategies must account for both the retrieval mechanism's quality and the model's preexisting capabilities.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e72604"},"PeriodicalIF":2.0,"publicationDate":"2026-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13132483/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
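The abstract above contrasts plain retrieval-augmented generation with retrieval that first applies a metadata filter. The sketch below illustrates the general idea only: candidates are restricted by metadata before similarity ranking. It is not the study's pipeline, which used an Elasticsearch store and the upskyy/bge-m3-korean embedding model; the in-memory list, the two-dimensional toy vectors, the `disease` field, and the `retrieve` function are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, segments, top_k=2, **required_meta):
    """Rank segments by similarity, keeping only those whose metadata matches.

    With no keyword filters this is plain similarity retrieval; with filters,
    off-topic segments are dropped before ranking (hypothetical schema).
    """
    candidates = [
        seg for seg in segments
        if all(seg["meta"].get(key) == value for key, value in required_meta.items())
    ]
    ranked = sorted(candidates, key=lambda seg: cosine(query_vec, seg["vec"]), reverse=True)
    return ranked[:top_k]


# Toy corpus: vectors and metadata are made up for the example.
SEGMENTS = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"disease": "asthma"}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"disease": "diabetes"}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"disease": "asthma"}},
]

unfiltered_ids = [seg["id"] for seg in retrieve([1.0, 0.0], SEGMENTS)]
filtered_ids = [seg["id"] for seg in retrieve([1.0, 0.0], SEGMENTS, disease="asthma")]
```

Without the filter, the diabetes segment ranks second because its vector happens to lie close to the query; with the filter it never reaches the prompt, which is the kind of retrieval quality control the conclusions describe.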
{"title":"Gamification of Cognitive Behavioral Therapy Homework: Therapist Concept Mapping Approach.","authors":"Ho Ming Lau, Patricia van Oppen, Heleen Riper","doi":"10.2196/50923","DOIUrl":"https://doi.org/10.2196/50923","url":null,"abstract":"<p><strong>Background: </strong>Greater homework adherence in cognitive behavioral therapy (CBT) is associated with positive treatment outcomes. However, problems with CBT homework are common and affect adherence. In recent years, gamification has been explored to increase intervention adherence, but not yet in relation specifically to homework assignments.</p><p><strong>Objective: </strong>In this study, the aim was to gain a better understanding of obstacles to CBT homework and the use of gamification to overcome these.</p><p><strong>Methods: </strong>Concept mapping, a method to organize related information visually, was used in this study. For the 1-day face-to-face concept mapping session, 7 therapists (aged 32 to 55 years; 6 female) participated and generated items based on 2 focal questions of interest. The generated items were grouped on perceived similarity, and each individual item was rated on (1) severity and difficulty (focal question 1) and (2) importance, acceptance by therapist, and acceptance by patient (focal question 2). The item groups on perceived similarity were inserted into computer software. Based on multidimensional scaling and hierarchical cluster analyses, item clusters were generated by the computer software and were presented to the therapists. The therapists were asked for their preference for the number of items a cluster should contain.</p><p><strong>Results: </strong>Through brainstorming, the therapists collectively generated a list of 29 possible reasons why patients do not do homework. In the same manner, a list of 38 game design elements that could help patients do CBT homework was generated. 
External factors (eg, no time due to crisis situations) and lack of motivation (eg, not aspiring to a therapy goal) were perceived as the most important reasons for patients not to do homework. External and symptoms-unrelated internal factors were considered by therapists as the most difficult for patients to change for improved homework adherence. The game design elements facilitation and rewards were rated as most important to help patients do homework. These elements were also seen as most accepted by therapists.</p><p><strong>Conclusions: </strong>Facilitating homework and offering rewards seem to have the potential to address some of the external factors and the lack of motivation that can keep patients from doing CBT homework. Conclusions were limited by the small number of participating therapists. Future research is needed on the effects of specific game design elements, the number of these elements, their combinations, and patients' preferences.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e50923"},"PeriodicalIF":2.0,"publicationDate":"2026-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13132486/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haleigh Noelle West-Page, Kevin McGoff, Harrison Latimer, Isaac Olufadewa, Shi Chen
{"title":"Evaluating Biomedical Feature Fusion on Machine Learning's Predictability and Interpretability of COVID-19 Severity Types: Model Development, Interpretation, and Validation.","authors":"Haleigh Noelle West-Page, Kevin McGoff, Harrison Latimer, Isaac Olufadewa, Shi Chen","doi":"10.2196/76542","DOIUrl":"https://doi.org/10.2196/76542","url":null,"abstract":"<p><strong>Background: </strong>Accurately differentiating severe from nonsevere COVID-19 clinical types is critical for the health care system to optimize workflow. Current techniques lack the ability to accurately classify COVID-19 clinical types in patients, especially as SARS-CoV-2 continues to mutate.</p><p><strong>Objective: </strong>We explore the predictability and interpretability of multiple state-of-the-art machine learning (ML) techniques trained and tested under different biomedical data types and SARS-CoV-2 variants.</p><p><strong>Methods: </strong>Comprehensive patient-level data were collected from 362 patients (severe COVID-19: n=148; nonsevere COVID-19: n=214) infected with the original SARS-CoV-2 strain in 2020 and 1000 patients (severe COVID-19: n=500; nonsevere COVID-19: n=500) infected with the Omicron variant in 2022-2023. The data included 26 biochemical features from blood testing and 26 clinical features from patients' clinical characteristics and medical history. Different ML techniques, including penalized logistic regression, random forest, k-nearest neighbors, and support vector machines, were applied to build predictive classification models based on each data modality separately and together for each variant. Fifty randomized train-test splits were conducted per scenario, and performance results were recorded.</p><p><strong>Results: </strong>The fusion (hybrid) characteristic modality yielded the highest mean area under the curve (AUC) in this study, achieving 0.915, while the biochemical and clinical modalities had AUCs of 0.862 and 0.818, respectively. 
All ML models performed similarly under different testing scenarios and were consistent when cross-tested with data of patients infected with the original strain and those infected with the Omicron variant. Our models ranked elevated d-dimer (biochemical), elevated high sensitivity troponin I (biochemical), and age greater than 55 years (clinical) as the most positively predictive features of severe COVID-19.</p><p><strong>Conclusions: </strong>These results are compatible with the hypothesis that ML is a useful tool for predicting severe COVID-19 based on comprehensive individual patient-level data. Further, ML models trained on the biochemical and clinical modalities together show patterns consistent with enhanced predictive performance. The improved performance observed with Omicron variant data agrees with the hypothesis that ML approaches may retain utility across variants in this study setting, although further validation is required before clinical application. Future work using larger datasets with more ethnic variation and investigating unbiased ML interpretation methods may be able to provide further validation.</p>","PeriodicalId":14841,"journal":{"name":"JMIR Formative Research","volume":"10 ","pages":"e76542"},"PeriodicalIF":2.0,"publicationDate":"2026-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13132021/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}