{"title":"Clinical Efficacy, Therapeutic Mechanisms, and Implementation Features of Cognitive Behavioral Therapy-Based Chatbots for Depression and Anxiety: Narrative Review.","authors":"Chang-Ha Im, Minjung Woo","doi":"10.2196/78340","DOIUrl":"10.2196/78340","url":null,"abstract":"<p><strong>Background: </strong>Cognitive behavioral therapy (CBT)-based chatbots, many of which incorporate artificial intelligence (AI) techniques, such as natural language processing and machine learning, are increasingly evaluated as scalable solutions for addressing mental health issues, such as depression and anxiety. These fully automated or minimally supported interventions offer novel pathways for psychological support, especially for individuals with limited access to traditional therapy.</p><p><strong>Objective: </strong>This narrative review synthesized evidence on the clinical efficacy, therapeutic mechanisms, and technological features of CBT-based chatbots designed to alleviate depressive and anxiety symptoms.</p><p><strong>Methods: </strong>Fourteen peer-reviewed studies published between January 2015 and March 2025 were identified through systematic searches and met predefined inclusion criteria. The studies were analyzed to extract information on intervention structure, therapeutic components, outcomes, and implementation characteristics.</p><p><strong>Results: </strong>Across the included studies, CBT-based chatbots consistently demonstrated short-term reductions in depressive symptoms, whereas findings for anxiety outcomes were mixed, with some studies reporting improvements and others showing nonsignificant or unreported effects. Moderate effect sizes were observed for depression. Reported therapeutic features included cognitive restructuring, behavioral activation, relaxation and mindfulness strategies, emotional support, self-monitoring and feedback, and therapeutic alliance. 
Technological characteristics such as real-time feedback and adaptive goal tracking were associated with enhanced engagement and adherence.</p><p><strong>Conclusions: </strong>CBT-based chatbots appear to be a promising and scalable modality for delivering psychological support, particularly for underserved populations. However, variability in study designs, heterogeneity of outcome reporting, and limited long-term evidence pose challenges for generalizability. Emerging evidence from generative AI chatbots (eg, Therabot and Limbic Care) highlights both opportunities and risks. Future work should examine long-term efficacy, adaptive personalization, cross-cultural adaptation, and rigorous ethical oversight.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e78340"},"PeriodicalIF":5.8,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12669916/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145656011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating Digital Mental Health: The Society of Digital Psychiatry's Three-Pronged Road Map for Education, Digital Navigators, and AI.","authors":"John Torous, Kathryn Taylor Ledley, Carla Gorban, Gillian Strudwick, Julian Schwarz, Soumya Choudhary, Margaret Emerson, Michelle Patriquin, Allison Dempsey, Jason Bantjes, Laura Ospina-Pinillos, Jennie Hornick, Shruti Kochhar","doi":"10.2196/84501","DOIUrl":"10.2196/84501","url":null,"abstract":"<p><strong>Unlabelled: </strong>Digital mental health tools such as apps, virtual reality, and artificial intelligence (AI) hold great promise but continue to face barriers to widespread clinical adoption. The Society of Digital Psychiatry, in partnership with JMIR Mental Health, presents a 3-pronged road map to accelerate their safe, effective, and equitable implementation. First, education: integrate digital psychiatry into core training and professional development through a global webinar series, annual symposium, newsletter, and an updated open-access curriculum addressing AI and the evolving digital navigator role. Second, AI standards: develop transparent, actionable benchmarks and consensus guidance through initiatives like MindBench.ai to assess reasoning, safety, and representativeness across populations. Third, digital navigators: expand structured, train-the-trainer programs that enhance digital literacy, engagement, and workflow integration across diverse care settings, including low- and middle-income countries. 
Together, these pillars bridge research and practice, advancing digital psychiatry grounded in inclusivity, accountability, and measurable clinical impact.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e84501"},"PeriodicalIF":5.8,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12661594/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145641422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seeking Emotional and Mental Health Support From Generative AI: Mixed-Methods Study of ChatGPT User Experiences.","authors":"Xiaochen Luo, Zixuan Wang, Jacqueline L Tilley, Sanjeev Balarajan, Ukeme-Abasi Bassey, Choi Ieng Cheang","doi":"10.2196/77951","DOIUrl":"10.2196/77951","url":null,"abstract":"<p><strong>Background: </strong>Generative artificial intelligence (GenAI) models have emerged as a promising yet controversial tool for mental health.</p><p><strong>Objective: </strong>The purpose of this study is to understand the experiences of individuals who repeatedly used ChatGPT (GenAI) for emotional and mental health support (EMS).</p><p><strong>Methods: </strong>We recruited 270 adult participants across 29 countries who regularly used ChatGPT (OpenAI) for EMS during April 2024. Participants responded to quantitative survey questions on the frequency and helpfulness of using ChatGPT for EMS, and qualitative questions regarding their therapeutic purposes, emotional experiences of using, and perceived helpfulness and rationales. Thematic analysis was used to analyze qualitative data.</p><p><strong>Results: </strong>Most participants reported using ChatGPT for EMS at least 1-2 times per month for purposes spanning traditional mental health needs (diagnosis, treatment, and psychoeducation) and general psychosocial needs (companionship, relational guidance, well-being improvement, and decision-making). Users reported various emotional experiences during and after use for EMS (eg, connected, relieved, curious, embarrassed, or disappointed). Almost all users found it at least somewhat helpful. 
The rationales for perceived helpfulness include perceived changes after use, emotional support, professionalism, information quality, and free expression, whereas the unhelpful aspects include superficial emotional engagement, limited information quality, and lack of professionalism.</p><p><strong>Conclusions: </strong>Despite the absence of ethical regulations for EMS use, GenAI is becoming an increasingly popular self-help tool for emotional and mental health support. These results highlight the blurring boundary between formal mental health care and informal self-help and underscore the importance of understanding the relational and emotional dynamics of human-GenAI interaction. There is an urgent need to promote AI literacy and ethical awareness among community users and health care providers and to clarify the conditions under which GenAI use for mental health promotes well-being or poses risk.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e77951"},"PeriodicalIF":5.8,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12661908/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145641486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pain Cues in People With Dementia: Scoping Review.","authors":"Urška Smrke, Ana Milošič, Izidor Mlakar, Matic Kadiš, Satja Mulej Bratec","doi":"10.2196/75671","DOIUrl":"10.2196/75671","url":null,"abstract":"<p><strong>Background: </strong>Individuals with dementia, especially those in later stages, have difficulties with verbally reporting their experience of pain. This results in both underassessment and undertreatment of pain, signaling the need for better pain recognition in persons with dementia. A promising form of pain assessment is digital monitoring, which can concurrently and more objectively detect and use numerous relevant pain cues.</p><p><strong>Objective: </strong>This review aimed to identify observable cues of pain, which could be used for digital pain monitoring. A total of 2 research questions (RQs) were formed as we set out to examine which digital cues offered a valid insight into pain in people with dementia (RQ1) and identify how these cues were originally measured (RQ2).</p><p><strong>Methods: </strong>A standard methodological approach for scoping reviews was used. Relevant research papers were chosen based on SCOPUS and Web of Science databases, and relevant data on pain cues were extracted from all papers that satisfied the inclusion criteria. The gathered data were analyzed using a thematic analysis, which involved categorizing the observable cues into higher-order categories.</p><p><strong>Results: </strong>Of the 3705 publications identified in the search, 34 satisfied the inclusion criteria and were closely examined. Addressing RQ1, we identified 7 categories of behavioral and physiological cues associated with pain, most frequently facial expressions (20/34, 59%) and body movements or expressions (15/34, 44%). Several subcategories for each main category of pain cues were also identified, each involving between 1 and 28 relevant specific pain cues. 
Addressing RQ2, 29/34 (85%) studies assessed pain cues via human observation only, while 5/34 (15%) combined human observation with either facial recognition software, PainChek app, or computer vision.</p><p><strong>Conclusions: </strong>The review provides a comprehensive list of the most relevant cues that signify pain in persons with dementia and offers a foundation for the use of artificial intelligence and digital monitoring for the screening of pain in dementia.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e75671"},"PeriodicalIF":5.8,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12661616/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145641463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Associations Between Both Smartphone Addiction and Objectively Measured Smartphone Use and Sleep Quality and Duration Among University Students: Cross-Sectional Study.","authors":"Jian Yin, Xuanyi Tang, Zeshi Liu, Yangyang Gong, Hui Yang, Yanping Zhang","doi":"10.2196/77796","DOIUrl":"10.2196/77796","url":null,"abstract":"<p><strong>Background: </strong>The impact of smartphone use on sleep remains intensely debated. Most existing studies have used self-reported smartphone use data. Moreover, few studies have simultaneously examined associations between both smartphone addiction and objectively measured smartphone use and sleep, and the dose-response relationship between smartphone use and risk of poor sleep has been consistently overlooked, which calls for further systematic research on this topic.</p><p><strong>Objective: </strong>This study aimed to examine the associations between smartphone addiction and objectively measured smartphone use and sleep quality and duration.</p><p><strong>Methods: </strong>This cross-sectional study enrolled 17,713 participants from a university in China. We assessed objective smartphone screen time and unlocks by collecting screenshots of use records and measured smartphone addiction using a validated questionnaire. Sleep quality and duration were estimated via the Pittsburgh Sleep Quality Index. Binary logistic regression, linear regression, and restricted cubic spline regression models were used for the analyses.</p><p><strong>Results: </strong>A total of 14.3% (2533/17,713) of the participants met the criterion for poor sleep, with a mean sleep duration of 507.1 (SD 103.2) minutes per night. Notably, university students with smartphone addiction exhibited 184% higher odds of poor sleep (odds ratio [OR] 2.84, 95% CI 2.59-3.11) and a 15.47-minute-shorter nighttime sleep duration (β=-15.47, 95% CI -18.53 to -12.42) compared to those without smartphone addiction. 
Regarding objectively measured smartphone use, participants with ≥63 hours per week of smartphone screen time had 22% higher odds of poor sleep (OR 1.22, 95% CI 1.08-1.37) and a 6.66-minute-shorter nighttime sleep duration (β=-6.66, 95% CI -10.19 to -3.13) compared to those with 0 to 21 hours of screen time per week, whereas those with approximately 21 to 42 hours per week of smartphone screen time had a 5.47-minute-longer nighttime sleep duration (β=5.47, 95% CI 1.28-9.65). Similarly, compared to those with 0 to 50 smartphone unlocks per week, participants with ≥400 smartphone unlocks per week showed 61% higher odds of poor sleep (OR 1.61, 95% CI 1.41-1.85) accompanied by a 4.09-minute-shorter nighttime sleep duration (β=-4.09, 95% CI -8.08 to -0.09), whereas those with approximately 50 to 150 smartphone unlocks per week had a 5.84-minute-longer sleep duration (β=5.84, 95% CI 2.32-9.36). An inverted U-shaped association between smartphone screen time and sleep duration was observed (P<.001 for nonlinearity).</p><p><strong>Conclusions: </strong>Smartphone addiction and excessive objectively measured smartphone screen time and unlocks are associated with poorer sleep quality and shorter sleep duration. 
Restricted cubic spline analyses revealed different nuanced dose-response relationships, with an inverted U-shaped association observed between smartphone screen time and sleep dura","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e77796"},"PeriodicalIF":5.8,"publicationDate":"2025-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12646561/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145607327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
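The "% higher odds" figures in the abstract above follow from the reported odds ratios as (OR - 1) x 100. A minimal sketch of that arithmetic, using only numbers quoted in the abstract; the helper name `percent_higher_odds` is hypothetical and introduced here purely for illustration.

```python
def percent_higher_odds(odds_ratio: float) -> float:
    """Convert an odds ratio to the percentage increase in odds over the reference group."""
    return (odds_ratio - 1.0) * 100.0

# Odds ratios for poor sleep as reported in the abstract (each vs its reference group).
reported = {
    "smartphone addiction": 2.84,
    "screen time >=63 h/week": 1.22,
    "unlocks >=400/week": 1.61,
}

percent_by_exposure = {
    name: round(percent_higher_odds(odds_ratio)) for name, odds_ratio in reported.items()
}

# Share of participants meeting the poor-sleep criterion (2533 of 17,713).
poor_sleep_share = round(2533 / 17713 * 100, 1)

for name, pct in percent_by_exposure.items():
    print(f"{name}: {pct}% higher odds")
print(f"poor sleep: {poor_sleep_share}%")
```

An odds ratio describes odds rather than risk; the two diverge as the outcome becomes more common, so these percentages should not be read as relative risk increases.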
{"title":"Effectiveness of Text Messages and Text Messages Plus Peer Support on Psychiatric Readmission and Length of Stay: Outcomes From a Quantitative Stepped-Wedge Cluster Randomized Trial.","authors":"Vincent Israel Opoku Agyapong, Reham Shalaby, Belinda Agyapong, Wanying Mao, Ernest Owusu, Hossam Eldin Elgendy, Ejemai Eboreime, Peter H Silverstone, Pierre Chue, Xin-Min Li, Wesley Vuong, Arto Ohinmaa, Frank MacMaster, Andrew J Greenshaw","doi":"10.2196/81760","DOIUrl":"10.2196/81760","url":null,"abstract":"<p><strong>Background: </strong>Mental health recovery typically continues after patients leave the hospital. However, hospital readmission in the 12 months after discharge is common and costly.</p><p><strong>Objective: </strong>This study aimed to examine the effectiveness of supportive text messaging (hereinafter \"SMS\") and SMS with or without peer support service on hospital readmission and length of stay after discharge from inpatient psychiatric care.</p><p><strong>Methods: </strong>A stepped-wedge cluster randomized trial was used to examine differences in the changes in the mean number of admissions and the mean duration of total length of stay in days, for patients discharged from psychiatric inpatient care, at 6 and 12 months pre- and post index admissions, for 2 intervention periods compared to a control period of treatment as usual.</p><p><strong>Results: </strong>Overall, 1070 participants were assigned to 1 of 3 study arms: SMS (n=302), SMS with or without peer support service (n=342), or treatment as usual (n=426). 
Compared to treatment as usual, SMS with or without peer support service reduced hospital readmissions 6 months pre- and post index admission by an average of 0.26 admissions, and SMS alone reduced inpatient length of stays 6 months pre- and post index admission by an average of 7.28 days.</p><p><strong>Conclusions: </strong>Our results demonstrate that simple, low-cost digital tools, either by themselves or paired with peer support, can help close gaps in postdischarge care. We anticipate that these findings may inform future service delivery models and policy development aimed at enhancing postdischarge mental health support. By supporting smoother transitions and reducing future hospital use, such approaches may offer a scalable way to build more sustainable and person-centered mental health systems.</p><p><strong>Trial registration: </strong>ClinicalTrials.gov NCT05133726; https://clinicaltrials.gov/study/NCT05133726.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":" ","pages":"e81760"},"PeriodicalIF":5.8,"publicationDate":"2025-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12673307/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145313984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Telemedicine in Eating Disorder Treatment: Systematic Review.","authors":"Juan Antonio Blasco Amaro, Agnieszka Dobrzynska, Rebeca Isabel-Gómez, Enrique Enrique Perez-Ostos, Eva María Venegas Moreno","doi":"10.2196/74057","DOIUrl":"10.2196/74057","url":null,"abstract":"<p><strong>Background: </strong>Telemedicine has emerged as a promising tool to enhance adherence and monitoring in patients with eating disorders (EDs). Traditional face-to-face cognitive therapies remain the gold standard; however, integrating telemedicine may provide additional support and improve patient engagement and retention. Given the increasing use of digital health interventions, it is crucial to assess their safety and effectiveness in complementing conventional treatments.</p><p><strong>Objective: </strong>We aimed to evaluate the safety and effectiveness of telemedicine as a complementary tool for cognitive face-to-face therapies to promote adherence and monitoring of patients with EDs.</p><p><strong>Methods: </strong>We consulted the National Institute for Health and Care Excellence, the Canadian Agency for Drugs and Technologies in Health (now known as Canada's Drug Agency), MEDLINE (Ovid), Embase, Web of Science, Cochrane Library, international HTA database (International Network of Agencies for Health Technology Assessment), CINAHL (EBSCO), and PsycINFO (EBSCO) websites and databases in December 2024 to identify eligible systematic reviews, synthesis reports, or meta-analyses that address telemedicine as a complementary therapy to face-to-face care in patients with EDs. Two researchers performed an independent critical reading of the systematic reviews and assessed the risk of bias using AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews, version 2).</p><p><strong>Results: </strong>We initially identified 1004 studies, but only 5 (0.5%) systematic reviews met the inclusion criteria. 
Email, vodcasts, smartphone apps, and SMS text messaging were the principal telemedicine channels. Telemedicine interventions were safe, helpful, and motivating; improved retention rates and patient-physician communication; and reduced ED symptoms.</p><p><strong>Conclusions: </strong>Telemedicine interventions showed promising, positive findings as a complementary tool for face-to-face ED treatment that must be interpreted cautiously. The limited number of systematic reviews selected and their moderate to critically low quality underscore the need for further research in this area.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e74057"},"PeriodicalIF":5.8,"publicationDate":"2025-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12670051/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145543324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Digital Phenotypes to Identify Individuals With Alexithymia in Posttraumatic Stress Disorder: Cross-Sectional Study.","authors":"Tomas Meaney, Vijay Yadav, Isaac Galatzer-Levy, Richard Bryant","doi":"10.2196/83575","DOIUrl":"10.2196/83575","url":null,"abstract":"<p><strong>Background: </strong>Alexithymia, defined as difficulty identifying and describing one's emotions, has been identified as a transdiagnostic emotional process that impacts the course, severity, and treatment outcomes of psychiatric conditions such as posttraumatic stress disorder (PTSD). As such, alexithymia is an important process to accurately measure and identify in clinical contexts. However, research identifying the association between the experience of alexithymia and psychopathology has been limited by an overreliance on self-report scales, which have restricted use for measuring constructs that involve deficits in self-awareness, such as alexithymia. Hence, more suitable and effective methods of measuring and identifying those experiencing alexithymia in clinical samples are needed.</p><p><strong>Objective: </strong>In this cross-sectional study, we aimed to determine if facial, vocal, and language phenotypes extracted from 1-minute recordings of war veterans with PTSD describing a traumatic event could be used to identify those experiencing alexithymia.</p><p><strong>Methods: </strong>A total of 96 participants were included in this cross-sectional study. Specialized software was used to extract facial, vocal, and language features from the recordings. 
These features were then integrated into machine learning (extreme gradient boosting [XGBoost]) classification models that were trained and tested within a 5-fold nested cross-validation pipeline for their capacity to classify veterans scoring above the cutoff for alexithymia on the Toronto Alexithymia Scale-20.</p><p><strong>Results: </strong>The best performing XGBoost classification model trained in the nested cross-validation pipeline was able to classify those experiencing alexithymia with a good level of accuracy (average F<sub>1</sub>-score=0.78, SD 0.07; average area under the curve score=0.87, SD 0.12). Consistent with theoretical models and past research into phenotypes of alexithymia, language, vocal, and facial features all contributed to the accuracy of the XGBoost classification model.</p><p><strong>Conclusions: </strong>These findings indicate that facial, vocal, and language phenotypes incorporated in machine learning models could represent a promising alternative to identifying individuals with PTSD who are experiencing alexithymia. The further validation and use of this approach could facilitate more tailored and effective allocation of treatment resources to individuals experiencing alexithymia in clinical settings.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e83575"},"PeriodicalIF":5.8,"publicationDate":"2025-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12661231/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145514794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
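The abstract above describes a 5-fold nested cross-validation pipeline. The sketch below shows only the index bookkeeping that makes cross-validation "nested": hyperparameter tuning folds are drawn from the outer training set alone, so the outer test fold stays untouched. It is not the authors' pipeline. The function names, the inner fold count (`inner_k=3`), the seeds, and the unstratified shuffling are all assumptions made for illustration; only the 96 participants and the 5 folds come from the abstract.

```python
import random

def k_fold_indices(indices, k, seed):
    """Shuffle indices and split them into k folds of near-equal size."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only from
    outer_train, so the outer test fold never influences hyperparameter tuning.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + i + 1)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for n, fold in enumerate(inner_folds) if n != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_train, outer_test, inner_splits

# 96 participants and 5 outer folds are from the abstract; inner_k=3 is an assumption.
splits = list(nested_cv_splits(n_samples=96, outer_k=5, inner_k=3))

for fold_number, (outer_train, outer_test, inner_splits) in enumerate(splits, start=1):
    print(f"outer fold {fold_number}: train={len(outer_train)}, test={len(outer_test)}, "
          f"inner splits={len(inner_splits)}")
```

In a full pipeline, a classifier would be tuned on each set of inner splits, refit on the outer training indices, and scored once on the outer test fold; averaging those outer scores gives figures of the kind reported (mean and SD across folds).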
{"title":"Influence of Topic Familiarity and Prompt Specificity on Citation Fabrication in Mental Health Research Using Large Language Models: Experimental Study.","authors":"Jake Linardon, Hannah K Jarman, Zoe McClure, Cleo Anderson, Claudia Liu, Mariel Messer","doi":"10.2196/80371","DOIUrl":"10.2196/80371","url":null,"abstract":"<p><strong>Background: </strong>Mental health researchers are increasingly using large language models (LLMs) to improve efficiency, yet these tools can generate fabricated but plausible-sounding content (hallucinations). A notable form of hallucination involves fabricated bibliographic citations that cannot be traced to real publications. Although previous studies have explored citation fabrication across disciplines, it remains unclear whether citation accuracy in LLM output systematically varies across topics within the same field that differ in public visibility, scientific maturity, and specialization.</p><p><strong>Objective: </strong>This study aims to examine the frequency and nature of citation fabrication and bibliographic errors in GPT-4o (Omni) outputs when generating literature reviews on mental health topics that varied in public familiarity and scientific maturity. We also tested whether prompt specificity (general vs specialized) influenced fabrication or accuracy rates.</p><p><strong>Methods: </strong>In June 2025, GPT-4o was prompted to generate 6 literature reviews (~2000 words; ≥20 citations) on 3 disorders representing different levels of public awareness and research coverage: major depressive disorder (high), binge eating disorder (moderate), and body dysmorphic disorder (low). Each disorder was reviewed at 2 levels of specificity: a general overview (symptoms, impacts, and treatments) and a specialized review (evidence for digital interventions). All citations were extracted (N=176) and systematically verified using Google Scholar, Scopus, PubMed, WorldCat, and publisher databases. 
Citations were classified as fabricated (no identifiable source), real with errors, or fully accurate. Fabrication and accuracy rates were compared by disorder and review type by using chi-square tests.</p><p><strong>Results: </strong>Across the 6 reviews, GPT-4o generated 176 citations; 35 (19.9%) were fabricated. Among the 141 real citations, 64 (45.4%) contained errors, most frequently incorrect or invalid digital object identifiers. Fabrication rates differed significantly by disorder (χ<sup>2</sup><sub>2</sub>=13.7; P=.001), with higher rates for binge eating disorder (17/60, 28%) and body dysmorphic disorder (14/48, 29%) than for major depressive disorder (4/68, 6%). While fabrication did not differ overall by review type, stratified analyses showed higher fabrication for specialized versus general reviews of binge eating disorder (11/24, 46% vs 6/36, 17%; P=.01). Accuracy rates also varied by disorder (χ<sup>2</sup><sub>2</sub>=11.6; P=.003), being lowest for body dysmorphic disorder (20/34, 59%) and highest for major depressive disorder (41/64, 64%). 
Accuracy rates differed by review type within some disorders, including higher accuracy for general reviews of major depressive disorder (26/34, 77% vs 15/30, 50%; P=.03).</p><p><strong>Conclusions: </strong>Citation fabrication and bibliographic errors remain common in GPT-4o outputs, with ","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e80371"},"PeriodicalIF":5.8,"publicationDate":"2025-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12658395/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145507490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
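The fabrication-by-disorder comparison in the abstract above can be checked from the counts it reports (4/68, 17/60, and 14/48 fabricated citations). The sketch below computes a Pearson chi-square test of independence in plain Python, assuming no continuity correction was applied; the function name `chi_square_independence` is hypothetical. Under that assumption it yields a statistic of about 13.7 on 2 degrees of freedom with P of about .001, in line with the reported values.

```python
import math

# (fabricated, not fabricated) citation counts per disorder, taken from the abstract.
counts = {
    "major depressive disorder": (4, 68 - 4),
    "binge eating disorder": (17, 60 - 17),
    "body dysmorphic disorder": (14, 48 - 14),
}

def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

chi2, dof = chi_square_independence([list(pair) for pair in counts.values()])

# With exactly 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2) if dof == 2 else float("nan")

print(f"chi2({dof}) = {chi2:.1f}, P = {p_value:.3f}")
```

The closed-form P value only holds for 2 degrees of freedom; for other table shapes a statistics library would be needed.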
{"title":"Dashboard Intervention for Tracking Digital Social Media Activity in the Clinical Care of Individuals With Mood and Anxiety Disorders: Randomized Trial.","authors":"Leslie Miller, Tenzin C Lhaksampa, Alex Walker, Carlos Aguirre, Matthew DeCamp, Keith Harrigian, Jennifer Meuchel, Aja M Meyer, Brittany Nesbitt, Sazal Sthapit, Jason Straub, Danielle Virgadamo, Ayah Zirikly, Mark Dredze, Margaret S Chisolm, Peter P Zandi","doi":"10.2196/74212","DOIUrl":"10.2196/74212","url":null,"abstract":"<p><strong>Background: </strong>Digital social activity, defined as interactions on social media and electronic communication platforms, has become increasingly important. Social factors impact mental health and can contribute to depression and anxiety. Therefore, incorporating digital social activity into routine mental health care has the potential to improve outcomes.</p><p><strong>Objective: </strong>This study aimed to compare treatment augmented with an electronic dashboard of patients' digital social activity versus treatment-as-usual on patient-rated symptoms of depression in a randomized trial of patients with mood and anxiety disorders.</p><p><strong>Methods: </strong>We developed a personalized electronic dashboard summarizing a participant's digital social activity. This dashboard, collaboratively discussed during mental health visits, was used to augment clinical care and tested in a randomized trial against treatment-as-usual. Clinicians and patients were recruited from outpatient psychiatry clinics. Patients were eligible if they were 12 years or older and were receiving treatment for a mood or anxiety disorder. Psychiatric symptom measures for depression (primary outcome measure) and anxiety (secondary outcome measure) were obtained at each clinic visit as part of measurement-based standard of care. Baseline and 3-month follow-up assessments included measures of mental health status and therapeutic alliance. 
Collateral information and a clinical action scale were also collected at each visit.</p><p><strong>Results: </strong>A total of 103 patients consented to participate, 97 of whom were randomized to the dashboard arm (n=49) or the treatment-as-usual arm (n=48). There were no differences in psychiatric symptom rating scores or mental health status between the two arms. However, there was a significant increase in the discussion of digital social activity with the intervention, and it did not appear to change patient therapeutic alliance.</p><p><strong>Conclusions: </strong>The incorporation of a personalized electronic dashboard into clinical care was feasible and led to increased discussion of digital social activity, but there was no impact on mental health outcomes.</p>","PeriodicalId":48616,"journal":{"name":"Jmir Mental Health","volume":"12 ","pages":"e74212"},"PeriodicalIF":5.8,"publicationDate":"2025-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12604431/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145490694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}