Using large language models to detect outcomes in qualitative studies of adolescent depression
Alison W Xin, Dylan M Nielson, Karolin Rose Krause, Guilherme Fiorini, Nick Midgley, Francisco Pereira, Juan Antonio Lossio-Ventura
Journal of the American Medical Informatics Association, published 2024-12-11. DOI: 10.1093/jamia/ocae298
Abstract
Objective: We aim to use large language models (LLMs) to detect mentions of more nuanced psychotherapeutic outcomes and impacts than previously considered in transcripts of interviews with adolescents with depression. Our clinical authors previously created a novel coding framework of fine-grained therapy outcomes that goes beyond binary classification (e.g., depression vs control), based on qualitative analysis embedded within a clinical study of depression. Moreover, we seek to demonstrate that embeddings from LLMs are informative enough to accurately label these experiences.
Materials and methods: Data were drawn from interviews in which text segments were annotated with different outcome labels. Five different open-source LLMs were evaluated for classifying outcomes from the coding framework. Classification experiments were carried out on the original interview transcripts. Furthermore, we repeated those experiments on versions of the data produced by breaking those segments into conversation turns or by keeping only non-interviewer utterances (monologues).
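As an illustration only, the sketch below shows one way an embedding-based outcome classifier of this kind could be set up and scored with AUROC. The encoder name, toy segments, and logistic-regression classifier are assumptions made for the example and are not taken from the study.

```python
# Hedged sketch: embed annotated text segments with an open-source encoder,
# train a simple per-label classifier, and score it with AUROC.
# The encoder, toy data, and classifier are illustrative assumptions,
# not the pipeline reported in the paper.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical annotated segments: (text, binary label for one outcome code,
# e.g., "friendships"). Real data would be the coded interview segments.
segments = [
    ("I started seeing my friends again after the sessions.", 1),
    ("My best friend and I talk much more openly now.", 1),
    ("Hanging out with people from school feels easier.", 1),
    ("I made a few new friends through the football team.", 1),
    ("We mostly talked about scheduling the next appointment.", 0),
    ("My sleep is still all over the place.", 0),
    ("I have been skipping breakfast most days.", 0),
    ("The bus ride to the clinic takes about an hour.", 0),
]
texts, labels = zip(*segments)

# Turn each segment into a fixed-length embedding vector.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed stand-in model
X = encoder.encode(list(texts))

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.3, random_state=0, stratify=labels
)

# One binary classifier per outcome label; logistic regression as a baseline.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
auc = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])
print(f"AUROC for this outcome label: {auc:.2f}")
```

In practice, a procedure like this would be repeated for each outcome label and for each text segmentation (original segments, turns, monologues), which is the shape of the evaluation summarized in the results below.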
Results: We used classification models to predict 31 outcomes and 8 derived labels across 3 different text segmentations. Area under the ROC curve (AUROC) scores ranged from 0.6 to 0.9 for the original segmentation and from 0.7 to 1.0 for the monologue and turn segmentations.
Discussion: LLM-based classification models could identify outcomes important to adolescents, such as friendships or academic and vocational functioning, in text transcripts of patient interviews. By using clinical data, we also aim to generalize better to clinical settings than studies based on public social media data.
Conclusion: Our results demonstrate that fine-grained therapy outcome coding in psychotherapeutic text is feasible, and can be used to support the quantification of important outcomes for downstream uses.
About the journal
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.