Recovery in personality disorders: the development and preliminary testing of a novel natural language processing model to identify recovery in mental health electronic records.
Giouliana Kadra-Scalzo, Jaya Chaturvedi, Oliver Dale, Richard D Hayes, Lifang Li, Shaza Mahmood, Jonathan Monk-Cunliffe, Angus Roberts, Paul Moran
Frontiers in Digital Health, volume 7, article 1544781. Published 2025-04-03. DOI: 10.3389/fdgth.2025.1544781 (https://doi.org/10.3389/fdgth.2025.1544781). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12003297/pdf/
Abstract
Introduction: The concept of recovery is of great importance in mental health as it emphasizes improvements in quality of life and functioning alongside the traditional focus on symptomatic remission. Yet, investigating non-symptomatic recovery in the field of personality disorders has been particularly challenging due to complexities in capturing the occurrence of recovery. Electronic health records (EHRs) provide a robust platform from which episodes of recovery can be detected. However, much of the relevant information may be embedded in free-text clinical notes, requiring the development of appropriate tools to extract these data.
Methods: Using data from one of Europe's largest electronic health records databases [the Clinical Records Interactive Search (CRIS)], we developed and evaluated natural language processing (NLP) models for the identification of occupational and activities of daily living (ADL) recovery among individuals diagnosed with personality disorder.
Results: The ADL models performed better (precision: 0.80; 95% CI: 0.73-0.84) than the occupational recovery models (precision: 0.62; 95% CI: 0.52-0.72). However, recall was weaker: the models generally missed at least 50% of those who had recovered.
Conclusion: It is feasible to develop NLP models for the identification of recovery domains for individuals with a diagnosis of personality disorder. Future research needs to improve the efficiency of pre-processing strategies to handle long clinical documents.
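The precision and recall figures reported in the Results can be illustrated with a short calculation. The sketch below is not the authors' evaluation code. The counts are hypothetical, and the Wilson score interval is an assumed choice, since the abstract does not state how its confidence intervals were computed.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical counts for illustration only; these are NOT taken from the study.
tp, fp, fn = 80, 20, 100

precision, recall = precision_recall(tp, fp, fn)
low, high = wilson_interval(tp, tp + fp)

print(f"precision = {precision:.2f} (95% CI: {low:.2f}-{high:.2f})")
print(f"recall    = {recall:.2f}")
```

With counts like these, a model can be right about most of the cases it flags (high precision) while still missing more than half of the people who recovered (low recall), which is the pattern the Results describe.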