Xintong Ju, Jake Solka, Katherine Weber, VG Vinod Vydiswaran, Lewei Allison Lin, Erin E. Bonar, Anne C. Fernandez
Drug and Alcohol Dependence, volume 277, Article 112920. Published 2025-10-10. DOI: 10.1016/j.drugalcdep.2025.112920. https://www.sciencedirect.com/science/article/pii/S0376871625003734
Unhealthy alcohol use detection in electronic health records: A comparative study using natural language processing
Background
Unhealthy alcohol use, including risky alcohol use and alcohol use disorder (AUD), is under-identified in primary care settings. Natural Language Processing (NLP) is a promising approach that could identify unhealthy alcohol use from clinical notes even when structured data (SD) indicators are lacking. This study prospectively evaluated the performance of SD and NLP in identifying unhealthy alcohol use in primary care patients.
Methods
We extracted electronic health record (EHR) data for primary care patients at a large Midwestern health system (N = 133,144) and applied two identification approaches: an SD approach (i.e., diagnostic codes and alcohol screening scores) and an NLP-based approach. We then recruited 170 participants identified by SD (N = 85) or NLP (N = 85) to complete gold-standard self-report measures and compared the number of positive cases identified by each method.
Results
In the full EHR sample, SD identified 820 cases of unhealthy alcohol use and NLP identified 48,262. Among participants identified by SD, 41.18% reported AUD and 28.82% reported risky alcohol use. Among those identified by NLP, 20% reported AUD and 27.06% reported risky alcohol use. Participants identified by SD had more AUD symptoms and mental health difficulties.
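To put these counts on a common scale, the short sketch below restates them as shares of the full EHR sample and as combined self-report confirmation rates. It uses only the figures quoted above. The helper name `pct` is illustrative, and summing the AUD and risky-use percentages assumes the two categories are mutually exclusive, which the abstract does not state explicitly.

```python
# Figures copied from the Methods and Results text; nothing here is new data.
EHR_SAMPLE = 133_144   # primary care patients in the EHR extract
SD_CASES = 820         # flagged by structured data (codes, screening scores)
NLP_CASES = 48_262     # flagged by the NLP-based approach


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of the full EHR sample flagged by each approach.
sd_rate = pct(SD_CASES, EHR_SAMPLE)
nlp_rate = pct(NLP_CASES, EHR_SAMPLE)

# How many times more patients NLP flagged than SD.
nlp_to_sd_ratio = round(NLP_CASES / SD_CASES, 1)

# ASSUMPTION: AUD and risky alcohol use are mutually exclusive categories,
# so their percentages can be added to get "any unhealthy alcohol use".
sd_confirmed = round(41.18 + 28.82, 2)
nlp_confirmed = round(20.00 + 27.06, 2)

if __name__ == "__main__":
    print(f"SD flagged {sd_rate}% of the EHR sample")
    print(f"NLP flagged {nlp_rate}% of the EHR sample")
    print(f"NLP flagged about {nlp_to_sd_ratio}x as many patients as SD")
    print(f"Self-report confirmed (SD group): {sd_confirmed}%")
    print(f"Self-report confirmed (NLP group): {nlp_confirmed}%")
```

Read this way, SD flags well under 1% of the sample with a higher confirmation rate, while NLP flags roughly a third of the sample with a lower one, which is consistent with the abstract's framing of NLP as a complement to SD rather than a replacement.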
Conclusions
NLP identified many primary care patients with indicators of unhealthy alcohol use that SD missed, indicating that NLP could substantially expand identification of unhealthy alcohol use in primary care populations, particularly among patients with lower-severity alcohol use disorder. NLP could complement traditional screening methods for comprehensive detection of unhealthy alcohol use.
About the journal:
Drug and Alcohol Dependence is an international journal devoted to publishing original research, scholarly reviews, commentaries, and policy analyses in the area of drug, alcohol and tobacco use and dependence. Articles range from studies of the chemistry of substances of abuse, their actions at molecular and cellular sites, and in vitro and in vivo investigations of their biochemical, pharmacological and behavioural actions, through laboratory-based and clinical research in humans and substance abuse treatment and prevention research, to studies employing methods from epidemiology, sociology, and economics.