Natural Language Processing for Substance Use Disorder Information Extraction: A Systematic Literature Review.

IF 4.6 | Zone 2 (Medicine) | Q1 SUBSTANCE ABUSE

Current Addiction Reports Pub Date : 2026-01-01 Epub Date: 2026-04-11 DOI:10.1007/s40429-026-00733-3

Ransom J Wyse, David C Samuels, Sandra Sanchez-Roige, Lori Schirle, Bethany A Rhoten, Seo Yoon Lee, Alvin D Jeffery

Current Addiction Reports, volume 13, issue 1, p. 34. Publication type: Journal Article. Link: https://doi.org/10.1007/s40429-026-00733-3 PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13070045/pdf/

Citations: 0

Abstract


Natural Language Processing for Substance Use Disorder Information Extraction: A Systematic Literature Review.

Purpose of review: To examine the use of natural language processing (NLP) for substance use disorder (SUD) information extraction.

Recent findings: 623 studies were reviewed, of which 35 met inclusion criteria. One paper (2.9%) was alcohol-related, 12 (34.3%) were opioid-related, 6 (17.1%) were tobacco-related, and 16 (45.7%) included multiple SUDs. Of the three types of NLP categorized for this analysis, 65.7% followed a Rule-Based approach, 37.1% followed a Machine-Learning approach, and 11.4% followed a Deep-Learning approach. NLP methods were categorized into three groups, with 43% as "Most common use" (e.g., concept extraction), 20-35% as "Regular use" (e.g., regular expressions), and < 10% as "Rare use" (e.g., sentiment analysis). Various software applications were used in each included paper, with Python leading (10 papers), followed by cTAKES (9 papers), NegEx (6 papers), R (4 papers), and others. Multiple evaluation metrics were used in each included paper; Multiple SUDs (6 papers) utilized a comparison of F1 scores and ROC AUC, followed by Tobacco (4 papers), Opioids (3 papers), and Alcohol (1 paper), each with acceptable-to-outstanding ROC AUC scores (≥ 0.7) and good-to-excellent F1 scores (≥ 0.7).
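The reported shares can be checked with simple arithmetic. The sketch below recomputes the per-substance percentages from the paper counts given above. The paper counts per NLP approach are not stated in the abstract; they are back-calculated here from the percentages as an assumption. Under that assumption the approach counts add up to more than 35, which suggests that some papers were counted under more than one approach.

```python
# Paper counts per substance category, as reported in the abstract.
counts = {"alcohol": 1, "opioid": 12, "tobacco": 6, "multiple SUDs": 16}
total = sum(counts.values())  # number of included papers

# Share of included papers per category, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Approach percentages as reported; the paper counts below are an
# inference (percentage x total), not figures stated in the source.
approach_pct = {"rule-based": 65.7, "machine-learning": 37.1, "deep-learning": 11.4}
approach_papers = {name: round(pct * total / 100) for name, pct in approach_pct.items()}

print(total, shares)
print(approach_papers, sum(approach_papers.values()))
```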

Summary: Most papers included in this systematic review encompassed multiple SUDs following Rule-Based approaches, "Most common use" NLP methods (e.g., concept extraction), and familiar software applications (e.g., Python). Evaluation metrics for SUD papers utilizing NLP included common performance metrics, with ROC AUC and F1 scores achieving acceptable-to-outstanding discrimination between classes and good-to-excellent balance between precision and recall, respectively. The future direction of NLP for SUD information extraction could make use of Machine- or Deep-Learning approaches, advanced methods including Regular expressions or Sentiment analysis, and/or advanced software packages designed specifically for NLP endeavors, to better inform public health research and clinical decision making.
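To make the Rule-Based approach and the F1 score concrete, here is a minimal Python sketch. It is not taken from any of the reviewed papers. The term list, the negation cues, the 30-character look-back window, and the example notes are all hypothetical, and the negation check is a simplified stand-in, not the NegEx algorithm itself.

```python
import re

# Hypothetical trigger terms; a real lexicon would be larger and validated.
OPIOID_TERMS = re.compile(r"\b(opioid|heroin|oxycodone|fentanyl)\b", re.IGNORECASE)
# Simplified negation cues (an assumption, not the NegEx rule set).
NEGATION_CUES = re.compile(r"\b(no|denies|denied|without|negative for)\b", re.IGNORECASE)

def mentions_opioid_use(note: str) -> bool:
    """Return True if any opioid term appears without a preceding negation cue."""
    for match in OPIOID_TERMS.finditer(note):
        # Look for a negation cue in the 30 characters before the term.
        window = note[max(0, match.start() - 30):match.start()]
        if not NEGATION_CUES.search(window):
            return True
    return False

def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F1 for paired boolean labels."""
    tp = sum(p and g for p, g in zip(predicted, gold))
    fp = sum(p and not g for p, g in zip(predicted, gold))
    fn = sum(g and not p for p, g in zip(predicted, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented example notes with hand-assigned gold labels (patient opioid use).
notes = [
    "Patient reports daily heroin use for two years.",
    "Denies opioid or alcohol use.",
    "History of oxycodone misuse after surgery.",
    "No tobacco use. Family history of diabetes.",
    "Mother has a history of fentanyl overdose.",
]
gold = [True, False, True, False, False]

predicted = [mentions_opioid_use(n) for n in notes]
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(predicted, round(precision, 3), round(recall, 3), round(f1, 3))
```

In this toy set the last note is a false positive, because the rule cannot tell a family member's history from the patient's own. That single error lowers precision while recall stays at 1.0, which shows how F1 balances the two.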


Source journal

Current Addiction Reports Psychology-Clinical Psychology

CiteScore

6.60

Self-citation rate

4.70%


About the journal: This journal focuses on the prevention, assessment and diagnosis, and treatment of addiction. Designed for physicians and other mental health professionals who need to keep up-to-date with the latest research, Current Addiction Reports offers expert reviews on the most recent and important research in addiction. We accomplish this by appointing leaders in the field to serve as Section Editors in key subject areas and disciplines, such as: alcohol; tobacco; stimulants, cannabis, and club drugs; behavioral addictions; gender disparities in addiction; comorbid psychiatric disorders and addiction; and substance abuse disorders and HIV. Section Editors, in turn, select the most pressing topics as well as experts to evaluate the latest research, report on any controversial discoveries or hypotheses of interest, and ultimately bring readers up-to-date on the topic. Articles represent interdisciplinary endeavors with research from fields such as psychiatry, psychology, pharmacology, epidemiology, and neuroscience. Additionally, an international Editorial Board, representing a range of disciplines within addiction medicine, ensures that the journal content includes current, emerging research and suggests articles of special interest to their country or region.