Sentiment Analysis Using a Large Language Model-Based Approach to Detect Opioids Mixed With Other Substances Via Social Media: Method Development and Validation.

IF 2.3 Q1 HEALTH CARE SCIENCES & SERVICES

JMIR Infodemiology. Pub Date: 2025-06-19. DOI: 10.2196/70525

Muhammad Ahmad, Ildar Batyrshin, Grigori Sidorov

JMIR Infodemiology, volume 5, e70525. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12199843/pdf/

Citations: 0

Abstract

Background: The opioid crisis poses a significant health challenge in the United States, with increasing overdoses and death rates due to opioids mixed with other illicit substances. Various strategies have been developed by federal and local governments and health organizations to address this crisis. One of the most significant objectives is to understand the epidemic through better health surveillance, and machine learning techniques can support this by identifying opioid users at risk of overdose through the analysis of social media data, as many individuals may avoid direct testing but still share their experiences online.

Objective: In this study, we take advantage of recent developments in machine learning that allow for insights into patterns of opioid use and potential risk factors in a less invasive manner using self-reported information available on social platforms.

Methods: This study used YouTube comments posted between December 2020 and March 2024, in which individuals shared their self-reported experiences of opioid drugs mixed with other substances. We manually annotated our dataset into multiclass categories, capturing both the positive effects of opioid use, such as pain relief, euphoria, and relaxation, and negative experiences, including nausea, sadness, and respiratory depression, to provide a comprehensive understanding of the multifaceted impact of opioids. To analyze this sentiment and predict overdose risk, we evaluated 4 state-of-the-art machine learning models, 2 deep learning models, 3 transformer models, and 1 large language model (GPT-3.5 Turbo), with the aim of improving health care response and intervention strategies.
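As a rough illustration of how a chat-style large language model could be asked to label a comment, the sketch below builds a classification prompt and parses the reply. It is not the authors' pipeline: the label set (`positive`, `negative`, `neutral`), the prompt wording, and the helper names `build_prompt` and `parse_label` are all assumptions, since the abstract does not list the actual categories or prompts. The model call itself is deliberately left out.

```python
# Hypothetical label set: the abstract does not name the actual categories.
LABELS = ("positive", "negative", "neutral")


def build_prompt(comment: str, labels=LABELS) -> str:
    """Build a zero-shot classification prompt for one comment (assumed wording)."""
    options = ", ".join(labels)
    return (
        "Classify the sentiment of the following comment about opioid use.\n"
        f"Answer with exactly one of: {options}.\n\n"
        f"Comment: {comment.strip()}\n"
        "Label:"
    )


def parse_label(reply: str, labels=LABELS, default: str = "neutral") -> str:
    """Map a free-text model reply onto one of the known labels."""
    cleaned = reply.strip().lower()
    for label in labels:
        if cleaned.startswith(label):
            return label
    # Fall back to a default when the reply does not start with a known label.
    return default
```

In practice the string returned by `build_prompt` would be sent to the model, and the text of its reply passed to `parse_label`; restricting the answer to a fixed label list keeps the output easy to score against manual annotations.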

Results: Our proposed methodology (GPT-3.5 Turbo) automatically identified sentiment related to the adverse effects of opioid drug combinations and high-risk drug use in YouTube comments with high precision and accuracy. It achieved the highest F1-score, 0.95, a 3.26% relative improvement over traditional machine learning models such as extreme gradient boosting, which achieved an F1-score of 0.92.
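The reported 3.26% figure is consistent with a relative (not absolute) difference between the two F1-scores. The short sketch below shows that arithmetic, together with the standard definition of F1 as the harmonic mean of precision and recall. The function names are illustrative, and the abstract does not state which F1 averaging scheme (for example macro or weighted) was used.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_improvement(baseline: float, new: float) -> float:
    """Percentage improvement of `new` over `baseline`, relative to `baseline`."""
    return (new - baseline) / baseline * 100


# F1-scores quoted in the abstract: extreme gradient boosting vs GPT-3.5 Turbo.
gain = relative_improvement(0.92, 0.95)
print(f"{gain:.2f}%")  # prints 3.26%
```

The absolute gap between the two scores is 0.03; dividing by the baseline of 0.92 gives the 3.26% quoted in the abstract.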

Conclusions: This study demonstrates the potential of leveraging machine learning and large language models, such as GPT-3.5 Turbo, to analyze public sentiment surrounding opioid use and its associated risks. By using YouTube comments as a rich source of self-reported data, the study provides valuable insights into both the positive and negative effects of opioids, particularly when mixed with other substances. The proposed methodology significantly outperformed traditional models, contributing to more accurate predictions of overdose risks and enhancing health care responses to the opioid crisis.




Source journal

JMIR Infodemiology

CiteScore

4.80

Self-citation rate

0.00%

Articles published