How to apply zero-shot learning to text data in substance use research: An overview and tutorial with media data

Impact factor: 5.2. Zone 1 (Medicine). JCR Q1 (Psychiatry).

Addiction, 119(5), 951-959. Pub Date: 2024-01-11. DOI: 10.1111/add.16427

Benjamin Riordan, Abraham Albert Bonela, Zhen He, Aiden Nibali, Dan Anderson-Luxford, Emmanuel Kuntsche


Citations: 0

Abstract

How to apply zero-shot learning to text data in substance use research: An overview and tutorial with media data

A vast amount of media-related text data is generated daily in the form of social media posts, news stories or academic articles. These text data provide opportunities for researchers to analyse and understand how substance-related issues are being discussed. The main methods to analyse large text data (content analyses or specifically trained deep-learning models) require substantial manual annotation and resources. A machine-learning approach called ‘zero-shot learning’ may be quicker, more flexible and require fewer resources. Zero-shot learning uses models trained on large, unlabelled (or weakly labelled) data sets to classify previously unseen data into categories on which the model has not been specifically trained. This means that a pre-existing zero-shot learning model can be used to analyse media-related text data without the need for task-specific annotation or model training. This approach may be particularly important for analysing data that is time critical. This article describes the relatively new concept of zero-shot learning and how it can be applied to text data in substance use research, including a brief practical tutorial.
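
The workflow the abstract describes (an existing model assigns unseen texts to categories it was never specifically trained on) might look roughly like the sketch below. This is not the tutorial from the article. It assumes the Hugging Face `transformers` library and its `zero-shot-classification` pipeline, and the model name `facebook/bart-large-mnli`, the hypothesis wording, the example labels and the 0.5 threshold are all illustrative choices rather than values taken from the source. The helper names `make_hf_scorer` and `label_texts` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Sequence

# A scorer maps (text, candidate labels) to {label: score}.
# Keeping it as a plain callable lets the model be swapped or stubbed.
Scorer = Callable[[str, Sequence[str]], Dict[str, float]]


def make_hf_scorer(model_name: str = "facebook/bart-large-mnli") -> Scorer:
    """Build a scorer backed by a pretrained model (assumed model name).

    The import is done lazily so the rest of this module can be used
    without the transformers library installed.
    """
    from transformers import pipeline

    clf = pipeline("zero-shot-classification", model=model_name)

    def score(text: str, labels: Sequence[str]) -> Dict[str, float]:
        # The hypothesis wording is an assumption; candidate labels are
        # slotted into it and scored against the text.
        out = clf(
            text,
            candidate_labels=list(labels),
            hypothesis_template="This text is about {}.",
        )
        return dict(zip(out["labels"], out["scores"]))

    return score


def label_texts(
    texts: Sequence[str],
    labels: Sequence[str],
    scorer: Scorer,
    threshold: float = 0.5,
) -> List[Dict[str, Optional[object]]]:
    """Assign each text its top-scoring label, or None below the threshold."""
    if not labels:
        raise ValueError("at least one candidate label is required")
    results = []
    for text in texts:
        scores = scorer(text, labels)
        best = max(scores, key=scores.get)
        top = scores[best]
        results.append(
            {"text": text, "label": best if top >= threshold else None, "score": top}
        )
    return results


# Example use (downloads the model on first run, so it is left commented out):
# scorer = make_hf_scorer()
# posts = ["Government announces a minimum unit price for alcohol."]
# print(label_texts(posts, ["alcohol", "tobacco", "cannabis"], scorer))
```

No annotation or training step appears anywhere in the sketch: the categories are supplied as plain strings at prediction time, which is what makes the approach quick to adapt. Scores from any such model would still need checking against a small hand-coded sample before being relied on.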

Source journal

Addiction (Medicine: Psychiatry)

CiteScore: 10.80

Self-citation rate: 6.70%

Articles published: 319

Review time: 3 months

About the journal: Addiction publishes peer-reviewed research reports on pharmacological and behavioural addictions, bringing together research conducted within many different disciplines. Its goal is to serve international and interdisciplinary scientific and clinical communication, to strengthen links between science and policy, and to stimulate and enhance the quality of debate. We seek submissions that are not only technically competent but are also original and contain information or ideas of fresh interest to our international readership. We seek to serve low- and middle-income (LAMI) countries as well as more economically developed countries. Addiction’s scope spans human experimental, epidemiological, social science, historical, clinical and policy research relating to addiction, primarily but not exclusively in the areas of psychoactive substance use and/or gambling. In addition to original research, the journal features editorials, commentaries, reviews, letters, and book reviews.