大型语言模型时代的攻击性语言识别

IF 1.9 3区计算机科学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Natural Language Engineering Pub Date : 2023-12-06 DOI:10.1017/s1351324923000517

Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe

{"title":"大型语言模型时代的攻击性语言识别","authors":"Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe","doi":"10.1017/s1351324923000517","DOIUrl":null,"url":null,"abstract":"The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionalized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LMMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"187 ","pages":""},"PeriodicalIF":1.9000,"publicationDate":"2023-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"OffensEval 2023: Offensive language identification in the age of Large Language Models\",\"authors\":\"Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe\",\"doi\":\"10.1017/s1351324923000517\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionalized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LMMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.\",\"PeriodicalId\":49143,\"journal\":{\"name\":\"Natural Language Engineering\",\"volume\":\"187 \",\"pages\":\"\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2023-12-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Natural Language Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1017/s1351324923000517\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Natural Language Engineering","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1017/s1351324923000517","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

作为SemEval-2019-2020的一部分，“offensive seval”共享任务非常受欢迎，吸引了1300多个团队参与。共享任务的两个版本通过向社区提供阿拉伯文、丹麦文、英文、希腊文和土耳其文的基准数据集，帮助推进了攻击性语言识别的最新技术。数据集使用OLID分层分类法进行注释，从那时起，OLID已成为一般攻击性语言识别研究的事实上的标准，并在OffensEval之外被广泛使用。本文介绍了对OffensEval和相关比赛的调查，并讨论了从中获得的主要经验教训。我们进一步评估了大型语言模型(llm)的性能，它们最近彻底改变了自然语言处理领域。我们使用六个流行的llm的零射击提示和两个特定任务微调BERT模型的零射击学习，并将结果与OffensEval比赛中表现最好的团队的结果进行比较。我们的研究结果表明，虽然一些lmm(如Flan-T5)达到了具有竞争力的性能，但总体而言，llm落后于最佳的OffensEval系统。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

OffensEval 2023: Offensive language identification in the age of Large Language Models

The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionalized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LMMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Natural Language Engineering COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-

CiteScore

5.90

自引率

12.00%

发文量

审稿时长

>12 weeks

期刊介绍： Natural Language Engineering meets the needs of professionals and researchers working in all areas of computerised language processing, whether from the perspective of theoretical or descriptive linguistics, lexicology, computer science or engineering. Its aim is to bridge the gap between traditional computational linguistics research and the implementation of practical applications with potential real-world use. As well as publishing research articles on a broad range of topics - from text analysis, machine translation, information retrieval and speech analysis and generation to integrated systems and multi modal interfaces - it also publishes special issues on specific areas and technologies within these topics, an industry watch column and book reviews.