MultiLexNorm: A Shared Task on Multilingual Lexical Normalization

Rob van der Goot, Alan Ramponi, Arkaitz Zubiaga, Barbara Plank, Benjamin Muller, Iñaki San Vicente Roncal, Nikola Ljubešić, Özlem Çetinoğlu, Rahmad Mahendra, Talha Çolakoğlu, Timothy Baldwin, Tommaso Caselli, Wladimir Sidorenko
DOI: 10.18653/v1/2021.wnut-1.55
Published in: Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)
Citations: 16

Abstract

Lexical normalization is the task of transforming an utterance into its standardized form. This task is beneficial for downstream analysis, as it provides a way to harmonize (often spontaneous) linguistic variation. Such variation is typical for social media, on which information is shared in a multitude of ways, including diverse languages and code-switching. Since the seminal work of Han and Baldwin (2011) a decade ago, lexical normalization has attracted attention in English and multiple other languages. However, a common benchmark for comparing systems across languages with a homogeneous data and evaluation setup has been lacking. The MULTILEXNORM shared task sets out to fill this gap. We provide the largest publicly available multilingual lexical normalization benchmark, including 12 language variants. We propose a homogenized evaluation setup with both intrinsic and extrinsic evaluation. As extrinsic evaluation, we use dependency parsing and part-of-speech tagging with adapted evaluation metrics (a-LAS, a-UAS, and a-POS) to account for alignment discrepancies. The shared task hosted at W-NUT 2021 attracted 9 participants and 18 submissions. The results show that neural normalization systems outperform the previous state-of-the-art system by a large margin. Downstream parsing and part-of-speech tagging performance is positively affected, but to varying degrees, with improvements of up to 1.72 a-LAS, 0.85 a-UAS, and 1.54 a-POS for the winning system.
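The task described in the abstract can be made concrete with a minimal sketch: lexical normalization maps each raw token to its standardized form (most tokens map to themselves), and intrinsic evaluation scores a system against a leave-the-text-as-is baseline. The dictionary, function names, and the Error-Reduction-Rate-style scorer below are illustrative assumptions for exposition, not the shared task's actual data or official evaluation code.

```python
# Toy lexical-normalization lookup; real systems are neural, and this
# dictionary is invented for illustration, not taken from the benchmark.
LEXNORM = {"u": "you", "r": "are", "gr8": "great", "2morrow": "tomorrow"}

def normalize(tokens):
    """Map each raw token to its standardized form (identity if already canonical)."""
    return [LEXNORM.get(tok.lower(), tok) for tok in tokens]

def error_reduction_rate(gold, pred, raw):
    """Accuracy gain over the leave-as-is baseline, normalized by the
    baseline's error mass (a common intrinsic metric for this task;
    the exact official formulation is assumed here, not quoted)."""
    baseline_errors = sum(g != r for g, r in zip(gold, raw))
    correct = sum(g == p for g, p in zip(gold, pred))
    baseline_correct = sum(g == r for g, r in zip(gold, raw))
    if baseline_errors == 0:
        return 0.0  # nothing to normalize; score is undefined, report 0
    return (correct - baseline_correct) / baseline_errors

raw = ["u", "r", "gr8"]
gold = ["you", "are", "great"]
pred = normalize(raw)
print(pred)                                    # ['you', 'are', 'great']
print(error_reduction_rate(gold, pred, raw))   # 1.0
```

A score of 1.0 means every token needing normalization was fixed; 0.0 matches the do-nothing baseline; negative values indicate the system corrupted already-correct tokens.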