Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021): Latest Papers

MultiLexNorm: A Shared Task on Multilingual Lexical Normalization
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021) Pub Date: 2021 DOI: 10.18653/v1/2021.wnut-1.55
Rob van der Goot, Alan Ramponi, Arkaitz Zubiaga, Barbara Plank, Benjamin Muller, Iñaki San Vicente Roncal, Nikola Ljubešić, Özlem Çetinoğlu, Rahmad Mahendra, Talha Çolakoğlu, Timothy Baldwin, Tommaso Caselli, Wladimir Sidorenko
Abstract: Lexical normalization is the task of transforming an utterance into its standardized form. This task is beneficial for downstream analysis, as it provides a way to harmonize (often spontaneous) linguistic variation. Such variation is typical for social media on which information is shared in a multitude of ways, including diverse languages and code-switching. Since the seminal work of Han and Baldwin (2011) a decade ago, lexical normalization has attracted attention in English and multiple other languages. However, there exists a lack of a common benchmark for comparison of systems across languages with a homogeneous data and evaluation setup. The MULTILEXNORM shared task sets out to fill this gap. We provide the largest publicly available multilingual lexical normalization benchmark including 12 language variants. We propose a homogenized evaluation setup with both intrinsic and extrinsic evaluation. As extrinsic evaluation, we use dependency parsing and part-of-speech tagging with adapted evaluation metrics (a-LAS, a-UAS, and a-POS) to account for alignment discrepancies. The shared task hosted at W-NUT 2021 attracted 9 participants and 18 submissions. The results show that neural normalization systems outperform the previous state-of-the-art system by a large margin. Downstream parsing and part-of-speech tagging performance is positively affected but to varying degrees, with improvements of up to 1.72 a-LAS, 0.85 a-UAS, and 1.54 a-POS for the winning system.
Citations: 16
CL-MoNoise: Cross-lingual Lexical Normalization
Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021) Pub Date: 2021 DOI: 10.18653/v1/2021.wnut-1.56
Rob van der Goot
Citations: 4