CLaC at SemEval-2023 Task 2: Comparing Span-Prediction and Sequence-Labeling Approaches for NER

International Workshop on Semantic Evaluation. Pub Date: 2023-05-05. DOI: 10.48550/arXiv.2305.03845

Harsh Verma, S. Bergler


Citations: 2

Abstract

This paper summarizes the CLaC submission for the MultiCoNER 2 task, which concerns the recognition of complex, fine-grained named entities. We compare two popular approaches for NER, namely Sequence Labeling and Span Prediction. We find that our best Span Prediction system performs slightly better than our best Sequence Labeling system on test data. Moreover, we find that using the larger version of XLM-RoBERTa significantly improves performance. Post-competition experiments show that Span Prediction and Sequence Labeling approaches improve when they use special input tokens (<s> and </s>) of XLM-RoBERTa. The code for training all models, preprocessing, and post-processing is available at https://github.com/harshshredding/semeval2023-multiconer-paper.
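To make the two formulations compared in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a BIO tagging scheme for sequence labeling and exhaustive enumeration of spans up to a maximum width for span prediction. The function names, the toy sentence, the `PER`/`LOC` labels, and the `NONE` label for non-entity spans are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Sequence labeling view: decode one BIO tag per token into entity spans."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


def enumerate_spans(num_tokens: int, max_width: int) -> List[Tuple[int, int]]:
    """Span prediction view: list every candidate span of at most max_width tokens."""
    return [
        (s, e)
        for s in range(num_tokens)
        for e in range(s + 1, min(s + max_width, num_tokens) + 1)
    ]


def label_candidates(candidates: List[Tuple[int, int]], gold: List[Span]) -> List[str]:
    """Assign each candidate span its gold label, or "NONE" if it is not an entity."""
    gold_map = {(s, e): lab for s, e, lab in gold}
    return [gold_map.get(c, "NONE") for c in candidates]


# Toy example (hypothetical sentence and labels, not MultiCoNER data).
tokens = ["Ada", "Lovelace", "visited", "London"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]

gold_spans = bio_to_spans(tags)
candidates = enumerate_spans(len(tokens), max_width=2)
targets = label_candidates(candidates, gold_spans)

for (s, e), target in zip(candidates, targets):
    print(" ".join(tokens[s:e]), "->", target)
```

In the sequence-labeling formulation a model makes one prediction per token, whereas in the span-prediction formulation it makes one prediction per candidate span, most of which are non-entities. Both representations encode the same gold entities here, so the choice mainly affects what the classifier sees and how its output is decoded.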

