CASE 2021 Task 2 Socio-political Fine-grained Event Classification using Fine-tuned RoBERTa Document Embeddings

Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021). DOI: 10.18653/v1/2021.case-1.26

Samantha Kent, Theresa Krumbiegel

Citations: 4

Abstract

We present our submission to Task 2 of the Socio-political and Crisis Events Detection Shared Task at the CASE @ ACL-IJCNLP 2021 workshop. The task is the fine-grained classification of socio-political events. Our best model was a fine-tuned RoBERTa transformer model using document embeddings. The corpus consisted of a balanced selection of sub-events extracted from the ACLED event dataset. In preliminary experiments on a held-out test set, we achieved a macro F-score of 0.923 and a micro F-score of 0.932. The same model also performed best on the shared task test data (weighted F-score = 0.83). To analyze the results, we calculated the topic compactness of the commonly misclassified events and conducted an error analysis.
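The abstract reports three different averages of the F-score (macro, micro, and weighted). As a minimal sketch of how these differ for single-label multi-class classification, the following computes all three from gold and predicted labels. The function name `f_scores` and the toy label names are illustrative assumptions, not taken from the paper or the shared task evaluation script.

```python
from collections import Counter


def f_scores(y_true, y_pred):
    """Return (macro, micro, weighted) F1 for single-label multi-class data.

    Hypothetical helper for illustration; not the shared task's scorer.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gold] += 1  # gold class gains a false negative

    def f1(t, p, n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in labels}
    support = Counter(y_true)

    # Macro: unweighted mean of per-class F1.
    macro = sum(per_class.values()) / len(labels)
    # Micro: F1 over pooled counts across all classes.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Weighted: per-class F1 weighted by the number of gold instances.
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return macro, micro, weighted


# Toy example with made-up labels (not the ACLED sub-event inventory).
y_true = ["protest", "protest", "protest", "attack", "attack", "arrest"]
y_pred = ["protest", "protest", "attack", "attack", "attack", "protest"]
macro, micro, weighted = f_scores(y_true, y_pred)
```

When every class has the same number of gold instances, the weighted average reduces to the macro average; the two diverge only when class supports differ.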



