PresiUniv at TSAR-2022 Shared Task: Generation and Ranking of Simplification Substitutes of Complex Words in Multiple Languages

Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) Pub Date : 1900-01-01 DOI:10.18653/v1/2022.tsar-1.22

Peniel Whistely, Sandeep Albert Mathias, Galiveeti Poornima

Citations: 7

Abstract

In this paper, we describe our approach to generating and ranking candidate contextual simplifications for a given complex word, using pre-trained language models (e.g., BERT), publicly available word embeddings (e.g., FastText), and a part-of-speech tagger. In this task, our system, PresiUniv, placed 1st in the Spanish track, 5th in the Brazilian Portuguese track, and 10th in the English track. We release our code and data for this project to aid replication of our results. We also analyze some of the errors and describe the design decisions we made while writing the paper.
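The abstract names three components but does not say how they are combined. The sketch below shows one plausible arrangement, purely as an illustration: candidates (which a masked language model would supply) are filtered to the complex word's part of speech and then ranked by embedding cosine similarity. The candidate list, POS tags, three-dimensional vectors, and the function name `rank_substitutes` are all hypothetical toy stand-ins, not the authors' data or code.

```python
import math

# Hypothetical toy data. A real system would obtain candidates from a masked
# language model, tags from a POS tagger, and vectors from pre-trained embeddings.
CANDIDATES = ["required", "needed", "necessity", "compulsory"]
POS = {
    "mandatory": "ADJ",
    "required": "ADJ",
    "needed": "ADJ",
    "necessity": "NOUN",
    "compulsory": "ADJ",
}
VECTORS = {
    "mandatory": [0.9, 0.1, 0.3],
    "required": [0.9, 0.1, 0.25],
    "needed": [0.6, 0.5, 0.2],
    "necessity": [0.7, 0.1, 0.6],
    "compulsory": [0.9, 0.1, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_substitutes(complex_word, candidates, pos, vectors):
    """Keep candidates sharing the complex word's POS, most similar first."""
    target_pos = pos[complex_word]
    kept = [
        c
        for c in candidates
        if c != complex_word and pos.get(c) == target_pos and c in vectors
    ]
    return sorted(
        kept,
        key=lambda c: cosine(vectors[complex_word], vectors[c]),
        reverse=True,
    )


ranked = rank_substitutes("mandatory", CANDIDATES, POS, VECTORS)
print(ranked)
```

With these toy values, the noun "necessity" is dropped by the part-of-speech filter and the remaining adjectives are ordered by similarity to "mandatory". Similarity alone does not measure simplicity, so a real ranker would likely add a further signal such as word frequency; that too is an assumption, not something the abstract states.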
