{"title":"面向词义消歧的变形器的准双向编码器表示","authors":"Michele Bevilacqua, Roberto Navigli","doi":"10.26615/978-954-452-056-4_015","DOIUrl":null,"url":null,"abstract":"While contextualized embeddings have produced performance breakthroughs in many Natural Language Processing (NLP) tasks, Word Sense Disambiguation (WSD) has not benefited from them yet. In this paper, we introduce QBERT, a Transformer-based architecture for contextualized embeddings which makes use of a co-attentive layer to produce more deeply bidirectional representations, better-fitting for the WSD task. As a result, we are able to train a WSD system that beats the state of the art on the concatenation of all evaluation datasets by over 3 points, also outperforming a comparable model using ELMo.","PeriodicalId":284493,"journal":{"name":"Recent Advances in Natural Language Processing","volume":"42 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"18","resultStr":"{\"title\":\"Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation\",\"authors\":\"Michele Bevilacqua, Roberto Navigli\",\"doi\":\"10.26615/978-954-452-056-4_015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"While contextualized embeddings have produced performance breakthroughs in many Natural Language Processing (NLP) tasks, Word Sense Disambiguation (WSD) has not benefited from them yet. In this paper, we introduce QBERT, a Transformer-based architecture for contextualized embeddings which makes use of a co-attentive layer to produce more deeply bidirectional representations, better-fitting for the WSD task. As a result, we are able to train a WSD system that beats the state of the art on the concatenation of all evaluation datasets by over 3 points, also outperforming a comparable model using ELMo.\",\"PeriodicalId\":284493,\"journal\":{\"name\":\"Recent Advances in Natural Language Processing\",\"volume\":\"42 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-10-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"18\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Recent Advances in Natural Language Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.26615/978-954-452-056-4_015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Recent Advances in Natural Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.26615/978-954-452-056-4_015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Quasi Bidirectional Encoder Representations from Transformers for Word Sense Disambiguation
While contextualized embeddings have produced performance breakthroughs in many Natural Language Processing (NLP) tasks, Word Sense Disambiguation (WSD) has not yet benefited from them. In this paper, we introduce QBERT, a Transformer-based architecture for contextualized embeddings that uses a co-attentive layer to produce more deeply bidirectional representations, better suited to the WSD task. As a result, we are able to train a WSD system that beats the state of the art on the concatenation of all evaluation datasets by over 3 points, while also outperforming a comparable model that uses ELMo.
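The abstract only names the co-attentive layer without detailing it. As a rough illustration of the general idea of fusing a left-to-right and a right-to-left encoder into a more deeply bidirectional representation, the sketch below shows one plausible co-attention scheme; the class name, dimensions, and fusion strategy are assumptions for illustration, not the paper's actual QBERT implementation.

```python
# Hypothetical sketch: fuse forward-only and backward-only Transformer
# states via co-attention. This is NOT the paper's implementation; all
# names, shapes, and design choices here are illustrative assumptions.
import torch
import torch.nn as nn


class CoAttentiveFusion(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        # Each direction attends over the other direction's hidden states.
        self.fwd_over_bwd = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.bwd_over_fwd = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(2 * d_model)
        self.proj = nn.Linear(2 * d_model, d_model)

    def forward(self, h_fwd: torch.Tensor, h_bwd: torch.Tensor) -> torch.Tensor:
        # h_fwd, h_bwd: (batch, seq_len, d_model) states from the
        # left-to-right and right-to-left encoders, respectively.
        f, _ = self.fwd_over_bwd(query=h_fwd, key=h_bwd, value=h_bwd)
        b, _ = self.bwd_over_fwd(query=h_bwd, key=h_fwd, value=h_fwd)
        # Concatenate the two co-attended views and project back to d_model.
        fused = torch.cat([f, b], dim=-1)
        return self.proj(self.norm(fused))


if __name__ == "__main__":
    layer = CoAttentiveFusion()
    h_fwd = torch.randn(2, 10, 512)  # toy forward-encoder states
    h_bwd = torch.randn(2, 10, 512)  # toy backward-encoder states
    print(layer(h_fwd, h_bwd).shape)  # torch.Size([2, 10, 512])
```

The fused token representations produced by such a layer could then feed a standard WSD classification head; again, how QBERT actually combines directions is specified in the paper itself, not in this abstract.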