Analysis and Evaluation of Language Models for Word Sense Disambiguation

Impact Factor: 3.7 · CAS Tier 2 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)
Daniel Loureiro, Kiamehr Rezaee, Mohammad Taher Pilehvar, José Camacho-Collados
{"title":"Analysis and Evaluation of Language Models for Word Sense Disambiguation","authors":"Daniel Loureiro, Kiamehr Rezaee, Mohammad Taher Pilehvar, José Camacho-Collados","doi":"10.1162/coli_a_00405","DOIUrl":null,"url":null,"abstract":"Abstract Transformer-based language models have taken many fields in NLP by storm. BERT and its derivatives dominate most of the existing evaluation benchmarks, including those for Word Sense Disambiguation (WSD), thanks to their ability in capturing context-sensitive semantic nuances. However, there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses. In this article, we provide an in-depth quantitative and qualitative analysis of the celebrated BERT model with respect to lexical ambiguity. One of the main conclusions of our analysis is that BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. Our analysis also reveals that in some cases language models come close to solving coarse-grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources. However, this scenario rarely occurs in real-world settings and, hence, many practical challenges remain even in the coarse-grained setting. We also perform an in-depth comparison of the two main language model-based WSD strategies, namely, fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and it can better exploit limited available training data. In fact, the simple feature extraction strategy of averaging contextualized embeddings proves robust even using only three training sentences per word sense, with minimal improvements obtained by increasing the size of this training data.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"387-443"},"PeriodicalIF":3.7000,"publicationDate":"2021-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/coli_a_00405","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 41

Abstract

Transformer-based language models have taken many fields in NLP by storm. BERT and its derivatives dominate most of the existing evaluation benchmarks, including those for Word Sense Disambiguation (WSD), thanks to their ability to capture context-sensitive semantic nuances. However, there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses. In this article, we provide an in-depth quantitative and qualitative analysis of the celebrated BERT model with respect to lexical ambiguity. One of the main conclusions of our analysis is that BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. Our analysis also reveals that in some cases language models come close to solving coarse-grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources. However, this scenario rarely occurs in real-world settings and, hence, many practical challenges remain even in the coarse-grained setting. We also perform an in-depth comparison of the two main language model-based WSD strategies, namely, fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and can better exploit limited available training data. In fact, the simple feature extraction strategy of averaging contextualized embeddings proves robust even using only three training sentences per word sense, with minimal improvements obtained by increasing the size of this training data.
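The feature-extraction strategy the abstract highlights is simple enough to sketch: average a model's contextual embeddings over a few sense-tagged example sentences to build one centroid per sense, then label a new occurrence by the nearest centroid. The sketch below is a minimal illustration of that idea, not the authors' released code. It assumes bert-base-uncased via the Hugging Face transformers library, and the two-sense "bank" inventory with its example sentences is a hypothetical stand-in for the annotated corpora (e.g., SemCor) the paper actually uses.

```python
# Minimal sketch of the feature-extraction (nearest-centroid) WSD strategy
# described in the abstract: average BERT's contextual embeddings over a few
# tagged example sentences per sense, then label a new usage by the closest
# sense centroid. The tiny sense inventory below is a hypothetical stand-in
# for a real sense-annotated corpus such as SemCor.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed_target(sentence: str, target: str) -> torch.Tensor:
    """Average the last-layer embeddings of the target word's subtokens."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, 768)
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # Locate the first occurrence of the target's subtoken span.
    for i in range(len(ids) - len(target_ids) + 1):
        if ids[i:i + len(target_ids)] == target_ids:
            return hidden[i:i + len(target_ids)].mean(dim=0)
    raise ValueError(f"{target!r} not found in sentence")

# Three example sentences per sense, mirroring the paper's few-shot setting.
examples = {
    "bank%finance": [
        "She deposited the check at the bank.",
        "The bank approved his mortgage application.",
        "Interest rates at the bank rose sharply.",
    ],
    "bank%river": [
        "They had a picnic on the bank of the river.",
        "The boat drifted toward the muddy bank.",
        "Reeds grew along the bank of the stream.",
    ],
}

# Sense centroid = mean of the contextual embeddings of its examples.
centroids = {
    sense: torch.stack([embed_target(s, "bank") for s in sents]).mean(dim=0)
    for sense, sents in examples.items()
}

def disambiguate(sentence: str, target: str) -> str:
    """Return the sense whose centroid is most cosine-similar to the target."""
    vec = embed_target(sentence, target)
    return max(
        centroids,
        key=lambda s: torch.cosine_similarity(vec, centroids[s], dim=0).item(),
    )

print(disambiguate("He moored the canoe to the grassy bank.", "bank"))
```

Note that this sketch requires no gradient updates at all, which is what makes the feature-extraction approach attractive when only a handful of tagged sentences per sense are available; fine-tuning, by contrast, retrains the model's weights and is more exposed to sense bias in skewed training data.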
Source Journal

Computational Linguistics (Engineering & Technology – Computer Science: Interdisciplinary Applications)
CiteScore: 15.80
Self-citation rate: 0.00%
Articles per year: 45
Review time: >12 weeks
Journal description: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.