Decoding Word Embeddings with Brain-Based Semantic Features

IF 3.7 · CAS Quartile 2 (Computer Science) · JCR Q2 (Computer Science, Artificial Intelligence)
Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci
{"title":"Decoding Word Embeddings with Brain-Based Semantic Features","authors":"Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci","doi":"10.1162/coli_a_00412","DOIUrl":null,"url":null,"abstract":"Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. A large number of probing tasks is finally set to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features and we show that there is a correlation between the embedding performance and how they encode those features. This study sets itself as a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations from distributional vectors.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-36"},"PeriodicalIF":3.7000,"publicationDate":"2021-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/coli_a_00412","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 24

Abstract

Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. Finally, a large set of probing tasks is used to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features and show that embedding performance correlates with how well the embeddings encode those features. This study represents a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations from distributional vectors.
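To make the mapping idea above more concrete, the sketch below learns a simple linear (ridge-regression) map from an embedding space to the 65-dimensional Binder et al. (2016) feature space and scores it with a top-k retrieval check, roughly in the spirit of the retrieval evaluation mentioned in the abstract. This is a minimal illustration under stated assumptions, not the paper's actual pipeline: the embedding matrix and feature norms are random placeholders, and the choice of ridge regression, the train/test split, and k are arbitrary.

```python
# Minimal sketch of an embedding-to-feature mapping like the one described in the abstract.
# Assumptions (not from the paper): ridge regression as the mapping function, synthetic data
# in place of real word embeddings and the 65-dimensional Binder et al. (2016) feature norms.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_words, emb_dim, n_features = 500, 300, 65            # 65 = assumed Binder feature dimensions
embeddings = rng.normal(size=(n_words, emb_dim))       # stand-in for word embeddings
binder_feats = rng.normal(size=(n_words, n_features))  # stand-in for Binder feature vectors

X_train, X_test, y_train, y_test = train_test_split(
    embeddings, binder_feats, test_size=0.2, random_state=0
)

# Learn a linear map from embedding space to the interpretable feature space.
mapper = Ridge(alpha=1.0)
mapper.fit(X_train, y_train)
predicted = mapper.predict(X_test)

# Retrieval-style evaluation: for each held-out word, rank all gold feature vectors
# by cosine similarity to the predicted vector and check whether the correct word
# is retrieved within the top k candidates.
def cosine_sim(a, b):
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

sims = cosine_sim(predicted, y_test)   # (n_test, n_test) similarity matrix
ranks = (-sims).argsort(axis=1)        # candidate indices sorted by descending similarity
top10 = np.mean([i in ranks[i, :10] for i in range(len(ranks))])
print(f"Top-10 retrieval accuracy: {top10:.2f}")
```

In practice, the random matrices would be replaced by real word embeddings (e.g., Skip-Gram, GloVe, ELMo, or BERT vectors) aligned row by row with the Binder feature norms for the same words.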
Source journal: Computational Linguistics (Engineering & Technology, Computer Science: Interdisciplinary Applications)
CiteScore: 15.80
Self-citation rate: 0.00%
Articles published: 45
Review time: >12 weeks
Journal description: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.