A Fistful of Vectors: A Tool for Intrinsic Evaluation of Word Embeddings

Impact Factor: 4.3 · JCR Q2 (Computer Science, Artificial Intelligence) · CAS Tier 3 (Computer Science)
Roberto Ascari, Anna Giabelli, Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica
{"title":"A Fistful of Vectors: A Tool for Intrinsic Evaluation of Word Embeddings","authors":"Roberto Ascari, Anna Giabelli, Lorenzo Malandri, Fabio Mercorio, Mario Mezzanzanica","doi":"10.1007/s12559-023-10235-3","DOIUrl":null,"url":null,"abstract":"<p>The utilization of word embeddings—powerful models computed through Neural Network architectures that encode words as vectors—has witnessed rapid growth across various Natural Language Processing applications, encompassing semantic analysis, information retrieval, dependency parsing, question answering, and machine translation. The efficacy of these tasks is strictly linked to the quality of the embeddings, underscoring the critical importance of evaluating and selecting optimal embedding models. While established procedures and benchmarks exist for intrinsic evaluation, the authors note a conspicuous absence of comprehensive evaluations of intrinsic embedding quality across multiple tasks. This paper introduces <span>vec2best</span>, a unified tool encompassing state-of-the-art intrinsic evaluation tasks across diverse benchmarks. <span>vec2best</span> furnishes the user with an extensive evaluation of word embedding models. It represents a framework for evaluating word embeddings trained using various methods and hyper-parameters on a range of tasks from the literature. The tool yields a holistic evaluation metric for each model called the <i>PCE</i> (<i>Principal Component Evaluation</i>). We conducted evaluations on 135 word embedding models, trained using GloVe, fastText, and word2vec, across four tasks integrated into <span>vec2best</span> (similarity, analogy, categorization, and outlier detection), along with their respective benchmarks. Additionally, we leveraged vec2best to optimize embedding hyper-parameter configurations in a real-world scenario. <span>vec2best</span> is conveniently accessible as a pip-installable Python package.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":"256 1","pages":""},"PeriodicalIF":4.3000,"publicationDate":"2024-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12559-023-10235-3","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

The utilization of word embeddings—powerful models computed through Neural Network architectures that encode words as vectors—has witnessed rapid growth across various Natural Language Processing applications, encompassing semantic analysis, information retrieval, dependency parsing, question answering, and machine translation. The efficacy of these tasks is closely tied to the quality of the embeddings, underscoring the critical importance of evaluating and selecting optimal embedding models. While established procedures and benchmarks exist for intrinsic evaluation, the authors note a conspicuous absence of comprehensive evaluations of intrinsic embedding quality across multiple tasks. This paper introduces vec2best, a unified tool encompassing state-of-the-art intrinsic evaluation tasks across diverse benchmarks. vec2best furnishes the user with an extensive evaluation of word embedding models. It provides a framework for evaluating word embeddings trained using various methods and hyper-parameters on a range of tasks from the literature. The tool yields a holistic evaluation metric for each model called the PCE (Principal Component Evaluation). We conducted evaluations on 135 word embedding models, trained using GloVe, fastText, and word2vec, across four tasks integrated into vec2best (similarity, analogy, categorization, and outlier detection), along with their respective benchmarks. Additionally, we leveraged vec2best to optimize embedding hyper-parameter configurations in a real-world scenario. vec2best is conveniently accessible as a pip-installable Python package.
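To make the intrinsic tasks concrete, here is a toy sketch of the most common one, word similarity: cosine similarities computed from an embedding are compared against human relatedness judgments via Spearman's rank correlation, the standard metric used by benchmarks such as WordSim-353. The vectors, word pairs, and ratings below are invented for illustration; this is not code from vec2best.

```python
import numpy as np
from scipy.stats import spearmanr

# Invented 3-dimensional "embeddings" standing in for a real model.
emb = {
    "cat":   np.array([0.8, 0.1, 0.3]),
    "dog":   np.array([0.7, 0.2, 0.4]),
    "car":   np.array([0.1, 0.9, 0.2]),
    "truck": np.array([0.2, 0.8, 0.3]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# (word_a, word_b, invented human relatedness rating on a 0-10 scale)
pairs = [("cat", "dog", 9.0), ("car", "truck", 8.5), ("cat", "car", 2.0)]

model_scores = [cosine(emb[a], emb[b]) for a, b, _ in pairs]
human_scores = [rating for _, _, rating in pairs]

# Spearman's rho between the model ranking and the human ranking is
# the usual similarity-task score.
rho, _ = spearmanr(model_scores, human_scores)
print(f"Spearman rho: {rho:.3f}")
```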

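The abstract does not detail how the PCE metric is computed; its name, Principal Component Evaluation, suggests aggregating per-task results along the first principal component. The sketch below illustrates that idea under this assumption, using an invented models-by-tasks score matrix and scikit-learn; it is not the authors' implementation. (The tool itself is distributed as a pip-installable package, presumably via pip install vec2best.)

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical per-task scores. Rows are models; columns are the four
# vec2best tasks (similarity, analogy, categorization, outlier
# detection). All numbers are invented for illustration.
scores = np.array([
    [0.62, 0.55, 0.71, 0.48],  # a GloVe configuration
    [0.68, 0.61, 0.69, 0.52],  # a fastText configuration
    [0.59, 0.49, 0.64, 0.45],  # a word2vec configuration
])

# Standardize each task column so no single benchmark dominates the
# component purely through its scale.
standardized = StandardScaler().fit_transform(scores)

# Project models onto the first principal component; the resulting
# scalar plays the role of a holistic, PCE-like score.
pc1 = PCA(n_components=1).fit_transform(standardized).ravel()

# The sign of a principal component is arbitrary: flip it if needed so
# that higher average raw scores map to higher aggregate scores.
if np.corrcoef(pc1, scores.mean(axis=1))[0, 1] < 0:
    pc1 = -pc1

for name, s in zip(["glove", "fasttext", "word2vec"], pc1):
    print(f"{name}: {s:+.3f}")
```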

Source Journal
Cognitive Computation (Computer Science, Artificial Intelligence; Neurosciences)
CiteScore: 9.30
Self-citation rate: 3.70%
Annual publications: 116
Review time: >12 weeks
Journal description: Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work involving biologically-inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a new platform for the dissemination of research, current practices and future trends in the emerging discipline of cognitive computation that bridges the gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities.