Quality metrics of generative models

Proceedings of the 31st International Conference on Computer Graphics and Vision. Volume 1. Pub Date: 1900-01-01. DOI: 10.20948/graphicon-2021-1-124-130

K. Abrosimov, Tatiana Vladimirovna Lvutina, A. Surkova



Abstract

This article reviews modern metrics for evaluating generative models. Particular attention is paid to metrics used in natural language processing: BLEU (scores model output by comparing it with output produced by a person), NIST (based on the BLEU metric), METEOR (based on the harmonic mean of unigram precision and recall), and ROUGE. The article presents a new metric based on subjective assessments. The subjective assessments used in this metric are collected through pairwise comparison in the form of evaluation scales. The article also proposes an algorithm for generating music that combines automated models for working with ABC notation, models of distributional semantics, and generative deep neural network models (Transformers). The new quality metric (SS-metric) is used to assess the quality of the proposed music generation algorithm in comparison with solutions produced by humans and by baseline models. The baseline model builds a continuation of a musical fragment by randomly selecting bars from the first half of that fragment. The experiments showed that the SS-metric makes it possible to formalize and generalize subjective assessments, which can then be used to assess the quality of various objects.
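The baseline described in the abstract (a continuation assembled from randomly chosen bars of the first half of a fragment) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `seed` and `n_bars` parameters, sampling with replacement, and the assumption that bars are separated by plain `|` bar lines are all illustrative choices made here. Real ABC notation also has repeat marks and double bar lines, which this sketch does not handle.

```python
import random
from typing import List, Optional


def split_bars(abc_body: str) -> List[str]:
    """Split an ABC tune body into bars on plain '|' bar lines.

    Simplifying assumption: no repeat marks (':|'), double bars ('||')
    or closing bars ('|]') need special treatment. Empty pieces are dropped.
    """
    return [bar.strip() for bar in abc_body.split("|") if bar.strip()]


def baseline_continuation(
    abc_body: str,
    n_bars: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[str]:
    """Build a continuation by sampling bars from the first half of a fragment.

    Hypothetical helper: bars are drawn uniformly at random, with
    replacement, from the first half of the input. By default the
    continuation is as long as the remaining (second) half.
    """
    bars = split_bars(abc_body)
    if not bars:
        return []
    first_half = bars[: max(1, len(bars) // 2)]
    if n_bars is None:
        n_bars = len(bars) - len(first_half)
    rng = random.Random(seed)  # local generator, so the seed is reproducible
    return [rng.choice(first_half) for _ in range(n_bars)]


if __name__ == "__main__":
    fragment = "C D E F | G A B c | c B A G | F E D C"
    continuation = baseline_continuation(fragment, seed=0)
    print(" | ".join(continuation))
```

Sampling with replacement keeps the sketch short; a variant that shuffles the first-half bars without repetition would fit the abstract's description equally well, since the abstract does not specify which is used.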
