Automatic Evaluation of Machine Generated Feedback For Text and Image Data

Pratham Goyal, Anjali Raj, Puneet Kumar, Kishore Babu Nampalle
DOI: 10.1109/MIPR54900.2022.00081
Published in: 2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR), August 2022
Citations: 0

Abstract

In this paper, a novel system, ‘AutoEvalNet’, has been developed for evaluating machine-generated feedback in response to multimodal input containing text and images. A new metric, the ‘Automatically Evaluated Relevance Score’ (AER score), has also been defined to automatically compute the similarity between human-generated comments and machine-generated feedback. AutoEvalNet's architecture comprises a pre-trained feedback synthesis model and the proposed feedback evaluation model. It uses an ensemble of Bidirectional Encoder Representations from Transformers (BERT) and Global Vectors for Word Representation (GloVe) models to generate embeddings of the ground-truth comment and the machine-synthesized feedback, from which the similarity score is calculated. Experiments have been performed on the MMFeed dataset. The generated feedback has been evaluated automatically using the AER score and manually by having human users rate the feedback's relevance to the input and the ground-truth comments. The AER scores and the human evaluation scores are in line, affirming the AER score's applicability as an automatic evaluation measure for machine-generated text in place of human evaluation.
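The core idea behind the AER score — embedding the ground-truth comment and the generated feedback with multiple models and combining their similarities — can be illustrated with a minimal sketch. This is not the paper's implementation: the toy word-vector tables, the mean-pooling step, and the equal-weight averaging of cosine similarities are all assumptions standing in for the actual pre-trained BERT and GloVe models and whatever ensembling scheme AutoEvalNet uses.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy word-vector tables standing in for GloVe and BERT embeddings;
# a real system would load the pre-trained models instead.
GLOVE = {"good": np.array([0.9, 0.1]), "feedback": np.array([0.2, 0.8]),
         "great": np.array([0.85, 0.2]), "comment": np.array([0.25, 0.75])}
BERTISH = {"good": np.array([0.7, 0.3]), "feedback": np.array([0.1, 0.9]),
           "great": np.array([0.65, 0.35]), "comment": np.array([0.15, 0.85])}

def embed(sentence, table):
    """Mean-pool the vectors of known words (a crude sentence embedding)."""
    vecs = [table[w] for w in sentence.lower().split() if w in table]
    return np.mean(vecs, axis=0)

def aer_score(reference, generated, tables, weights=None):
    """Hypothetical ensemble score: weighted average of per-model
    cosine similarities between reference and generated text."""
    weights = weights or [1.0 / len(tables)] * len(tables)
    sims = [cosine(embed(reference, t), embed(generated, t)) for t in tables]
    return float(np.dot(weights, sims))
```

Under this sketch, `aer_score("good feedback", "great comment", [GLOVE, BERTISH])` returns a value in [-1, 1], and an identical pair scores 1.0 under each model.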