Shannon Entropy is better Feature than Category and Sentiment in User Feedback Processing

arXiv - CS - Software Engineering, Pub Date: 2024-09-18, DOI: arxiv-2409.12012

Andres Rojas Paredes, Brenda Mareco

Citations: 0

Abstract

App reviews in mobile app stores contain useful information that is used to improve applications and promote software evolution. This information is processed by automatic tools that prioritize reviews. To carry out this prioritization, reviews are decomposed into features such as category and sentiment. A weighted function then assigns a weight to each feature, and a review ranking is calculated. Unfortunately, extracting category and sentiment from reviews requires at least one classifier trained on an annotated corpus, which makes this task computationally demanding. In this work, we therefore propose Shannon Entropy as a simple feature that can replace the standard features. Our results show that a Shannon Entropy based ranking is better than a standard ranking according to the NDCG metric. This result is promising even if we require fairness by means of algorithmic bias. Finally, we highlight a computational limit that appears in the search for the best ranking.
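The two quantities named in the abstract, Shannon entropy as a review feature and NDCG as the ranking metric, can be illustrated with a minimal Python sketch. The abstract does not say how entropy is computed over a review, so the sketch makes its own assumptions: entropy is taken over the whitespace-token frequency distribution of the review text, and reviews with higher entropy are ranked first. The function names (`shannon_entropy`, `rank_by_entropy`, `dcg`, `ndcg`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Entropy (in bits) of the token-frequency distribution of `text`.

    Assumption: tokens are lowercase whitespace-separated words; the
    paper may use a different unit (e.g. characters).
    """
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    total = len(tokens)
    counts = Counter(tokens)
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def rank_by_entropy(reviews):
    """Order reviews from highest to lowest entropy (assumed priority rule)."""
    return sorted(reviews, key=shannon_entropy, reverse=True)


def dcg(relevances):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(ranked_relevances):
    """DCG of the given order divided by the DCG of the ideal order."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```

To compare two rankings under this sketch, one would list the ground-truth relevance of each review in the order a ranking produces and pass that list to `ndcg`; the ranking with the value closer to 1 agrees better with the ideal order. No classifier or annotated corpus is needed to compute the entropy feature itself, which is the cost advantage the abstract points to.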

