The Rating Scale Paradox: Semantics Instability versus Information Loss

Standards Pub Date : 2022-08-01 DOI:10.3390/standards2030024

J. Giacomelli


Citations: 1

Abstract

Rating systems are applied in a wide variety of contexts as a tool to map a large amount of information to a symbol, or notch, chosen from a finite, ordered set. Such a set is commonly known as the rating scale, and its elements represent all the different degrees of quality, in some sense, that a given rating system aims to express. This work investigates a simple yet nontrivial paradox in constructing that scale. When the considered quality parameter is continuous, a bijection must exist between a specific partition of its domain and the rating scale. The number of notches and their meanings are commonly defined a priori, based on the convenience of the rating system's users. The partition, however, should be chosen a posteriori: the number of subsets and their amplitudes should minimize the unavoidable information loss due to discretization. Considering the typical case of a creditworthiness rating system based on a logistic regression model, we discuss to what extent this contrast may impact a realistic framework and how a proper rating scale definition may handle it. Indeed, we show that it is not strictly necessary to choose between a priori methods, which privilege the meaning of the rating scale, and a posteriori methods, which minimize information loss. It is possible to mix the two approaches instead, choosing a hybrid criterion that can be tuned to the needs of the rating model's users.
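The contrast described in the abstract can be illustrated with a small sketch. It is not the paper's method: the abstract does not state the loss function or the hybrid criterion, so everything below is an assumption made for illustration. Scores are simulated default probabilities from a logistic link, the a priori scale is a set of hypothetical fixed thresholds (`prior`), information loss is approximated by the within-notch mean squared error (`info_loss`), the a posteriori cuts come from one-dimensional Lloyd iterations (`lloyd_cuts`), and the hybrid is a plain convex blend of the two sets of cuts with a hypothetical tuning weight `lam`.

```python
import math
import random

def notch_of(p, cuts):
    """Index of the notch containing p, given sorted interior cut points."""
    return sum(p > c for c in cuts)

def bin_means(scores, cuts):
    """Mean score per notch (None for an empty notch)."""
    sums = [0.0] * (len(cuts) + 1)
    counts = [0] * (len(cuts) + 1)
    for p in scores:
        k = notch_of(p, cuts)
        sums[k] += p
        counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

def info_loss(scores, cuts):
    """Assumed loss proxy: mean squared distance to the notch mean."""
    means = bin_means(scores, cuts)
    return sum((p - means[notch_of(p, cuts)]) ** 2 for p in scores) / len(scores)

def lloyd_cuts(scores, cuts, iters=100):
    """A posteriori cuts: 1-D Lloyd iterations with a fixed notch count."""
    cuts = list(cuts)
    for _ in range(iters):
        means = bin_means(scores, cuts)
        if any(m is None for m in means):
            break  # an empty notch: stop rather than guess a centroid
        new_cuts = [(a + b) / 2 for a, b in zip(means, means[1:])]
        if new_cuts == cuts:
            break  # converged
        cuts = new_cuts
    return cuts

def hybrid_cuts(prior, post, lam):
    """Hypothetical hybrid: lam=0 keeps the a priori cuts, lam=1 the a posteriori ones."""
    return [(1 - lam) * a + lam * b for a, b in zip(prior, post)]

# Simulated default probabilities from a logistic link (illustrative data only).
rng = random.Random(0)
scores = [1 / (1 + math.exp(-rng.gauss(-3.0, 1.0))) for _ in range(2000)]

prior = [0.01, 0.03, 0.10, 0.30]   # hypothetical a priori thresholds, 5 notches
post = lloyd_cuts(scores, prior)   # a posteriori refinement of the same scale

for lam in (0.0, 0.5, 1.0):
    cuts = hybrid_cuts(prior, post, lam)
    print(lam, [round(c, 4) for c in cuts], info_loss(scores, cuts))
```

Under these assumptions, the a posteriori cuts never have a larger within-notch error than the a priori ones they start from, but their positions depend on the sample, which is the semantic instability the title refers to. Intermediate values of `lam` trade one concern against the other.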


Source journal

Standards

Self-citation rate

0.00%

Number of publications