{"title":"Adapted B-CUBED Metrics to Unbalanced Datasets","authors":"Jose G. Moreno, G. Dias","doi":"10.1145/2766462.2767836","DOIUrl":null,"url":null,"abstract":"B-CUBED metrics have recently been adopted in the evaluation of clustering results as well as in many other related tasks. However, this family of metrics is not well adapted when datasets are unbalanced. This issue is extremely frequent in Web results, where classes are distributed following a strong unbalanced pattern. In this paper, we present a modified version of B-CUBED metrics to overcome this situation. Results in toy and real datasets indicate that the proposed adaptation correctly considers the particularities of unbalanced cases.","PeriodicalId":297035,"journal":{"name":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2766462.2767836","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6
Abstract
B-CUBED metrics have recently been adopted in the evaluation of clustering results as well as in many other related tasks. However, this family of metrics is not well adapted when datasets are unbalanced. This issue is extremely frequent in Web results, where classes are distributed following a strong unbalanced pattern. In this paper, we present a modified version of B-CUBED metrics to overcome this situation. Results in toy and real datasets indicate that the proposed adaptation correctly considers the particularities of unbalanced cases.
b - cube指标最近被用于聚类结果的评估以及许多其他相关任务。然而,当数据集不平衡时,这一系列指标不能很好地适应。这个问题在Web结果中非常常见,因为类是按照强烈的不平衡模式分布的。在本文中,我们提出了一个修改版本的B-CUBED度量来克服这种情况。在玩具和实际数据集上的结果表明,所提出的自适应方法正确地考虑了不平衡情况的特殊性。