Distribution Consistency Penalty in the Quadratic Kappa Loss for Ordinal Regression of Imbalanced Datasets

Proceedings of the 2021 5th International Conference on Computer Science and Artificial Intelligence. Publication date: 2021-12-04. DOI: 10.1145/3507548.3507612

Bin-Bin Yang, Shengjie Zhao, Kenan Ye, Rongqing Zhang

Abstract

Ordinal regression is a common deep learning problem involving inherently ordered labels, which arise frequently in practical applications, especially in medical diagnosis tasks. Treating the task purely as classification or regression neglects the ordered or non-stationary property of the labels. To overcome this, quadratic weighted kappa (QWK), an efficient evaluation metric for ordinal regression, has been proposed as the basis of the QWK loss function. However, because of the paradox that kappa is higher with an asymmetrical marginal histogram, the QWK loss function can converge during training to a local optimum whose confusion matrix contains an all-zero column. In practice, an all-zero column means that a certain category is not detected at all, which can have serious consequences for the exclusion of pathology. To address this limitation, we propose a new form of penalty term for the QWK loss function that penalizes the distance between the marginal histograms, effectively preventing all-zero columns in the models' confusion matrices. Experiments on category-imbalanced datasets demonstrate that our penalty terms solve the all-zero-column problem. On the Adience dataset our penalty terms achieve 0.915 QWK, 0.446 MAE and 0.612 accuracy, while on the DR dataset they achieve 0.744 QWK, 0.281 MAE and 0.810 accuracy. In addition, experiments on the category-balanced HCI dataset show that our penalty terms achieve 0.810 QWK, 0.499 MAE and 0.610 accuracy.
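The all-zero-column effect described in the abstract can be illustrated with a small sketch. The code below computes QWK from a confusion matrix using the standard definition, lists the classes that are never predicted, and measures a squared distance between the true-label and predicted-label marginal histograms. The abstract does not give the exact form of the penalty or how it is combined with the loss, so `marginal_distance`, `penalized_objective`, and the weight `lam` are hypothetical stand-ins, not the authors' formulation.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix: rows are true labels, columns are predicted labels."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def marginals(cm):
    """Normalized marginal histograms of true labels and predicted labels."""
    total = sum(sum(row) for row in cm)
    true_hist = [sum(row) / total for row in cm]
    pred_hist = [sum(row[j] for row in cm) / total for j in range(len(cm))]
    return true_hist, pred_hist


def quadratic_weighted_kappa(cm):
    """Standard QWK: 1 - (weighted observed) / (weighted expected disagreement).

    Assumes at least two classes and a non-zero expected disagreement.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    true_hist, pred_hist = marginals(cm)
    observed = 0.0
    expected = 0.0
    for i in range(n):
        for j in range(n):
            w = (i - j) ** 2 / (n - 1) ** 2  # quadratic disagreement weight
            observed += w * cm[i][j] / total
            expected += w * true_hist[i] * pred_hist[j]
    return 1.0 - observed / expected


def empty_columns(cm):
    """Indices of classes that the model never predicts."""
    n = len(cm)
    return [j for j in range(n) if sum(row[j] for row in cm) == 0]


def marginal_distance(cm):
    """Hypothetical penalty: squared L2 distance between the two marginals."""
    true_hist, pred_hist = marginals(cm)
    return sum((t - p) ** 2 for t, p in zip(true_hist, pred_hist))


def penalized_objective(cm, lam=1.0):
    """Hypothetical combination: (1 - QWK) plus a weighted marginal penalty."""
    return (1.0 - quadratic_weighted_kappa(cm)) + lam * marginal_distance(cm)


# Imbalanced toy data: class 1 is rare and is never predicted.
y_true = [0] * 6 + [1] * 2 + [2] * 2
y_pred = [0] * 6 + [0] * 2 + [2] * 2
cm = confusion_matrix(y_true, y_pred, num_classes=3)

print("QWK:", quadratic_weighted_kappa(cm))
print("never-predicted classes:", empty_columns(cm))
print("marginal distance:", marginal_distance(cm))
print("penalized objective:", penalized_objective(cm))
```

In this toy case QWK stays fairly high even though class 1 is never predicted, while the marginal distance is non-zero. A term of this kind therefore gives the optimizer a reason to move away from a solution with an empty column, which is the intuition the abstract describes.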
