Analysis of multi-class classification performance metrics for remote sensing imagery imbalanced datasets

Andrea González-Ramírez, Josué López, Deni Torres, Israel Yañez-Vargas
Journal of Quantitative and Statistical Analysis, published 2021-06-30
DOI: 10.35429/jqsa.2021.22.8.11.17 (https://doi.org/10.35429/jqsa.2021.22.8.11.17)
Citations: 3

Abstract

Remote sensing imaging datasets for classification generally present high levels of imbalance between the classes of interest. This work presents a study of a set of performance evaluation metrics for an imbalanced dataset. A support vector machine (SVM) was used to classify seven classes of interest in a popular dataset called Salinas-A. The classifier was evaluated using two types of metrics: 1) metrics for multi-class classification, and 2) metrics based on the binary confusion matrix. The results compare the scores of each metric; some are more optimistic than others because of the bias they exhibit under class imbalance. In addition, our case study supports the conclusion that the Matthews correlation coefficient (MCC) presents the lowest bias in imbalanced cases and is regarded as a robust metric. These results can be extended to any imbalanced dataset by taking into account the equations developed by Luque.
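The contrast the abstract draws between optimistic metrics and MCC can be illustrated with a minimal sketch. It is not the authors' code: the function names and the confusion matrices are hypothetical, chosen only for illustration, and the multi-class MCC uses the standard confusion-matrix generalisation rather than any formula confirmed by the source. Rows are assumed to hold true classes and columns predicted classes.

```python
import numpy as np


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    return float(np.trace(cm) / cm.sum())


def multiclass_mcc(cm):
    """Multi-class MCC computed directly from a K x K confusion matrix."""
    cm = np.asarray(cm, dtype=float)
    c = np.trace(cm)      # correctly classified samples
    s = cm.sum()          # total samples
    t = cm.sum(axis=1)    # true count per class (row sums)
    p = cm.sum(axis=0)    # predicted count per class (column sums)
    num = c * s - np.dot(p, t)
    den = np.sqrt((s ** 2 - np.dot(p, p)) * (s ** 2 - np.dot(t, t)))
    # Convention assumed here: return 0 when the denominator vanishes.
    return float(num / den) if den > 0 else 0.0


def one_vs_rest_mcc(cm, k):
    """Binary MCC for class k against all other classes."""
    cm = np.asarray(cm, dtype=float)
    tp = cm[k, k]
    fn = cm[k, :].sum() - tp
    fp = cm[:, k].sum() - tp
    tn = cm.sum() - tp - fn - fp
    den = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return float((tp * tn - fp * fn) / den) if den > 0 else 0.0


# Hypothetical imbalanced case: every sample is assigned to the majority class.
majority_only = [[90, 0, 0],
                 [5, 0, 0],
                 [5, 0, 0]]

print("accuracy:", overall_accuracy(majority_only))
print("multi-class MCC:", multiclass_mcc(majority_only))
print("one-vs-rest MCC, class 0:", one_vs_rest_mcc(majority_only, 0))
```

In this made-up case the classifier never predicts the two minority classes, yet overall accuracy stays high because the majority class dominates the sample count, while MCC falls to zero. This is the kind of optimism under imbalance that the abstract describes.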