Data Complexity Measures and Nearest Neighbor Classifiers: A Practical Analysis for Meta-learning

George D. C. Cavalcanti, Ing Ren Tsang, Breno A. Vale
Published in: 2012 IEEE 24th International Conference on Tools with Artificial Intelligence, 2012-11-07. DOI: 10.1109/ICTAI.2012.150 (https://doi.org/10.1109/ICTAI.2012.150)
Citations: 15

Abstract

Classifier accuracy is affected by the properties of the data sets used for training. Nearest neighbor classifiers are known for being simple and accurate in several domains, but their behavior depends strongly on data complexity. Data complexity measures, in turn, aim to describe properties of data sets. This work aims to show how data complexity measures can be efficiently used to predict the behavior of the Nearest Neighbor classifier. The experimental study uses seven data complexity measures and seventeen real data sets. Each measure is analyzed individually to find a relationship between its value and the accuracy of the classifier on a given data set. No single measure is good enough to predict the behavior of the Nearest Neighbor classifier. However, the combination of these measures provides a powerful tool for predicting its accuracy.
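To make the idea of relating a complexity measure to Nearest Neighbor accuracy concrete, the following is a minimal sketch and not the authors' procedure. The abstract does not name the seven measures, so the choice of the maximum Fisher discriminant ratio (often labeled F1 in the data complexity literature) is an assumption made here for illustration, as are the function names `fisher_ratio` and `loo_1nn_accuracy`, the leave-one-out evaluation, and the two toy data sets.

```python
from math import dist
from statistics import mean, pvariance


def fisher_ratio(X, y):
    """Maximum Fisher discriminant ratio over features (two classes only).

    For each feature: (mean_a - mean_b)^2 / (var_a + var_b).
    Larger values suggest the classes are easier to separate on that feature.
    """
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = [row for row, label in zip(X, y) if label == classes[0]]
    b = [row for row, label in zip(X, y) if label == classes[1]]
    best = 0.0
    for j in range(len(X[0])):
        col_a = [row[j] for row in a]
        col_b = [row[j] for row in b]
        gap = (mean(col_a) - mean(col_b)) ** 2
        spread = pvariance(col_a) + pvariance(col_b)
        if spread == 0:
            # Zero within-class variance: perfectly separated or identical.
            ratio = float("inf") if gap > 0 else 0.0
        else:
            ratio = gap / spread
        best = max(best, ratio)
    return best


def loo_1nn_accuracy(X, y):
    """Leave-one-out accuracy of a 1-NN classifier (Euclidean distance)."""
    correct = 0
    for i, xi in enumerate(X):
        # Nearest neighbor of point i among all other points.
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: dist(xi, X[j]),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


# Hypothetical toy data: two compact, distant clusters...
X_easy = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
          [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
# ...and two classes interleaved along a diagonal line.
X_hard = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0],
          [0.5, 0.5], [1.5, 1.5], [2.5, 2.5]]
labels = [0, 0, 0, 1, 1, 1]

for name, X in (("easy", X_easy), ("hard", X_hard)):
    print(name, fisher_ratio(X, labels), loo_1nn_accuracy(X, labels))
```

On these two toy sets the measure and the accuracy move together, but two points say nothing about how reliable that relationship is in general. The abstract reports that no single measure was sufficient across seventeen real data sets and that a combination of measures worked better.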