Similarity Learning for Nearest Neighbor Classification
A. M. Qamar, Éric Gaussier, J. Chevallet, Joo-Hwee Lim
2008 Eighth IEEE International Conference on Data Mining
DOI: 10.1109/ICDM.2008.81 (published 2008-12-15)
Citations: 63
Abstract
In this paper, we propose an algorithm for learning a general class of similarity measures for kNN classification. This class encompasses, among others, the standard cosine measure, as well as the Dice and Jaccard coefficients. The algorithm we propose is an extension of the voted perceptron algorithm and allows one to learn different types of similarity functions (based on diagonal, symmetric, or asymmetric similarity matrices). The results we obtained show that learning similarity measures yields significant improvements on several collections, for two prediction rules: the standard kNN rule, which was our primary goal, and a symmetric version of it.
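The abstract describes a perceptron-style approach to learning a similarity measure that is then plugged into kNN. The sketch below is only a rough illustration under our own assumptions, not the paper's algorithm: it uses a bilinear similarity s_A(x, y) = xᵀ A y with a simple perceptron-like update that pulls a same-class neighbor closer and pushes a different-class neighbor away. The class name `SimilarityPerceptron`, the helper `knn_predict`, and the update rule itself are hypothetical; the paper's actual updates, normalization (e.g., cosine/Dice/Jaccard denominators), and voting scheme differ.

```python
import numpy as np

class SimilarityPerceptron:
    """Illustrative perceptron-style learner for a bilinear similarity
    s_A(x, y) = x^T A y. A hypothetical sketch, not the paper's method."""

    def __init__(self, n_features, diagonal_only=False):
        self.A = np.eye(n_features)          # start from an identity (cosine-like) similarity
        self.diagonal_only = diagonal_only   # optionally restrict A to a diagonal matrix

    def similarity(self, x, y):
        return x @ self.A @ y

    def fit(self, X, labels, epochs=10):
        n = len(X)
        for _ in range(epochs):
            for i in range(n):
                # similarities from example i to all others under the current A
                sims = np.array([self.similarity(X[i], X[j]) for j in range(n)])
                sims[i] = -np.inf
                same = [j for j in range(n) if labels[j] == labels[i] and j != i]
                diff = [j for j in range(n) if labels[j] != labels[i]]
                if not same or not diff:
                    continue
                target = max(same, key=lambda j: sims[j])    # most similar same-class point
                impostor = max(diff, key=lambda j: sims[j])  # most similar other-class point
                # perceptron-style correction when the impostor is at least as similar
                if sims[impostor] >= sims[target]:
                    update = np.outer(X[i], X[target]) - np.outer(X[i], X[impostor])
                    if self.diagonal_only:
                        update = np.diag(np.diag(update))
                    self.A += update
        return self

def knn_predict(model, X_train, y_train, x, k=3):
    """Classify x by majority vote among the k training points most similar
    to it under the learned similarity (y_train is a NumPy array of labels)."""
    sims = np.array([model.similarity(x, xi) for xi in X_train])
    top = np.argsort(sims)[-k:]
    values, counts = np.unique(y_train[top], return_counts=True)
    return values[np.argmax(counts)]
```

A symmetric variant of the prediction rule, as mentioned in the abstract, would additionally let training points "vote" for the test point when it ranks among their own nearest neighbors; the sketch above only implements the standard kNN rule.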