A unified weighting framework for evaluating nearest neighbour classification
Oliver Urs Lenz, Henri Bollaert, Chris Cornelis
Fuzzy Sets and Systems, vol. 519, Article 109516, published 2025-06-27. DOI: 10.1016/j.fss.2025.109516

We present the first comprehensive and large-scale evaluation of classical (NN), fuzzy (FNN) and fuzzy rough (FRNN) nearest neighbour classification. We standardise existing proposals for nearest neighbour weighting with kernel functions, applied to the distance values and/or ranks of the nearest neighbours of a test instance. In particular, we show that the theoretically optimal Samworth weights converge to a kernel. Kernel functions are closely related to fuzzy negation operators, and we propose a new kernel based on Yager negation. We also consider various distance and scaling measures, which we show can be related to each other. Through a systematic series of experiments on 85 real-life classification datasets, we find that NN, FNN and FRNN all perform best with Boscovich distance, and that NN and FRNN perform best with a combination of Samworth rank- and distance-weights and scaling by the mean absolute deviation around the median (r1), the standard deviation (r2) or the semi-interquartile range (r∞*), while FNN performs best with only Samworth distance-weights and r1- or r2-scaling. However, NN achieves comparable performance with Yager-1/2 distance-weights, which are simpler to implement than a combination of Samworth distance- and rank-weights. Finally, FRNN generally outperforms NN, which in turn performs systematically better than FNN.
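As a purely illustrative sketch of the kind of kernel-weighted nearest neighbour voting the abstract describes, the snippet below combines Boscovich (L1, i.e. Manhattan) distance with distance-weights derived from the Yager negation N(u) = (1 − u^p)^(1/p), using p = 1/2 for the "Yager-1/2" case. The normalisation of neighbour distances by the largest of the k distances is an assumption for this sketch, not necessarily the paper's exact formulation, and the sketch omits the Samworth weights and the FRNN upper/lower approximations studied in the article.

```python
import numpy as np

def yager_kernel(u, p=0.5):
    """Weighting kernel from the Yager negation N(u) = (1 - u**p)**(1/p).

    With p = 1/2 this gives (1 - sqrt(u))**2: weight 1 at distance 0,
    decreasing to 0 as the normalised distance u approaches 1.
    """
    u = np.clip(u, 0.0, 1.0)
    return (1.0 - u ** p) ** (1.0 / p)

def weighted_nn_predict(X_train, y_train, x, k=5, p=0.5):
    """Classify x by kernel-weighted nearest neighbour voting.

    Distances are Boscovich (L1); each of the k nearest neighbours votes
    for its class with weight yager_kernel(d_i / d_max), where d_max is
    the largest of the k neighbour distances (an illustrative choice).
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    # Boscovich (L1) distance from x to every training instance.
    d = np.abs(X_train - np.asarray(x, dtype=float)).sum(axis=1)
    order = np.argsort(d)[:k]          # indices of the k nearest neighbours
    d_k = d[order]
    weights = yager_kernel(d_k / (d_k.max() + 1e-12), p=p)
    # Accumulate weighted votes per class label.
    votes = {}
    for label, w in zip(y_train[order], weights):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)
```

For example, with two well-separated clusters labelled 0 and 1, a query near the first cluster is assigned label 0 because its nearer neighbours receive larger kernel weights.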
Journal introduction:
Since its launch in 1978, the journal Fuzzy Sets and Systems has been devoted to the international advancement of the theory and application of fuzzy sets and systems. The theory of fuzzy sets now encompasses a well-organized corpus of basic notions, including (but not restricted to) aggregation operations, a generalized theory of relations, specific measures of information content, and a calculus of fuzzy numbers. Fuzzy sets are also the cornerstone of a non-additive uncertainty theory, namely possibility theory, and of a versatile tool for both linguistic and numerical modeling: fuzzy rule-based systems. Numerous works now combine fuzzy concepts with other scientific disciplines as well as modern technologies.
In mathematics, fuzzy sets have triggered new research topics in connection with category theory, topology, algebra, and analysis. Fuzzy sets are also part of a recent trend in the study of generalized measures and integrals, and are combined with statistical methods. Furthermore, fuzzy sets have strong logical underpinnings in the tradition of many-valued logics.