Title: Calibrated kNN classification via second-layer neighborhood analysis
Authors: Bastian Pfeifer, Markus Kreuzthaler
Journal: Advances in Data Analysis and Classification, vol. 20, pp. 145-165 (impact factor 1.3; JCR Q2, Statistics & Probability)
Publication type: Journal Article
Published: 2025-09-26
DOI: 10.1007/s11634-025-00654-5
Article: https://link.springer.com/article/10.1007/s11634-025-00654-5
PDF: https://link.springer.com/content/pdf/10.1007/s11634-025-00654-5.pdf
Citations: 0
Abstract
The integration of artificial intelligence (AI) into medical practice has underscored the importance of reliable predictive confidence, as it influences decision-making and the efficacy of AI-driven solutions. This paper examines a non-parametric machine learning method, with a particular focus on computing confidence scores for individual classifications. Unlike parametric approaches, non-parametric techniques such as k-nearest neighbors (kNN) offer flexibility without imposing strict assumptions on the data distribution. Leveraging this flexibility, we propose a novel kNN approach that introduces confidence-awareness through a two-layered neighborhood analysis. The approach is intended to support the classical non-parametric kNN classifier by providing more reliable and trustworthy class probabilities. Experimental evaluations on benchmark datasets and on a de-identified, real-world clinical Electronic Health Records (EHR) data table with thousands of unique class labels demonstrate the effectiveness of our approach in enhancing both prediction accuracy and certainty assessment.
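To make the idea of kNN class probabilities concrete, here is a minimal sketch in plain Python. `knn_proba` shows the classical estimate, where a class probability is simply its share of votes among the k nearest neighbors. `two_layer_proba` is only a hypothetical illustration of what a second-layer neighborhood analysis could look like (averaging the vote shares found in each neighbor's own neighborhood); the abstract does not describe the actual procedure, so this should not be read as the authors' method. All function names and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import dist


def knn_indices(X, query, k, exclude=None):
    """Indices of the k training points closest to `query` (Euclidean distance)."""
    order = sorted((i for i in range(len(X)) if i != exclude),
                   key=lambda i: dist(X[i], query))
    return order[:k]


def vote_fractions(labels, idx):
    """Class probabilities as the share of votes among the given neighbors."""
    counts = Counter(labels[i] for i in idx)
    return {c: n / len(idx) for c, n in counts.items()}


def knn_proba(X, y, query, k):
    """Classical kNN probabilities: first-layer vote fractions only."""
    return vote_fractions(y, knn_indices(X, query, k))


def two_layer_proba(X, y, query, k):
    """Hypothetical second-layer variant (an assumption, NOT the paper's method):
    average the vote fractions of each first-layer neighbor's own neighborhood."""
    first = knn_indices(X, query, k)
    totals = Counter()
    for i in first:
        # Neighborhood of the neighbor itself, excluding that neighbor.
        second = knn_indices(X, X[i], k, exclude=i)
        for c, p in vote_fractions(y, second).items():
            totals[c] += p / len(first)
    return dict(totals)


# Toy data: two well-separated clusters of three points each.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
y = ["a", "a", "a", "b", "b", "b"]

print(knn_proba(X, y, (0.2, 0.2), k=3))
print(two_layer_proba(X, y, (0.2, 0.2), k=3))
```

In this toy setup the classical estimate can only take values that are multiples of 1/k, whereas the illustrative second-layer average also reflects how mixed the surroundings of each neighbor are.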
About the journal:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on topics such as structural, quantitative, or statistical approaches to the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.