Title: FWLMkNN: Efficient functional K-nearest neighbor based on clustering and functional data analysis
Authors: Mohammed Sabri, Rosanna Verde, Antonio Balzanella
Journal: Expert Systems with Applications, volume 292, Article 128567 (JCR Q1, Computer Science, Artificial Intelligence; impact factor 7.5)
Published: 2025-06-11
DOI: 10.1016/j.eswa.2025.128567
URL: https://www.sciencedirect.com/science/article/pii/S0957417425021864
Citations: 0
Abstract
The growth of data recorded as continuous sequences of observations varying over time and space, such as curves, surfaces, and trajectories, has established the fundamental role of functional data analysis (FDA) in modern statistical methodology. This paper introduces a classification framework that improves the accuracy of functional data classifiers by merging the strengths of functional supervised and unsupervised learning techniques. It introduces a new objective function for the unsupervised learning stage to discover patterns that are critical for the successful classification of functional data. The process begins with a clustering phase, a preprocessing step whose results guide the subsequent classification. Optimizing the new objective function partitions the original classes of the training set into distinct subgroups by decreasing the variability within each subgroup of a given class while improving the separation between these subgroups and those of other classes. The algorithm automatically determines representative subgroups and the weights assigned to the variables. The weight optimization technique identifies the most discriminative variables for clustering by dynamically adjusting weights to minimize the influence of noise-inducing features in the classification process. Hence, this strategy allows for a more efficient and robust classification. Our proposal employs a weighted local mean k-nearest neighbor (KNN) approach within the classification phase. The proposed methodology leverages the augmented label space derived from the initial clustering phase, enhancing the classification process. 
Specifically, the method entails identifying the k nearest neighbors within each subgroup, computing k distinct local mean vectors, and then using these vectors to determine their weighted distance to the query sample. The query sample is then classified by assigning it to the category with the minimum distance. The proposed methodology was evaluated on both synthetic datasets and established real-world datasets. Experimental results demonstrate significant reductions in classification error rate compared to state-of-the-art methods, highlighting the framework’s robustness across diverse data. Furthermore, we validate our approach through a practical case study on seasonal classification of Italian electricity load curves, demonstrating its effectiveness in real-world energy management applications.
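
The classification step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: curves are taken to be sampled on a common grid (one row per curve), the variable weights `w` and the subgroup labels are taken as already produced by the clustering phase, the k local means are taken to be the means of the 1, 2, ..., k nearest neighbors, and the k weighted distances are combined by a simple average. The names `weighted_local_mean_knn`, `subgroup`, and `subgroup_to_class` are hypothetical.

```python
import numpy as np

def weighted_local_mean_knn(X_train, subgroup, subgroup_to_class, w, x_query, k=3):
    """Illustrative weighted local-mean kNN over subgroups (assumed details)."""
    best_class, best_dist = None, np.inf
    for g in np.unique(subgroup):
        Xg = X_train[subgroup == g]
        # Weighted squared Euclidean distance from the query to each curve in subgroup g.
        d = ((Xg - x_query) ** 2 * w).sum(axis=1)
        kk = min(k, len(Xg))
        neighbours = Xg[np.argsort(d)[:kk]]
        # Assumption: the j-th local mean is the mean of the j nearest neighbours.
        local_means = np.cumsum(neighbours, axis=0) / np.arange(1, kk + 1)[:, None]
        # Assumption: combine the kk weighted distances by a plain average.
        dist = ((local_means - x_query) ** 2 * w).sum(axis=1).mean()
        if dist < best_dist:
            best_class, best_dist = subgroup_to_class[int(g)], dist
    return best_class, best_dist

# Toy data: class 0 has two subgroups (levels 0 and 4), class 1 has one (level 2).
levels = np.array([0.0, 0.1, -0.1, 4.0, 4.1, 3.9, 2.0, 2.1, 1.9])
X_train = np.tile(levels[:, None], (1, 5))      # 9 constant curves on a 5-point grid
subgroup = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
subgroup_to_class = {0: 0, 1: 0, 2: 1}
w = np.full(5, 0.2)                             # uniform variable weights

label, dist = weighted_local_mean_knn(
    X_train, subgroup, subgroup_to_class, w, np.full(5, 3.8), k=3
)
print(label, dist)
```

Searching inside each subgroup rather than each class is what lets a class made of several well-separated groups, like class 0 in the toy data, be matched through whichever of its subgroups lies closest to the query.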
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.