Supervised Pattern Recognition Involving Skewed Feature Densities

arXiv - PHYS - Physics and Society Pub Date : 2024-09-02 DOI:arxiv-2409.01213

Alexandre Benatti, Luciano da F. Costa


Citations: 0

Abstract

Pattern recognition is a particularly important task underlying a great deal of scientific and technological activities. At the same time, it involves several challenges, including the choice of features to represent the data elements, as well as possible transformations of those features. In the present work, the classification potentials of the Euclidean distance and of a dissimilarity index based on the coincidence similarity index are compared by applying the k-neighbors supervised classification method to features resulting from several types of transformations of one- and two-dimensional symmetric densities. Given two groups characterized by respective densities, with or without overlap, different types of transformations are obtained and employed to quantitatively evaluate the performance of k-neighbors methodologies based on the Euclidean distance and on the coincidence similarity index. More specifically, the accuracy of classifying the intersection point between the densities of two adjacent groups is taken into account for the comparison. Several interesting results are described and discussed, including the enhanced potential of the dissimilarity index for classifying datasets with right-skewed feature densities, as well as the finding that the sharpness of the comparison between data elements can be independent of the respective supervised classification performance.
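The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not define the coincidence similarity index, so the form used here (the Jaccard index multiplied by the interiority index, for non-negative feature vectors) is an assumption, as are the function names and the toy data; this is not the authors' implementation.

```python
import numpy as np

def euclidean(x, y):
    """Ordinary Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(np.asarray(x, float) - np.asarray(y, float)))

def coincidence_dissimilarity(x, y):
    """1 - coincidence similarity, ASSUMING coincidence = Jaccard * interiority
    for non-negative features (an assumed form, not taken from the abstract)."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    mins = np.minimum(x, y).sum()
    maxs = np.maximum(x, y).sum()
    smaller = min(x.sum(), y.sum())
    if maxs == 0:          # both vectors are all zeros: treat as identical
        return 0.0
    if smaller == 0:       # exactly one vector is all zeros: no overlap
        return 1.0
    jaccard = mins / maxs
    interiority = mins / smaller
    return 1.0 - jaccard * interiority

def knn_predict(train_x, train_y, query, k, dist):
    """Label a query by majority vote among its k nearest training points."""
    d = np.array([dist(x, query) for x in train_x])
    nearest = np.argsort(d, kind="stable")[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical one-dimensional, right-skewed toy data (two groups).
train_x = [[1.0], [2.0], [8.0], [16.0]]
train_y = [0, 0, 1, 1]
query = [4.5]

label_euclidean = knn_predict(train_x, train_y, query, 1, euclidean)
label_coincidence = knn_predict(train_x, train_y, query, 1, coincidence_dissimilarity)
print(label_euclidean, label_coincidence)
```

On this toy data the two measures disagree: the Euclidean distance picks the absolutely closest neighbor (2.0), while the assumed coincidence-based dissimilarity compares values by ratio and picks 8.0. This only illustrates how relative comparison can shift decisions on right-skewed features; it does not reproduce the paper's experiments.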

