Normalization in Proportional Feature Spaces

Alexandre Benatti, Luciano da F. Costa
arXiv - PHYS - Physics and Society, published 2024-09-17. DOI: arxiv-2409.11389 (https://doi.org/arxiv-2409.11389)

Abstract

Feature normalization plays a central role in data representation, characterization, visualization, analysis, comparison, classification, and modeling, as it can substantially influence, and be influenced by, all of these activities. Selecting an appropriate normalization method requires taking into account the type and characteristics of the features involved, the methods to be applied subsequently in the data processing just mentioned, and the specific questions being considered. After briefly considering how normalization constitutes one of the many interrelated parts typically involved in data analysis and modeling, the present work addresses feature normalization from the perspective of uniform and proportional (right-skewed) features and comparison operations. More general right-skewed features are also considered in an approximate manner. Several concepts, properties, and results are described and discussed, including a duality relationship between uniform and proportional feature spaces and their respective comparisons, specifying conditions for consistency between comparisons in the two domains. Two normalization possibilities based on the non-centralized dispersion of features are also presented, along with a modified version of the Jaccard similarity index that intrinsically incorporates normalization. Preliminary experiments are presented to illustrate the developed concepts and methods.
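The abstract does not give the formulas, so the following is only a rough sketch of the two kinds of operation it mentions, under stated assumptions. It assumes that "non-centralized dispersion" can be illustrated by dividing each feature by its root mean square about the origin (no mean subtraction), and it uses the standard real-valued Jaccard index (sum of minima over sum of maxima) rather than the modified index proposed in the paper. The function names `noncentral_scale` and `jaccard_real` are hypothetical.

```python
import math


def noncentral_scale(rows):
    """Divide each feature (column) by its root mean square about the origin.

    Assumption: this stands in for a 'non-centralized dispersion'
    normalization; the mean is NOT subtracted, so zero stays at zero.
    """
    n, m = len(rows), len(rows[0])
    rms = [math.sqrt(sum(r[j] ** 2 for r in rows) / n) for j in range(m)]
    return [[r[j] / rms[j] if rms[j] > 0 else 0.0 for j in range(m)]
            for r in rows]


def jaccard_real(x, y):
    """Standard real-valued Jaccard index for non-negative vectors.

    This is the usual sum(min) / sum(max) form, not the paper's
    modified index.
    """
    if any(v < 0 for v in x) or any(v < 0 for v in y):
        raise ValueError("features must be non-negative")
    denom = sum(max(a, b) for a, b in zip(x, y))
    if denom == 0:
        return 1.0  # convention chosen here: two all-zero vectors are identical
    return sum(min(a, b) for a, b in zip(x, y)) / denom


# Two features on very different scales; scaling keeps either from dominating.
data = [[1.0, 100.0], [2.0, 300.0], [4.0, 200.0]]
scaled = noncentral_scale(data)
print(jaccard_real(scaled[0], scaled[1]))
```

One property worth noting in this sketch: multiplying both vectors by the same positive constant leaves `jaccard_real` unchanged, so the comparison depends on proportions rather than absolute magnitudes.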