Normalization in Proportional Feature Spaces
Alexandre Benatti, Luciano da F. Costa
arXiv - PHYS - Physics and Society, 2024-09-17
https://doi.org/arxiv-2409.11389

Feature normalization plays a central role in data
representation, characterization, visualization, analysis, comparison,
classification, and modeling, as it can substantially influence and be
influenced by all of these activities and respective aspects. The selection of
an appropriate normalization method needs to take into account the type and
characteristics of the involved features, the methods to be used subsequently
for the data processing just mentioned, as well as the specific questions being
considered. After briefly considering how normalization constitutes one of the
many interrelated parts typically involved in data analysis and modeling, the
present work addresses feature normalization from the perspective of uniform
and proportional (right-skewed) features and comparison operations. More
general right-skewed features are also considered in an approximate
manner. Several concepts, properties, and results are described
and discussed, including the description of a duality relationship between
uniform and proportional feature spaces and respective comparisons, specifying
conditions for consistency between comparisons in each of the two domains. Two
normalization approaches based on the non-centralized dispersion of features
are also presented, together with a modified version of the Jaccard
similarity index that intrinsically incorporates normalization. Preliminary
experiments illustrate the developed concepts and methods.
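
As a rough illustration of the ideas named above, the following sketch contrasts classic standardization with one possible normalization by a non-centralized dispersion, and includes the well-known real-valued form of the Jaccard index. The abstract does not give the authors' formulas, so everything here is an assumption: dividing by the root mean square is only one plausible reading of "non-centralized dispersion", the function names are hypothetical, and the Jaccard function is the standard min/max form, not the modified version the paper proposes.

```python
import math
import random


def standardize(xs):
    """Classic z-score: subtract the mean, divide by the standard deviation."""
    mean = sum(xs) / len(xs)
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [(x - mean) / std for x in xs]


def rms_normalize(xs):
    """Divide by the root mean square, i.e. a dispersion measured around zero
    instead of around the mean (assumed reading of 'non-centralized')."""
    rms = math.sqrt(sum(x * x for x in xs) / len(xs))
    return [x / rms for x in xs]


def weighted_jaccard(xs, ys):
    """Standard Jaccard index for non-negative real vectors: sum(min) / sum(max).
    This is NOT the modified index described in the paper."""
    num = sum(min(x, y) for x, y in zip(xs, ys))
    den = sum(max(x, y) for x, y in zip(xs, ys))
    return num / den


# Synthetic right-skewed, strictly positive feature (illustrative data only).
random.seed(0)
feature = [random.lognormvariate(0.0, 1.0) for _ in range(1000)]

z = standardize(feature)    # centered: produces negative values
r = rms_normalize(feature)  # not centered: stays positive, ratios preserved

print(feature[0] / feature[1], r[0] / r[1])    # the two ratios agree
print(weighted_jaccard([1.0, 2.0], [3.0, 1.0]))
```

Because the non-centered scaling is a pure division, ratios between samples survive it, which is the property that matters when features are compared proportionally; the z-score shifts the origin and so does not preserve those ratios.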