Sum of Euclidean Distance Differences and Sum of Absolute Manhattan Distance Differences: multicriteria decision making tools for small data tables
Károly Héberger
Analytica Chimica Acta, journal article, published 2025-09-17. DOI: 10.1016/j.aca.2025.344649 (https://doi.org/10.1016/j.aca.2025.344649)
Background
Despite its advantages, rank transformation inevitably leads to information loss. This work extends the sum of ranking differences (SRD) algorithm to non-ranking environments by elaborating a new algorithm that overcomes this difficulty. The procedure was developed by analogy with SRD, i.e., pairwise comparisons of (column) vectors, fixing one of them as the gold standard, and introducing two validation steps (the randomization test and the Wilcoxon test after assigning uncertainties by cross-validation).
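For orientation, the rank-based starting point can be sketched as follows. This is a minimal illustration of the classic SRD idea (rank each column and the gold standard, then sum the absolute rank differences), not the author's implementation. The function names, the example values and the treatment of ties by average ranks are assumptions made for this sketch.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srd(column: Sequence[float], reference: Sequence[float]) -> float:
    """Sum of absolute rank differences between a column and the reference."""
    col_ranks = average_ranks(column)
    ref_ranks = average_ranks(reference)
    return sum(abs(c - r) for c, r in zip(col_ranks, ref_ranks))


if __name__ == "__main__":
    # Hypothetical five-row example; the reference column plays the gold standard.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    method_a = [1.1, 2.2, 2.9, 4.3, 5.1]  # same ordering as the reference
    method_b = [5.0, 4.0, 3.0, 2.0, 1.0]  # reversed ordering
    print(srd(method_a, reference), srd(method_b, reference))
```

Because only ranks enter the sum, `method_a` is indistinguishable from the reference even though its values differ, which is the information loss the abstract refers to.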
Results
Two emblematic distance metrics were involved in the development: the most frequently applied Euclidean distance and its robust counterpart, the city block (Manhattan) distance. In this way, two new dissimilarity measures have been defined: Sum of Euclidean Distance Differences (DnE) and Sum of Absolute Manhattan Distance Differences (DnM), along with their randomization tests and analysis of variance (ANOVA). Unfortunately, leaving the safe rank environment also means leaving the well-known permutations and the theoretical background (Spearman footrule). This study is limited to a maximum of eight rows in the input matrix, where exact theoretical random distributions are available. Sixteen carefully chosen data sets cover a wide range of scientific disciplines and of input matrix sizes: from three to 80 columns and from five to eight rows. Three case studies illustrate the advantages and disadvantages of the new dissimilarity measures and statistical tests.
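The two underlying metrics can be illustrated with a short sketch. The abstract does not state the exact formulas of DnE and DnM, so the code below only computes the plain Euclidean and Manhattan distances between one column and a gold-standard column. The function names and the example values are hypothetical.

```python
import math
from typing import Sequence


def euclidean(column: Sequence[float], reference: Sequence[float]) -> float:
    """Euclidean distance: square root of the summed squared differences."""
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(column, reference)))


def manhattan(column: Sequence[float], reference: Sequence[float]) -> float:
    """City block (Manhattan) distance: sum of the absolute differences."""
    return sum(abs(c - r) for c, r in zip(column, reference))


if __name__ == "__main__":
    # Hypothetical five-row columns; only the last row of `outlier` deviates.
    reference = [1.0, 2.0, 3.0, 4.0, 5.0]
    outlier = [1.0, 2.0, 3.0, 4.0, 15.0]
    spread = [3.0, 4.0, 5.0, 6.0, 7.0]  # every row deviates by 2
    for name, col in (("outlier", outlier), ("spread", spread)):
        print(name, euclidean(col, reference), manhattan(col, reference))
```

Both example columns are at the same Manhattan distance from the reference, while the Euclidean distance is larger for the column with the single large deviation. This is the sense in which the city block distance is the more robust of the two.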
Significance
Superior discrimination ability characterizes DnE and DnM; they provide more sophisticated ranking (and grouping) patterns than SRD despite their smaller visualization (applicability) domain. The sensitivity of the randomization test decreases in the order SRD > DnE > DnM. DnE and DnM yield clustering patterns that differ from those of SRD and from each other, but (almost) the same ordering. Hence, only one of them is recommended in a ranking environment. Although the random distributions of DnE and DnM are slightly distorted, the probability of a type I error (say 5%) can safely be determined from the cumulative frequencies. A comprehensive enumeration of advantages and disadvantages has been compiled for SRD, DnE and DnM as dissimilarity measures, clustering tools and multicriteria decision making (MCDM) techniques, together with their competitors. While preserving the great advantages of SRD (simplicity, generality, MCDM character and lack of subjective weights), both new techniques are suitable dissimilarity measures, clustering tools and MCDM tools in non-ranking environments. DnE and DnM also provide universal scales for subsequent ANOVA and Wilcoxon tests.
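Reading a 5% limit from cumulative frequencies can be sketched for the rank-based case, where the random distribution is fully defined by permutations. The code below enumerates every permutation of n ranks (feasible for at most eight rows, i.e. 8! = 40320 permutations), tabulates the sum of absolute rank differences, and returns the largest value whose cumulative relative frequency does not exceed the chosen level. It is an illustration for SRD without ties, not the paper's procedure for DnE or DnM, and the function names are assumptions.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, Optional


def exact_srd_distribution(n: int) -> Dict[int, int]:
    """Frequency of each SRD value over all n! permutations of the ranks 1..n."""
    identity = range(1, n + 1)
    counts = Counter(
        sum(abs(p - r) for p, r in zip(perm, identity))
        for perm in permutations(identity)
    )
    return dict(sorted(counts.items()))


def lower_tail_threshold(dist: Dict[int, int], alpha: float = 0.05) -> Optional[int]:
    """Largest value whose cumulative relative frequency is <= alpha, else None."""
    total = sum(dist.values())
    cumulative = 0
    threshold = None
    for value, count in dist.items():  # values are visited in ascending order
        cumulative += count
        if cumulative / total <= alpha:
            threshold = value
        else:
            break
    return threshold


if __name__ == "__main__":
    dist = exact_srd_distribution(8)
    print("number of permutations:", sum(dist.values()))
    print("5% limit read from cumulative frequencies:", lower_tail_threshold(dist))
```

A column whose SRD value is at or below the returned limit would be judged unlikely to resemble the gold standard by chance alone at the chosen error level. For very small n the function can return `None`, because even the smallest value already exceeds the level.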
About the journal:
Analytica Chimica Acta has an open access mirror journal Analytica Chimica Acta: X, sharing the same aims and scope, editorial team, submission system and rigorous peer review.
Analytica Chimica Acta provides a forum for the rapid publication of original research, and critical, comprehensive reviews dealing with all aspects of fundamental and applied modern analytical chemistry. The journal welcomes the submission of research papers which report studies concerning the development of new and significant analytical methodologies. In determining the suitability of submitted articles for publication, particular scrutiny will be placed on the degree of novelty and impact of the research and the extent to which it adds to the existing body of knowledge in analytical chemistry.