Comparison of Normalization Techniques on Data Sets With Outliers

Impact factor 0.6; JCR Q4; Computer Science, Information Systems
Nazanin Vafaei, Rita Almeida Ribeiro, L. Camarinha-Matos
International Journal of Decision Support System Technology, pp. 1-17. Journal article, published 2022-01-01.
DOI: https://doi.org/10.4018/ijdsst.286184
Citations: 2

Abstract

As data-rich systems grow rapidly, complex decision problems with skewed input data sets and their outliers become unavoidable. In general, data skewness refers to a non-uniform distribution in a data set, i.e., a data set that contains asymmetries and/or outliers. Normalization is the first step of most multi-criteria decision making (MCDM) problems: it produces dimensionless data from heterogeneous input data sets, which enables aggregation of criteria and thereby ranking of alternatives. Therefore, when criteria data sets contain outliers, finding a suitable normalization technique is of utmost importance. In this work, we compare seven normalization techniques (Max, Max-Min, Vector, Sum, Logarithmic, Target-based, and Fuzzification) on criteria data sets that contain outliers, and analyse their results for MCDM problems. A numerical example illustrates the behaviour of the chosen normalization techniques, and an (ongoing) evaluation assessment framework is used to recommend the best normalization technique for this type of criteria.
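To make the effect of an outlier concrete, the following is a minimal sketch of four of the seven named techniques (Max, Max-Min, Sum, Vector) in their common textbook forms for a benefit criterion. The abstract does not give the formulas, so these definitions, the function names, and the sample values are assumptions for illustration and may differ from the paper's exact formulations; Logarithmic, Target-based, and Fuzzification are omitted because they need extra parameters the abstract does not specify.

```python
import math

# Assumed textbook forms for a benefit criterion (larger is better).
# These are illustrative and not taken from the paper itself.

def max_norm(values):
    """Max: divide each value by the largest value."""
    top = max(values)
    return [v / top for v in values]

def max_min_norm(values):
    """Max-Min: rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def sum_norm(values):
    """Sum: divide each value by the sum of all values."""
    total = sum(values)
    return [v / total for v in values]

def vector_norm(values):
    """Vector: divide each value by the Euclidean length of the column."""
    length = math.sqrt(sum(v * v for v in values))
    return [v / length for v in values]

# Hypothetical criterion column: four ordinary values and one outlier (100).
scores = [10.0, 12.0, 11.0, 13.0, 100.0]

for name, fn in [("Max", max_norm), ("Max-Min", max_min_norm),
                 ("Sum", sum_norm), ("Vector", vector_norm)]:
    print(f"{name:8s}", [round(r, 3) for r in fn(scores)])
```

With this sample column, Max maps the four ordinary values into the range 0.10 to 0.13, and Max-Min compresses them further, into 0 to roughly 0.033, because the single outlier fixes the upper end of the scale. This compression is the kind of behaviour that makes the choice of technique matter when criteria contain outliers.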
Source journal: International Journal of Decision Support System Technology (Computer Science, Information Systems)
CiteScore: 2.20
Self-citation rate: 18.20%
Articles published: 40