Relocating Local Outliers Produced by K-means and K-medoids Using Local Outlier Rectifier V.2.0

Rogelio O. Badiang, B. Gerardo, Ruji P. Medina
Published in: 2019 International Conference on Advanced Computer Science and Information Systems (ICACSIS), 2019-10-01
DOI: 10.1109/ICACSIS47736.2019.8979741 (https://doi.org/10.1109/ICACSIS47736.2019.8979741)
Citations: 4

Abstract

Rapid growth in information and communication technology makes it easy to capture massive amounts of valuable data in many areas. These data feed various data mining techniques. In some cases, however, the dataset contains outliers. One category of outlier is the local outlier: a data point that deviates locally from its cluster center. Local outliers occur when the cluster center, known as the centroid or medoid, cannot represent all the data members of the cluster. The unrepresented data points are mistakenly assigned to their closest clusters, which makes them local outliers. This study addresses the problem of local outliers produced by K-means and K-medoids. The Local Outlier Rectifier V.2.0 (LOR V.2.0) is a method that relocates local outliers to their correct clusters. Simulations show that, when paired with K-means, LOR V.2.0 relocated 35.37%, 34.78%, 25%, and 12.28% of the local outliers in the Ionosphere, Breast Cancer Wisconsin, Iris, and Breast Cancer Coimbra datasets, respectively. When paired with K-medoids, it transferred 29.11% of Ionosphere, 29.67% of Breast Cancer Wisconsin, 25.0% of Iris, and 10.34% of Breast Cancer Coimbra local outliers to their correct clusters. The results indicate that the method works better when paired with K-means, which gave the higher relocation rate on three of the four datasets and an equal rate on Iris.
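
The comparison between the two pairings can be checked with a little arithmetic. The sketch below only tabulates the relocation rates reported in the abstract and subtracts them per dataset; it does not implement LOR V.2.0, whose steps the abstract does not describe. The names `kmeans_rates`, `kmedoids_rates`, and `rate_differences` are hypothetical and chosen for illustration.

```python
# Relocation rates (%) exactly as reported in the abstract.
kmeans_rates = {
    "Ionosphere": 35.37,
    "Breast Cancer Wisconsin": 34.78,
    "Iris": 25.0,
    "Breast Cancer Coimbra": 12.28,
}
kmedoids_rates = {
    "Ionosphere": 29.11,
    "Breast Cancer Wisconsin": 29.67,
    "Iris": 25.0,
    "Breast Cancer Coimbra": 10.34,
}


def rate_differences(a, b):
    """Per-dataset difference a - b in percentage points (hypothetical helper)."""
    return {name: round(a[name] - b[name], 2) for name in a}


diffs = rate_differences(kmeans_rates, kmedoids_rates)

# Datasets where the K-means pairing relocated a larger share of local outliers.
kmeans_ahead = [name for name, d in diffs.items() if d > 0]

for name, d in diffs.items():
    print(f"{name}: {d:+.2f} percentage points")
print(f"K-means ahead on {len(kmeans_ahead)} of {len(diffs)} datasets")
```

Under these figures the K-means pairing leads by 6.26, 5.11, and 1.94 percentage points on Ionosphere, Breast Cancer Wisconsin, and Breast Cancer Coimbra, and ties on Iris.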