Assessment of clustering tendency through progressive random sampling and graph-based clustering results

K. R. Prasad, B. E. Reddy
Published in: 2013 3rd IEEE International Advance Computing Conference (IACC)
Publication date: 2013-05-13
DOI: 10.1109/IADCC.2013.6514316 (https://doi.org/10.1109/IADCC.2013.6514316)
Citations: 6

Abstract

Cluster analysis is widely used in many emerging applications. Clustering tendency is commonly assessed with the Visual Assessment of cluster Tendency (VAT) algorithm, which reorders the object indices of the dissimilarity matrix following the logic of Prim's algorithm. Because it operates on the full dissimilarity matrix, VAT is computationally expensive for large datasets. This work develops a sampling technique that selects a good representative subset of the dataset and applies VAT to the corresponding sub-dissimilarity matrix. The clustering tendency is then read visually from the number of square dark blocks along the diagonal of the sample-based VAT image. The proposed method gives the same clustering tendency results as plain VAT while requiring less processing time, since it uses only the sampled dissimilarity matrix. This sample-based VAT (PSVAT) uses a set of distinguished features for the random selection of progressive sample representatives. Finally, the detected clustering tendency is used in a graph-based clustering technique (minimum spanning tree based clustering) to obtain efficient clustering results. Runtimes of PSVAT and VAT on several datasets are compared to show that PSVAT outperforms VAT in runtime, and clustering validity on the sampled data is tested with Dunn's index.
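
The pipeline described in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not specify the progressive sampling scheme or the exact MST clustering rule, so this example substitutes a plain uniform random sample and the common rule of cutting the k-1 heaviest MST edges. All function names (pairwise_dist, vat_order, mst_clusters, dunn_index) and the synthetic two-blob data are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_dist(X):
    """Euclidean dissimilarity matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def vat_order(D):
    """VAT reordering: a Prim-style traversal of the dissimilarity matrix D.

    Returns the visiting order and the MST edges as (parent, child, weight).
    """
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    start = int(np.unravel_index(np.argmax(D), D.shape)[0])
    visited = np.zeros(n, dtype=bool)
    visited[start] = True
    order = [start]
    best = D[start].copy()        # cheapest link from the visited set to each object
    parent = np.full(n, start)    # visited object that provides that link
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(visited, np.inf, best)))
        edges.append((int(parent[j]), j, float(best[j])))
        visited[j] = True
        order.append(j)
        closer = D[j] < best
        parent[closer] = j
        best = np.minimum(best, D[j])
    return np.array(order), edges


def mst_clusters(n, edges, k):
    """Cut the k-1 heaviest MST edges and label the connected components."""
    kept = sorted(edges, key=lambda e: e[2])[: n - k]
    root = np.arange(n)

    def find(i):
        while root[i] != i:
            i = root[i]
        return i

    for a, b, _ in kept:
        root[find(a)] = find(b)
    roots = np.array([find(i) for i in range(n)])
    return np.unique(roots, return_inverse=True)[1]


def dunn_index(D, labels):
    """Smallest between-cluster distance over largest cluster diameter.

    Assumes at least two clusters and at least one cluster with two members.
    """
    same = labels[:, None] == labels[None, :]
    return D[~same].min() / D[same].max()


# Synthetic data (assumption): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (200, 2)), rng.normal(5.0, 0.3, (200, 2))])

# Uniform random sample as a stand-in for the paper's progressive sampling.
idx = rng.choice(len(X), size=40, replace=False)
D = pairwise_dist(X[idx])

order, edges = vat_order(D)
D_vat = D[np.ix_(order, order)]   # reordered matrix; dark diagonal blocks suggest clusters

labels = mst_clusters(len(idx), edges, k=2)
dunn = dunn_index(D, labels)
```

Rendering D_vat as a grayscale image (for example with matplotlib's imshow) gives the sample-based VAT image. Because the reordering runs on a 40 x 40 matrix instead of the full 400 x 400 one, the cost drops roughly with the square of the sampling ratio, which is the runtime advantage the abstract claims for PSVAT.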