SOM-empowered graph segmentation for fast automatic clustering of large and complex data

E. Merényi, Joshua Taylor
Published in: 2017 12th International Workshop on Self-Organizing Maps and Learning Vector Quantization, Clustering and Data Visualization (WSOM)
Publication date: 2017-06-28
DOI: 10.1109/WSOM.2017.8020004 (https://doi.org/10.1109/WSOM.2017.8020004)
Citations: 9

Abstract

Many clustering methods, including modern graph segmentation algorithms, run into limitations when encountering “Big Data”: data with high feature dimensionality, large volume, and complex structure. SOM-based clustering has been demonstrated to accurately capture many clusters of widely varying statistical properties in such data. While a number of automated SOM segmentations have been put forward, the best identifications of complex cluster structures to date are those performed interactively from informative visualizations of the learned SOM's knowledge. This interactive approach does not scale to Big Data, large archives, or near-real-time analyses for fast decision-making. We present a new automated approach to SOM segmentation that closely approximates the precision of the interactive method for complicated data while being very fast and memory-efficient. We achieve this by infusing SOM knowledge into leading graph segmentation algorithms which, by themselves, produce extremely poor results when segmenting the SOM prototypes. We use the SOM prototypes as input vectors and the CONN similarity measure, derived from the SOM's knowledge of the data connectivity, as the edge weighting for the graph segmentation algorithms. We demonstrate the effectiveness of the approach on synthetic data and on real spectral imagery.
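The idea of weighting a prototype graph by CONN can be illustrated with a small sketch. It assumes the definition of CONN commonly given in the CONNvis literature (the number of data vectors for which two prototypes are the best and second-best matching units, in either order); the abstract itself does not spell this out. The function names `conn_matrix` and `segment`, the toy data, and the `min_weight` parameter are all hypothetical. The segmentation step is a deliberately simple stand-in (connected components of the thresholded graph), not one of the leading graph segmentation algorithms the paper refers to.

```python
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def conn_matrix(data, prototypes):
    """Assumed CONN definition: conn[(i, j)], with i < j, counts the data
    vectors whose best and second-best matching prototypes are i and j."""
    conn = defaultdict(int)
    for x in data:
        ranked = sorted(range(len(prototypes)),
                        key=lambda k: squared_distance(x, prototypes[k]))
        bmu, second = ranked[0], ranked[1]
        conn[(min(bmu, second), max(bmu, second))] += 1
    return dict(conn)


def segment(n_prototypes, conn, min_weight=1):
    """Stand-in segmentation (not the paper's method): label the connected
    components of the CONN graph after dropping edges below min_weight."""
    neighbors = defaultdict(set)
    for (i, j), weight in conn.items():
        if weight >= min_weight:
            neighbors[i].add(j)
            neighbors[j].add(i)
    labels = [-1] * n_prototypes
    current = 0
    for start in range(n_prototypes):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in neighbors[node]:
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels


# Toy example: four prototypes forming two well-separated pairs.
prototypes = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
data = [(0.2, 0.0), (0.8, 0.0), (10.1, 0.0), (10.9, 0.0)]

conn = conn_matrix(data, prototypes)
labels = segment(len(prototypes), conn)
print(conn)    # {(0, 1): 2, (2, 3): 2}
print(labels)  # [0, 0, 1, 1]
```

In this toy case no data vector links the two pairs of prototypes, so the CONN graph falls apart into two components. A real pipeline would pass the CONN weights to a proper graph segmentation algorithm rather than rely on thresholding.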