On dynamic data clustering and visualization using swarm intelligence

Esin Saka, O. Nasraoui
Published in: 2010 IEEE 26th International Conference on Data Engineering Workshops (ICDEW 2010), 2010-03-01
DOI: 10.1109/ICDEW.2010.5452721 (https://doi.org/10.1109/ICDEW.2010.5452721)
Citations: 11

Abstract

Simultaneously clustering and visualizing high-dimensional sparse data is an attractive goal, yet a challenging problem. Our previous studies, which used a special type of swarm known as flocks of agents, provided promising approaches to this problem on several limited-size UCI machine learning data sets and on Web usage sessions (from Web access logs) [1], [2]. However, dynamic domains, such as practically any data generated on the Web, may require frequent and costly updates of the clusters (and of the visualization) whenever new data records are added to the data set. The incoming data may be due to new user activity on a website (clickstreams) or a search engine (queries), or to new Web pages in the case of document clustering. Additionally, data records may cause the clustering to change over time. Clusters may therefore need to be updated, which leads to the need to mine dynamic clusters. This paper summarizes our initial studies in designing a simultaneous clustering and visualization algorithm and proposes the Dynamic-FClust algorithm, which uses flocks of agents as a biological metaphor. The algorithm belongs to the swarm-based clustering family, which is unique compared to other approaches because its model is an ongoing swarm of agents that interact socially with each other and is therefore inherently dynamic.
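To make the idea of an "ongoing swarm" concrete, the following is a minimal Python sketch of a generic flock-style clustering loop. It is not the Dynamic-FClust algorithm, whose rules the abstract does not specify. The class `ToyFlock`, the helper `cosine`, the parameters `sim_threshold`, `repel_radius`, `step`, and `link_radius`, and the attraction/repulsion rules are all assumptions chosen for illustration. Each data record is one agent on a 2-D plane, so the agent positions serve as the visualization and nearby agents form the clusters; a new record simply joins the running flock.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {feature: weight} dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyFlock:
    """Hypothetical toy model: one agent per record, positioned on a 2-D plane."""

    def __init__(self, sim_threshold=0.5, repel_radius=0.3, step=0.05, seed=0):
        # All names and default values here are illustrative assumptions.
        # The step is sized for a handful of agents; a larger flock needs a smaller one.
        self.sim_threshold = sim_threshold
        self.repel_radius = repel_radius
        self.step = step
        self.rng = random.Random(seed)
        self.records = []
        self.positions = []

    def add_record(self, record):
        """A new record joins the running flock as one more agent; nothing restarts."""
        self.records.append(record)
        self.positions.append((self.rng.random(), self.rng.random()))
        return len(self.records) - 1

    def iterate(self, rounds):
        for _ in range(rounds):
            updated = []
            for i, (xi, yi) in enumerate(self.positions):
                dx = dy = 0.0
                for j, (xj, yj) in enumerate(self.positions):
                    if i == j:
                        continue
                    vx, vy = xj - xi, yj - yi
                    dist = math.hypot(vx, vy)
                    if cosine(self.records[i], self.records[j]) >= self.sim_threshold:
                        dx += vx  # attraction toward a similar agent
                        dy += vy
                    elif 0.0 < dist < self.repel_radius:
                        push = (self.repel_radius - dist) / dist
                        dx -= push * vx  # repulsion from a nearby dissimilar agent
                        dy -= push * vy
                updated.append((xi + self.step * dx, yi + self.step * dy))
            self.positions = updated  # all agents move at once

    def clusters(self, link_radius=0.1):
        """Group agents whose positions are chained together within link_radius."""
        unvisited = set(range(len(self.positions)))
        groups = []
        while unvisited:
            stack = [unvisited.pop()]
            group = set(stack)
            while stack:
                i = stack.pop()
                near = {j for j in unvisited
                        if math.dist(self.positions[i], self.positions[j]) <= link_radius}
                unvisited -= near
                group |= near
                stack.extend(near)
            groups.append(sorted(group))
        return groups


flock = ToyFlock()
for rec in [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 1},
            {"x": 1, "y": 1}, {"x": 1, "y": 2}, {"x": 2, "y": 1}]:
    flock.add_record(rec)
flock.iterate(500)

# A record arriving later is added to the same flock, which keeps iterating.
newcomer = flock.add_record({"a": 1, "b": 1, "c": 0.1})
flock.iterate(500)
print(flock.clusters())
```

In this sketch the late record is never clustered "from scratch": it starts at a random position and the existing attraction and repulsion rules carry it toward the agents it resembles, which is one way to read the abstract's remark that a swarm model is inherently dynamic.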