Enhancing pattern recognition in social networking dataset by using bisecting KMean

Shilpa V. Gajbhiye, Gaurav B. Malode
Published in: 2017 International Conference on Intelligent Computing and Control (I2C2)
Publication date: 2017-06-23
DOI: 10.1109/I2C2.2017.8321776 (https://doi.org/10.1109/I2C2.2017.8321776)

Abstract

Databases today can exceed terabytes in size. Within these masses of data lies hidden information of strategic importance. But when there are so many trees, how does one draw conclusions about the forest? The newest answer is data mining, which is being used to increase revenues. Data mining is a process that uses a variety of data analysis tools to discover patterns and relationships in data that may be used to make valid predictions. This research uses a social networking dataset for pattern recognition, because social network analysis is one of the emerging application areas in data mining. We use the Facebook 100 dataset and apply the Bisecting KMeans algorithm to it in order to obtain better clustering output. Bisecting KMeans first bisects the data into two parts, selects the part with the greater number of elements, and then applies clustering to that part again. This continues until there are N clusters. We apply this procedure to our dataset to obtain the desired results, compare the Bisecting KMeans algorithm with other data mining algorithms, and finally identify different patterns in the social networking dataset.
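
The splitting procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how each bisection is carried out, so plain 2-means with squared Euclidean distance is assumed here. The function names (`two_means`, `bisecting_kmeans`), the seeding strategy, and the representation of data points as tuples of numbers are all hypothetical.

```python
import random


def _dist2(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def two_means(points, rng, max_iter=100):
    """Split a list of at least two points into two non-empty groups.

    Assumption: the bisection step is ordinary 2-means (Lloyd's iterations).
    """
    # Seed with a random point and the point farthest from it.
    c0 = rng.choice(points)
    c1 = max(points, key=lambda p: _dist2(p, c0))
    if c0 == c1:
        # All points are identical, so any split is as good as another.
        half = len(points) // 2
        return points[:half], points[half:]

    centers = [c0, c1]
    groups = None
    for _ in range(max_iter):
        new_groups = ([], [])
        for p in points:
            nearer_first = _dist2(p, centers[0]) <= _dist2(p, centers[1])
            new_groups[0 if nearer_first else 1].append(p)
        if not new_groups[0] or not new_groups[1]:
            break  # defensive guard: keep the last valid split
        groups = new_groups
        new_centers = [_mean(groups[0]), _mean(groups[1])]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return groups


def bisecting_kmeans(points, n_clusters, seed=0):
    """Repeatedly bisect the largest cluster until n_clusters exist."""
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        # As in the abstract: select the part with the most elements.
        i = max(range(len(clusters)), key=lambda j: len(clusters[j]))
        if len(clusters[i]) < 2:
            break  # nothing left that can be split
        largest = clusters.pop(i)
        left, right = two_means(largest, rng)
        clusters.extend([left, right])
    return clusters
```

Calling `bisecting_kmeans(points, 3)` on a list of coordinate tuples returns a list of three lists of points. Other variants of the algorithm choose the cluster to split by its within-cluster error rather than by its size; the size-based rule is used here only because it is the one the abstract describes.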