Web 2.0 social bookmark selection for tag clustering

S. S. Kumar, H. Inbarani
Published in: 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering, 2013-04-15
DOI: 10.1109/ICPRIME.2013.6496724 (https://doi.org/10.1109/ICPRIME.2013.6496724)
Citations: 13

Abstract

Tagging is a popular way to annotate Web 2.0 sites. A tag is any user-generated word or phrase that helps organize Web 2.0 content. The current hype around Web 2.0 applications poses several important challenges for future data and web mining methods. One such challenge is that a large amount of data has been generated over a short period. Clustering the tag data is tedious because the tag space is very large on several social bookmarking sites. Instead of clustering the whole tag space of Web 2.0 data, tags that are frequent enough in the tag space can be selected for clustering by applying feature selection techniques. The goal of feature selection is to determine a minimal subset of bookmarked URLs from a Web 2.0 data set while retaining suitably high accuracy in representing the original bookmarks. Tag clustering is the process of grouping similar tags into the same cluster and is important for the success of collaborative tagging services. In this paper, the Unsupervised Quick Reduct feature selection algorithm is applied to find a set of the most commonly tagged bookmarks; clustering techniques such as Soft Rough Fuzzy clustering and Rough K-Means are then applied to cluster user-generated tags, and the performance of these clustering approaches is illustrated.
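The abstract names Rough K-Means but does not spell out the variant used. As a rough illustration only, the sketch below follows the commonly cited Lingras-West formulation: each tag falls either in the lower approximation of one cluster (a definite member) or in the boundary region of several clusters (an ambiguous member). The function name, the ratio threshold `epsilon`, the weight `w_lower`, and the toy tag vectors are all assumptions made for this example, not values taken from the paper.

```python
import math


def _mean(points, indices):
    """Component-wise mean of the points selected by indices."""
    dim = len(points[0])
    return [sum(points[i][d] for i in indices) / len(indices) for d in range(dim)]


def rough_k_means(points, centroids, w_lower=0.7, epsilon=1.3, max_iter=50):
    """Sketch of rough k-means (Lingras-West style, assumed variant).

    points    -- tag vectors, e.g. one row per tag over the selected bookmarks
    centroids -- initial cluster centres (same dimensionality as points)
    Returns (lower, boundary, centroids); lower[j] and boundary[j] hold
    indices into points.
    """
    k = len(centroids)
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        lower = [[] for _ in range(k)]
        boundary = [[] for _ in range(k)]
        for idx, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # Other clusters whose distance is within a ratio epsilon of the best.
            close = [j for j in range(k)
                     if j != nearest and dists[j] <= epsilon * dists[nearest]]
            if close:
                # Ambiguous tag: boundary member of every nearly-as-close cluster.
                for j in [nearest] + close:
                    boundary[j].append(idx)
            else:
                # Unambiguous tag: lower approximation of the nearest cluster.
                lower[nearest].append(idx)
        new_centroids = []
        for j in range(k):
            if lower[j] and boundary[j]:
                lo = _mean(points, lower[j])
                bd = _mean(points, boundary[j])
                new_centroids.append(
                    [w_lower * a + (1 - w_lower) * b for a, b in zip(lo, bd)])
            elif lower[j]:
                new_centroids.append(_mean(points, lower[j]))
            elif boundary[j]:
                new_centroids.append(_mean(points, boundary[j]))
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep centre
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return lower, boundary, centroids


# Toy data (hypothetical): two tight groups of tags and one tag in between.
tags = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
lower, boundary, centres = rough_k_means(tags, [(0, 0), (10, 10)])
print(lower)     # definite members per cluster
print(boundary)  # ambiguous members per cluster
```

In this toy run the in-between tag lands in the boundary of both clusters rather than being forced into one. That tolerance for ambiguous tags is what a rough clustering method offers over hard k-means.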