Using empirical risk minimization to detect community structure in the blogosphere

Jiaxuan Huang, Hongsen Huang
Published in: 2010 IEEE International Conference on Intelligent Systems and Knowledge Engineering, pp. 418-421
Publication date: 2010-11-01
DOI: 10.1109/ISKE.2010.5680843 (https://doi.org/10.1109/ISKE.2010.5680843)
Citations: 0

Abstract

Detecting community structure in the blogosphere faces two obstacles. First, blog owners may update their content frequently, so the blogosphere as a whole grows very large within a short period of time, and processing this much data with traditional methods can be very expensive. Second, few blogs can be clearly identified as members of a specific community from their own characteristics; most must instead be judged by their relationships with neighboring blogs, using centrality metrics. Recently, a new method that combines active learning and semi-supervised learning has performed well in improving both the speed and the accuracy of machine learning on large-scale data. In this paper, we employ this method to solve the community clustering problem on a vast and complex data set. We try to show that the method does a better job of labeling and clustering large-scale data by comparing its results with those achieved in the traditional way. Afterward, we may make some improvements and use it for community detection in the blogosphere.
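The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes: a semi-supervised step that spreads a few known community labels along links between blogs, and an active-learning step that asks for a label on the blog the model is least sure about. Everything here is an assumption made for illustration, not the authors' method: the function names (`build_graph`, `propagate`, `most_uncertain`, `detect`), the averaging-based label propagation, the margin-based query rule, the use of node degree as a stand-in for a centrality metric, and the toy link data.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map from a list of (blog, blog) links."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def propagate(adj, seeds, communities, iterations=50):
    """Semi-supervised step (assumed): spread seed labels along links.

    Every blog holds one score per community. Unlabeled blogs repeatedly take
    the average of their neighbors' scores; labeled (seed) blogs stay fixed.
    """
    uniform = 1.0 / len(communities)

    def initial(node):
        if node in seeds:
            return {c: 1.0 if seeds[node] == c else 0.0 for c in communities}
        return {c: uniform for c in communities}

    scores = {node: initial(node) for node in adj}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in adj.items():
            if node in seeds:
                updated[node] = scores[node]
            else:
                updated[node] = {
                    c: sum(scores[m][c] for m in nbrs) / len(nbrs)
                    for c in communities
                }
        scores = updated
    return scores


def most_uncertain(adj, scores, seeds):
    """Active-learning step (assumed): choose the unlabeled blog whose two
    best community scores are closest; ties go to the higher-degree blog."""
    def key(node):
        ranked = sorted(scores[node].values(), reverse=True)
        margin = round(ranked[0] - ranked[1], 6)
        return (margin, -len(adj[node]))

    candidates = [node for node in adj if node not in seeds]
    return min(candidates, key=key)


def detect(adj, seeds, communities, oracle, budget):
    """Alternate propagation and label queries, then assign each blog to its
    highest-scoring community. `oracle` is a hypothetical labeling callback."""
    seeds = dict(seeds)
    for _ in range(budget):
        if len(seeds) == len(adj):
            break
        scores = propagate(adj, seeds, communities)
        query = most_uncertain(adj, scores, seeds)
        seeds[query] = oracle(query)
    scores = propagate(adj, seeds, communities)
    return {n: max(communities, key=lambda c: scores[n][c]) for n in adj}


# Toy data (hypothetical): two triangles of blogs joined by one link.
edges = [("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
         ("a3", "b1"),
         ("b1", "b2"), ("b1", "b3"), ("b2", "b3")]
truth = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B", "b3": "B"}

adj = build_graph(edges)
labels = detect(adj, {"a1": "A", "b3": "B"}, ["A", "B"], truth.get, budget=1)
print(labels)
```

In this sketch the label budget is the cost knob: each query stands for one manual judgment about a blog, and propagation fills in the rest from link structure, which mirrors the abstract's point that most blogs can only be judged through their neighbors.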