Data Analysis Algorithms for Mining Online Communities from Microblogs
Authors: Hongfei Xiao, Suting Zhou, Min Zhao
Journal: International Journal of Web Based Communities, 16(1), pp. 211-221
Published: 2020-04-29
Publication type: Journal Article
DOI: https://doi.org/10.1504/ijwbc.2020.10028247
Citations: 2
Abstract
Mining microblog data with complex-network methods helps extract useful information. This paper focuses on community mining. It first introduces complex networks and then presents a community mining algorithm based on user similarity, which partitions users into communities according to their similarity. Experiments on real datasets showed that the algorithm achieved an accuracy of 87.5% and a recall of 87.1%, with a running time of 2.1 s. On dataset 2, the average modularity of the proposed algorithm was 0.532, better than that of the Girvan and Newman (GN) algorithm, and no weak community structure was produced, indicating better performance in community mining. The experimental results demonstrate the reliability of the mining algorithm and clarify the contribution of data mining to detecting communities in a microblog network.
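The abstract does not specify the similarity measure or the partitioning step, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes that user similarity is the Jaccard overlap of followee sets, that users whose similarity reaches a hypothetical `threshold` are linked, and that communities are the resulting connected components. The `modularity` function computes the standard Newman modularity Q = sum over communities of (e_c / m - (d_c / 2m)^2), where e_c is the number of intra-community edges, d_c is the total degree of the community, and m is the number of edges. The toy data and all names are illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_communities(follows, threshold):
    """Link users with similarity >= threshold; return connected components.

    `follows` maps each user to the set of accounts they follow
    (an assumed feature; the paper's similarity measure may differ).
    """
    users = sorted(follows)
    parent = {u: u for u in users}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for u, v in combinations(users, 2):
        if jaccard(follows[u], follows[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for u in users:
        groups.setdefault(find(u), set()).add(u)
    return sorted(groups.values(), key=lambda g: sorted(g))


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {u: i for i, c in enumerate(communities) for u in c}
    internal = [0] * len(communities)      # edges inside each community
    total_degree = [0] * len(communities)  # sum of node degrees per community
    for u, v in edges:
        total_degree[label[u]] += 1
        total_degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(
        internal[i] / m - (total_degree[i] / (2 * m)) ** 2
        for i in range(len(communities))
    )


# Toy data (hypothetical): followee sets and an interaction graph.
follows = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"x", "y", "z"},
    "d": {"p", "q"},
    "e": {"p", "q", "r"},
    "f": {"q", "r"},
}
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]

communities = similarity_communities(follows, threshold=0.5)
q = modularity(edges, communities)
print(communities, round(q, 3))
```

Connected components over a thresholded similarity graph are used here only because they are the simplest way to turn pairwise similarity into a partition. Comparing Q across partitions of the same graph mirrors how the abstract compares the proposed algorithm with GN.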