Junliang Bai, Jun Guo, Guang Chen, Weiran Xu, Gang Du
2010 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery. Published 2010-10-10. DOI: 10.1109/CYBERC.2010.65 (https://doi.org/10.1109/CYBERC.2010.65)
An Efficient Algorithm of Hot Events Detection in Text Streams
Hot event detection in text streams has drawn increasing attention in recent work on sequential data mining. Unlike the traditional topic detection and tracking (TDT) task, which finds a cluster for every real event, hot event detection identifies only the events that concern the public. This paper proposes a novel approach to identifying those events based on burst terms, term co-occurrence, and a generative probabilistic model. Experiments on large text stream sets crawled from the WWW suggest that our algorithm can work online and identify hot events effectively and efficiently.
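The abstract names burst terms and term co-occurrence without giving details, so the following is only a minimal sketch of those two ideas, not the authors' algorithm. It assumes the stream is split into time windows of tokenized documents, and it treats a term as bursting when its document frequency in the current window is several times its smoothed average over past windows. The function names, the `ratio`, `min_count`, and `min_docs` thresholds, and the add-one smoothing are all illustrative assumptions; the generative probabilistic model is not covered.

```python
from collections import Counter
from itertools import combinations


def burst_terms(history, current, ratio=3.0, min_count=2):
    """Return terms whose document frequency in `current` is at least
    `ratio` times their smoothed average over the `history` windows.

    Hypothetical helper: thresholds and smoothing are assumptions.
    """
    # Document frequency per past window (a document is a list of tokens).
    past = [Counter(t for doc in window for t in set(doc)) for window in history]
    now = Counter(t for doc in current for t in set(doc))
    bursts = set()
    for term, count in now.items():
        # Add-one smoothing keeps the baseline positive for unseen terms.
        baseline = (sum(c[term] for c in past) + 1) / (len(past) + 1)
        if count >= min_count and count / baseline >= ratio:
            bursts.add(term)
    return bursts


def cooccurring_pairs(current, bursts, min_docs=2):
    """Return pairs of burst terms that share at least `min_docs` documents."""
    pairs = Counter()
    for doc in current:
        present = sorted(set(doc) & bursts)
        pairs.update(combinations(present, 2))
    return {pair for pair, n in pairs.items() if n >= min_docs}


# Toy stream: two past windows and one current window.
history = [
    [["weather", "sunny"], ["market", "stocks"]],
    [["weather", "rain"], ["market", "bonds"]],
]
current = [
    ["earthquake", "rescue", "city"],
    ["earthquake", "rescue", "weather"],
    ["earthquake", "market"],
]
bursts = burst_terms(history, current)
pairs = cooccurring_pairs(current, bursts)
```

In this toy stream, "earthquake" and "rescue" burst and also co-occur, so they could seed one candidate hot event, while routine terms such as "weather" and "market" stay below the threshold.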