A Methodology to Detect and Track Breaking News on Twitter
Authors: Anmol Shukla, Dhruv Aggarwal, R. Keskar
Published in: Proceedings of the 9th Annual ACM India Conference, 2016-10-21
DOI: 10.1145/2998476.2998491 (https://doi.org/10.1145/2998476.2998491)
Twitter is an interesting platform for the dissemination of news. The real-time nature and brevity of tweets are conducive to sharing information about important events as they unfold. However, one of the greatest challenges is finding the tweets that can be characterized as news in the ocean of tweets. In this paper, we propose a novel method for detecting and tracking breaking news on Twitter in real time. We filter the stream of incoming tweets with a text classification algorithm to remove junk tweets, and we compare the performance of different supervised text classification algorithms on this task. We then cluster similar tweets so that tweets in the same cluster relate to the same real-life event and can be termed breaking news. Finally, we rank the news using a dynamic scoring system, which also allows us to track the news over a period of time.
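The three stages described in the abstract (filter, cluster, rank) can be illustrated with a minimal sketch. The abstract does not name the classifier, the similarity measure, or the scoring formula, so everything concrete below is an assumption chosen for illustration: a small multinomial Naive Bayes filter, single-pass clustering by cosine similarity, and an exponentially decaying score. The names `NaiveBayes`, `Cluster`, `process`, `rank`, `TRAIN`, and `STREAM`, the threshold of 0.3, and the 600-second half-life are all hypothetical, and the sample tweets are made up.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    # Lowercase and keep alphanumeric runs plus hashtag/mention markers.
    return re.findall(r"[a-z0-9#@]+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed classifier)."""
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of tweets
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            self.doc_counts[label] += 1
            for w in tokenize(text):
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def predict(self, text):
        total = sum(self.doc_counts.values())
        best, best_lp = None, -math.inf
        for label, n in self.doc_counts.items():
            lp = math.log(n / total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokenize(text):
                lp += math.log((self.word_counts[label][w] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

def cosine(a, b):
    # Cosine similarity between two term-count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class Cluster:
    """One candidate news event: accumulated terms plus tweet timestamps."""
    def __init__(self):
        self.terms = Counter()
        self.timestamps = []

    def add(self, vec, ts):
        self.terms.update(vec)
        self.timestamps.append(ts)

    def score(self, now, half_life=600.0):
        # Assumed scoring rule: each tweet's weight halves every half_life seconds.
        return sum(0.5 ** ((now - ts) / half_life) for ts in self.timestamps)

def process(stream, clf, threshold=0.3):
    # Single pass: drop junk, then join the most similar cluster or start a new one.
    clusters = []
    for ts, text in stream:
        if clf.predict(text) != "news":
            continue
        vec = Counter(tokenize(text))
        best = max(clusters, key=lambda c: cosine(c.terms, vec), default=None)
        if best is None or cosine(best.terms, vec) < threshold:
            best = Cluster()
            clusters.append(best)
        best.add(vec, ts)
    return clusters

def rank(clusters, now):
    return sorted(clusters, key=lambda c: c.score(now), reverse=True)

# Made-up training data and stream of (timestamp in seconds, tweet text).
TRAIN = [
    ("earthquake hits city officials report damage", "news"),
    ("government announces new policy after election", "news"),
    ("flood warning issued for coastal region", "news"),
    ("police confirm explosion near station", "news"),
    ("lol i love my cat so much", "junk"),
    ("good morning everyone have a nice day", "junk"),
    ("so bored right now lol", "junk"),
    ("i love pizza and movies", "junk"),
]
STREAM = [
    (0, "Earthquake hits the capital city, officials report heavy damage"),
    (30, "lol my cat is so cute"),
    (60, "Officials report damage after earthquake in capital city"),
    (90, "Flood warning issued for the coastal region"),
    (120, "good morning have a nice day everyone"),
]

clf = NaiveBayes()
clf.fit(TRAIN)
for c in rank(process(STREAM, clf), now=120):
    print(round(c.score(120), 3), [w for w, _ in c.terms.most_common(3)])
```

Because the score is recomputed against the current time, re-ranking at a later `now` lets an event that stops receiving tweets sink in the list while an active one stays near the top, which is one simple way to read "tracking the news over a period of time".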