Hot topic extraction based on frequency, position, scattering and topical weight for time sliced news documents
Authors: Y. Jahnavi, Y. Radhika
Venue: 2013 15th International Conference on Advanced Computing Technologies (ICACT)
Publication date: 2013-09-01
DOI: 10.1109/ICACT.2013.6710495 (https://doi.org/10.1109/ICACT.2013.6710495)
Citation count: 4
Abstract
Internet-based news documents are a primary medium for transmitting information, so detecting hot topics and tracking how events develop is important. However, the volume of generated topics is too large for a reader to view them all, so the topics must be ranked by importance. Importance is determined by how frequently a topic appears, and it varies across time slots. To extract hot topics, most text mining approaches built on the vector space model must assign weights to feature terms. Traditional algorithms do not achieve high accuracy in retrieving hot terms because they do not consider position, scattering, and topicality. This paper presents an innovative and effective hot term extraction method that considers the position, scattering, and topicality of terms along with their frequency.
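The abstract names four signals (frequency, position, scattering, topicality) but does not give the formulas, so the following is only a minimal sketch of how such a combined term weight might look for one time slice. Every definition in it is an assumption made for illustration, not the authors' method: position is approximated by a hypothetical `title_boost` for title occurrences, scattering by the fraction of documents in the slice that contain the term, and topicality by a smoothed ratio of the term's rate in the slice to its rate in earlier documents. The function name `hot_term_scores` and the whitespace tokenizer are likewise illustrative.

```python
from collections import Counter
import math


def hot_term_scores(slice_docs, background_docs, title_boost=2.0):
    """Score the terms of one time slice.

    slice_docs and background_docs are lists of (title, body) string pairs.
    All four weighting factors below are illustrative assumptions.
    """
    n_docs = len(slice_docs)
    freq = Counter()      # raw occurrences of each term in the slice
    position = Counter()  # occurrences weighted by where they appear
    doc_hits = Counter()  # number of slice documents containing the term

    for title, body in slice_docs:
        title_terms = title.lower().split()
        body_terms = body.lower().split()
        for term in title_terms:
            freq[term] += 1
            position[term] += title_boost  # assumed: titles count more
        for term in body_terms:
            freq[term] += 1
            position[term] += 1.0
        for term in set(title_terms) | set(body_terms):
            doc_hits[term] += 1

    # Term counts from earlier time slices, used as a baseline.
    background = Counter()
    for title, body in background_docs:
        background.update(title.lower().split())
        background.update(body.lower().split())

    slice_total = sum(freq.values())
    background_total = sum(background.values())

    scores = {}
    for term, count in freq.items():
        tf = count / slice_total                  # frequency
        pos = position[term] / count              # between 1.0 and title_boost
        scatter = doc_hits[term] / n_docs         # spread across slice documents
        # Add-one smoothing so unseen background terms do not divide by zero.
        background_rate = (background[term] + 1) / (background_total + len(freq))
        topical = math.log(1 + tf / background_rate)  # assumed topicality
        scores[term] = tf * pos * scatter * topical
    return scores
```

Multiplying the factors means a term must do reasonably well on all four signals to rank highly, which is one design choice among several; a weighted sum would be an equally plausible assumption. Ranking the terms of a slice is then `sorted(scores, key=scores.get, reverse=True)`.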