基于频率、位置、散射和话题权重的时间切片新闻文档热点话题提取

Y. Jahnavi, Y. Radhika
{"title":"基于频率、位置、散射和话题权重的时间切片新闻文档热点话题提取","authors":"Y. Jahnavi, Y. Radhika","doi":"10.1109/ICACT.2013.6710495","DOIUrl":null,"url":null,"abstract":"Internet based news documents are the basic information transmission media. In such a case detecting hot topics and tracking the event development is most important. However, it is almost impossible to view all the generated topics, due to its large amount of size. Therefore it is necessary to rank the topics. The topic ranking should be done on the importance basis. But this importance is determined by how frequently a topic appears and this importance varies in different time slots. For extracting hot topics, most of the text mining approaches with vector space model need to determine the weighting of the feature terms. Existing traditional algorithms can't achieve high accuracy for retrieving hot terms, because they have not considered position, scattering and topicality. This paper presents an innovative and effective hot term extraction by considering position, scattering and topicality of terms along with frequency.","PeriodicalId":302640,"journal":{"name":"2013 15th International Conference on Advanced Computing Technologies (ICACT)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Hot topic extraction based on frequency, position, scattering and topical weight for time sliced news documents\",\"authors\":\"Y. Jahnavi, Y. Radhika\",\"doi\":\"10.1109/ICACT.2013.6710495\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Internet based news documents are the basic information transmission media. In such a case detecting hot topics and tracking the event development is most important. However, it is almost impossible to view all the generated topics, due to its large amount of size. Therefore it is necessary to rank the topics. The topic ranking should be done on the importance basis. But this importance is determined by how frequently a topic appears and this importance varies in different time slots. For extracting hot topics, most of the text mining approaches with vector space model need to determine the weighting of the feature terms. Existing traditional algorithms can't achieve high accuracy for retrieving hot terms, because they have not considered position, scattering and topicality. This paper presents an innovative and effective hot term extraction by considering position, scattering and topicality of terms along with frequency.\",\"PeriodicalId\":302640,\"journal\":{\"name\":\"2013 15th International Conference on Advanced Computing Technologies (ICACT)\",\"volume\":\"43 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 15th International Conference on Advanced Computing Technologies (ICACT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICACT.2013.6710495\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 15th International Conference on Advanced Computing Technologies (ICACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICACT.2013.6710495","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

摘要

基于互联网的新闻文件是基本的信息传播媒介。在这种情况下,发现热点话题并跟踪事件发展就显得尤为重要。然而,要查看所有生成的主题几乎是不可能的,因为它的大小很大。因此,有必要对主题进行排序。主题排名应根据重要性进行排序。但这种重要性是由一个话题出现的频率决定的,而且这种重要性在不同的时间段有所不同。为了提取热点话题,大多数基于向量空间模型的文本挖掘方法都需要确定特征项的权重。现有的传统算法由于没有考虑位置、散射和话题性等因素,无法达到较高的检索精度。本文提出了一种新颖有效的热词提取方法,该方法考虑了词的位置、散射和话题性以及频率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Hot topic extraction based on frequency, position, scattering and topical weight for time sliced news documents
Internet based news documents are the basic information transmission media. In such a case detecting hot topics and tracking the event development is most important. However, it is almost impossible to view all the generated topics, due to its large amount of size. Therefore it is necessary to rank the topics. The topic ranking should be done on the importance basis. But this importance is determined by how frequently a topic appears and this importance varies in different time slots. For extracting hot topics, most of the text mining approaches with vector space model need to determine the weighting of the feature terms. Existing traditional algorithms can't achieve high accuracy for retrieving hot terms, because they have not considered position, scattering and topicality. This paper presents an innovative and effective hot term extraction by considering position, scattering and topicality of terms along with frequency.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信