A Topic Detection and Tracking Method Combining NLP with Suffix Tree Clustering

Yaohong Jin
2012 International Conference on Computer Science and Electronics Engineering
Publication date: 2012-03-23
DOI: 10.1109/ICCSEE.2012.131 (https://doi.org/10.1109/ICCSEE.2012.131)
Citations: 6

Abstract

We present a topic detection and tracking method that combines semantic analysis with the Suffix Tree Clustering (STC) algorithm. An NLP-based feature selection step selects nouns, verbs, and named entities as the input to STC. To address topic drift, we build the vector space model (VSM) of each cluster from key words extracted from the suffix tree nodes by a mutual information algorithm. After computing cluster similarity and performing topic detection and tracking, a semantic analysis step filters out words with the same meaning and analyzes the semantic structure of the words in each cluster label. Finally, a content-relevant description is generated for each topic. Experiments showed that the method can effectively detect and track topics in news articles.
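The pipeline described above can be illustrated with a minimal Python sketch. It is not the author's implementation, and it rests on several assumptions: documents are already reduced to tokens by the feature selection step; shared word n-grams stand in for the phrases a real suffix tree would expose at its nodes; mutual information is computed at document level; and the function names (`base_clusters`, `pmi`, `cluster_vector`, `cosine`) and the `top_k` parameter are hypothetical.

```python
import math
from collections import Counter, defaultdict


def base_clusters(docs, max_len=3, min_docs=2):
    """Map each shared phrase (word n-gram) to the set of documents containing it.

    A real STC implementation reads these off the nodes of a generalized
    suffix tree; enumerating n-grams is a simplification that yields the
    same phrase -> documents map for phrases of up to max_len words.
    """
    phrase_docs = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for n in range(1, max_len + 1):
            for i in range(len(tokens) - n + 1):
                phrase_docs[tuple(tokens[i:i + n])].add(doc_id)
    return {p: ids for p, ids in phrase_docs.items() if len(ids) >= min_docs}


def pmi(term, cluster_ids, docs):
    """Document-level pointwise mutual information between a term and a cluster."""
    n = len(docs)
    with_term = {i for i, toks in enumerate(docs) if term in toks}
    joint = len(with_term & cluster_ids)
    if joint == 0:
        return float("-inf")
    return math.log((joint * n) / (len(with_term) * len(cluster_ids)))


def cluster_vector(cluster_ids, docs, top_k=5):
    """Build a sparse VSM vector for a cluster from its top_k terms by PMI."""
    vocab = {t for i in cluster_ids for t in docs[i]}
    # Sort by descending PMI, breaking ties alphabetically for determinism.
    key_words = sorted(vocab, key=lambda t: (-pmi(t, cluster_ids, docs), t))[:top_k]
    counts = Counter(t for i in cluster_ids for t in docs[i])
    return {t: counts[t] for t in key_words}


def cosine(u, v):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(w * v.get(t, 0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


# Toy corpus: tokens assumed to be pre-filtered nouns, verbs and named entities.
docs = [
    ["earthquake", "hits", "city"],
    ["earthquake", "hits", "coast"],
    ["election", "results", "announced"],
    ["election", "results", "delayed"],
]
clusters = base_clusters(docs)
quake = clusters[("earthquake", "hits")]
vote = clusters[("election", "results")]
similarity = cosine(cluster_vector(quake, docs), cluster_vector(vote, docs))
```

In a tracking setting, a new cluster whose similarity to an existing topic vector exceeds some threshold would be attached to that topic, and otherwise treated as a new topic. The threshold and the update rule are design choices that the abstract does not specify.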