Adding Lexical Chain to Keyphrase Extraction
Authors: Zefeng Li, Binlai He, Yangnan
Venue: 2014 11th Web Information System and Application Conference
Published: 2014-09-12
DOI: 10.1109/WISA.2014.53 (https://doi.org/10.1109/WISA.2014.53)
Citations: 1
Abstract
Keyphrase extraction is widely used in information retrieval, automatic summarization, text clustering, and related tasks. KEA is a classical algorithm for this task, but it relies mainly on statistical information and ignores semantic information. In this paper, we propose a method that combines semantic information with KEA by constructing lexical chains based on Roget's thesaurus. The method uses the semantic similarity between terms to construct lexical chains, and then uses the length of each chain as a feature for building the extraction model. The experimental results show that our system performs clearly better than both KEA and Nguyen and Kan's method.
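The abstract does not say how the chains are built, so the following is only a minimal sketch of the general idea, under stated assumptions: a toy dictionary (`TOY_THESAURUS`, hypothetical) stands in for the thesaurus, two terms count as similar when they share a category, and chains are grown greedily. The function names `related`, `build_chains`, and `chain_length_feature` are illustrative and do not come from the paper.

```python
from typing import Dict, List, Set

# Hypothetical toy thesaurus: term -> set of category ids.
# A real system would look these up in a thesaurus; the values here are made up.
TOY_THESAURUS: Dict[str, Set[int]] = {
    "car": {1},
    "vehicle": {1},
    "engine": {1, 2},
    "motor": {2},
    "banana": {3},
}


def related(a: str, b: str, thesaurus: Dict[str, Set[int]]) -> bool:
    """Assumed similarity test: two terms are related if they share a category."""
    return bool(thesaurus.get(a, set()) & thesaurus.get(b, set()))


def build_chains(terms: List[str], thesaurus: Dict[str, Set[int]]) -> List[List[str]]:
    """Greedily attach each term to the first chain holding a related member."""
    chains: List[List[str]] = []
    for term in terms:
        for chain in chains:
            if any(related(term, member, thesaurus) for member in chain):
                chain.append(term)
                break
        else:
            # No existing chain is related to this term, so start a new one.
            chains.append([term])
    return chains


def chain_length_feature(terms: List[str], thesaurus: Dict[str, Set[int]]) -> Dict[str, int]:
    """Map each term to the length of the chain it ended up in."""
    chains = build_chains(terms, thesaurus)
    return {term: len(chain) for chain in chains for term in chain}


if __name__ == "__main__":
    doc_terms = ["car", "banana", "vehicle", "engine", "motor"]
    print(build_chains(doc_terms, TOY_THESAURUS))
    print(chain_length_feature(doc_terms, TOY_THESAURUS))
```

In a KEA-style pipeline, the resulting chain length would be appended to each candidate's feature vector next to the statistical features before the extraction model is trained. Greedy first-match chaining is order-dependent; it is used here only to keep the sketch short.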