A Key-phrase Extraction Method Based on Multi-size Convolution Windows for Scientific Literature

Yuhong Zhang, Yuxin Xie, Peipei Li, Xuegang Hu
Published in: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems (CCIS), 2021-11-07
DOI: 10.1109/CCIS53392.2021.9754645 (https://doi.org/10.1109/CCIS53392.2021.9754645)
Citations: 0

Abstract
Key-phrase extraction is important for downstream tasks in natural language processing and has attracted considerable attention. Compared with other documents, scientific literature contains many long phrases, and most existing methods perform poorly on it. To address this problem, this paper proposes a key-phrase extraction method based on multi-size convolution windows (KE-MCW) for scientific literature. Specifically, to capture more contextual information, a convolutional neural network (CNN) with multi-size filters maps each document into distributed feature vectors, so that each vector can represent phrases of different lengths. Next, to determine whether each word is part of a key-phrase, a deep recurrent neural network labels the role of each word. Finally, an attention mechanism further judges the importance of each phrase. Experimental results show that the proposed method outperforms several competitive methods on scientific and technical literature.
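The idea of multi-size convolution windows can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not state the filter widths, the padding scheme, the activation, or the embedding size, so the widths 1 to 3, the zero padding, the ReLU, and the names `conv1d_same` and `multi_size_features` are all assumptions made for illustration. The sketch only shows how filters of different widths yield, for every token, one feature per phrase length.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def conv1d_same(embeddings: Sequence[Vector], kernel: Sequence[Vector]) -> List[float]:
    """Slide one filter of width len(kernel) over the token sequence.

    Zero padding (an assumption) keeps one output per token, so the outputs
    of different window sizes can be concatenated position by position.
    """
    width = len(kernel)
    dim = len(kernel[0])
    left = (width - 1) // 2          # tokens of context taken to the left
    zero = [0.0] * dim               # padding vector outside the sentence
    outputs = []
    for centre in range(len(embeddings)):
        total = 0.0
        for offset in range(width):
            pos = centre - left + offset
            token = embeddings[pos] if 0 <= pos < len(embeddings) else zero
            total += sum(t * w for t, w in zip(token, kernel[offset]))
        outputs.append(max(0.0, total))  # ReLU activation (an assumption)
    return outputs


def multi_size_features(
    embeddings: Sequence[Vector], kernels_by_width: Dict[int, Sequence[Vector]]
) -> List[List[float]]:
    """Return, for every token, one feature per window size (ascending width)."""
    per_width = [
        conv1d_same(embeddings, kernels_by_width[w]) for w in sorted(kernels_by_width)
    ]
    return [list(column) for column in zip(*per_width)]


# Toy input: four tokens with 2-dimensional embeddings (hypothetical values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# One hand-written filter per window size; a real model would learn many.
kernels = {
    1: [[1.0, 1.0]],
    2: [[1.0, 0.0], [0.0, 1.0]],
    3: [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
}

features = multi_size_features(embeddings, kernels)
for token_index, row in enumerate(features):
    print(token_index, row)
```

Each row of `features` holds one value per window size for the same token, which is the sense in which a single vector can describe phrases of several lengths. In a full pipeline of the kind the abstract describes, these per-token vectors would then feed a recurrent tagger and an attention layer, neither of which is sketched here.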