Keyphrases extraction: Approach Based on Document Paragraph Weights

2022 International Conference on Computational Modelling, Simulation and Optimization (ICCMSO) Pub Date: 2022-12-01 DOI: 10.1109/ICCMSO58359.2022.00051

Lahbib Ajallouda, A. Zellou, Imane Ettahiri, Karim Doumi


Citations: 0

Abstract

In recent years, the use of sentence embedding techniques in natural language processing has encouraged new methods for extracting keyphrases from documents. Most of these approaches select keyphrases from a set of candidate phrases according to their semantic proximity to the document as a whole. However, most documents contain supplementary paragraphs that are unrelated to the main topics covered, which makes the semantic proximity between a candidate keyphrase and the whole document less reliable. Taking document paragraph weights into account during the semantic similarity calculation can therefore improve keyphrase extraction. In this paper, we propose a new method for extracting keyphrases based on document paragraph weights. Our method relies on sentence embedding techniques and on the semantic proximity of candidate keyphrases to the document's paragraphs. We evaluated the proposed method on three datasets, Inspec, SemEval-2010 and KPTimes, and our results show that using document paragraph weights improves keyphrase extraction performance.
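The abstract does not say how paragraph weights are computed or how they enter the similarity score, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a paragraph's weight is its cosine similarity to the whole document (normalized so the weights sum to 1) and that a candidate's score is the weighted sum of its similarities to each paragraph. The `embed` function is a toy bag-of-words stand-in for a real sentence embedding model, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def paragraph_weights(paragraphs):
    """ASSUMPTION: weight = similarity of a paragraph to the whole document,
    normalized so that all weights sum to 1."""
    doc_vec = embed(" ".join(paragraphs))
    raw = [cosine(embed(p), doc_vec) for p in paragraphs]
    total = sum(raw)
    if total == 0:
        # Fall back to uniform weights when nothing overlaps.
        return [1.0 / len(paragraphs)] * len(paragraphs)
    return [r / total for r in raw]


def rank_candidates(candidates, paragraphs):
    """ASSUMPTION: score = weighted sum of candidate-to-paragraph similarities."""
    weights = paragraph_weights(paragraphs)
    para_vecs = [embed(p) for p in paragraphs]
    scores = {
        c: sum(w * cosine(embed(c), pv) for w, pv in zip(weights, para_vecs))
        for c in candidates
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


paragraphs = [
    "sentence embedding models map text to vectors",
    "keyphrase extraction ranks candidate phrases with sentence embedding similarity",
    "the authors thank the funding agency for support",
]
candidates = ["sentence embedding", "keyphrase extraction", "funding agency"]
for phrase, score in rank_candidates(candidates, paragraphs):
    print(f"{phrase}: {score:.3f}")
```

In this toy example the acknowledgement-style paragraph receives the smallest weight, so a candidate that appears only there ("funding agency") is ranked last. With a real sentence embedding model in place of `embed`, the same structure would apply, but the weighting scheme itself remains an assumption.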

