Lahbib Ajallouda, A. Zellou, Imane Ettahiri, Karim Doumi
2022 International Conference on Computational Modelling, Simulation and Optimization (ICCMSO), published 2022-12-01. DOI: 10.1109/ICCMSO58359.2022.00051 (https://doi.org/10.1109/ICCMSO58359.2022.00051)
Keyphrases extraction: Approach Based on Document Paragraph Weights
In recent years, the use of sentence embedding techniques in natural language processing has encouraged new methods for extracting keyphrases from documents. Most of these approaches select keyphrases from a set of candidate phrases based on their semantic proximity to the document. However, most documents contain complementary paragraphs that are unrelated to the topics covered, which makes the semantic proximity between candidate keyphrases and the document as a whole less reliable. Taking document paragraph weights into account during the semantic similarity calculation is expected to improve keyphrase extraction. In this paper, we propose a new method to extract keyphrases based on document paragraph weights. The method relies on sentence embedding techniques and on the semantic proximity of candidate keyphrases to the document's paragraphs. We evaluated the proposed method on three datasets, Inspec, SemEval2010 and KPTimes, and the results showed that using document paragraph weights improved keyphrase extraction performance.
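The idea of weighting paragraphs before scoring candidates can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how paragraph weights are computed, so the sketch assumes that a paragraph's weight is its cosine similarity to the whole document, normalised to sum to 1. The `embed` function is a toy bag-of-words stand-in for a real sentence-embedding model, and all function names and the example text are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence-embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def paragraph_weights(paragraphs):
    """Assumed weighting: similarity of each paragraph to the full document,
    normalised so the weights sum to 1."""
    doc_vec = embed(" ".join(paragraphs))
    raw = [cosine(embed(p), doc_vec) for p in paragraphs]
    total = sum(raw)
    if total == 0:
        return [1.0 / len(paragraphs)] * len(paragraphs)
    return [r / total for r in raw]


def rank_candidates(paragraphs, candidates):
    """Score each candidate by its paragraph-weighted similarity, highest first."""
    weights = paragraph_weights(paragraphs)
    para_vecs = [embed(p) for p in paragraphs]
    scored = []
    for cand in candidates:
        cand_vec = embed(cand)
        score = sum(w * cosine(cand_vec, pv) for w, pv in zip(weights, para_vecs))
        scored.append((cand, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical document: two on-topic paragraphs and one off-topic paragraph.
paragraphs = [
    "keyphrase extraction uses sentence embedding to compare candidate phrases with the document",
    "sentence embedding models map phrases and paragraphs to vectors for keyphrase extraction",
    "the authors thank the reviewers for their helpful comments",
]
candidates = ["sentence embedding", "keyphrase extraction", "helpful comments"]

weights = paragraph_weights(paragraphs)
ranking = rank_candidates(paragraphs, candidates)
```

In this toy example the off-topic third paragraph receives the smallest weight, so a candidate that appears only there ("helpful comments") ranks below the on-topic candidates. A real system would replace `embed` with a trained sentence-embedding model and might compute the weights differently.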