Keyphrase Generation for Abstracts of the Russian-Language Scientific Articles

NSU Vestnik. Series: Linguistics and Intercultural Communication. Pub Date: 2023-05-30. DOI: 10.25205/1818-7935-2023-21-1-54-66

Dmitry Morozov, Anna Glazkova, M. A. Tyutyulnikov, B. Iomdin


Citations: 0

Abstract


Keyphrase Generation for Abstracts of the Russian-Language Scientific Articles

In this paper, we attempted to adapt several well-known keyword selection algorithms to a very specific text corpus: abstracts of Russian academic papers in mathematics and computer science. We faced several challenges, including the lack of research on keyword extraction for Russian, the absence of large text corpora of academic abstracts, and the short length of the abstracts. Keywords often occur in the full text of a paper and can simply be highlighted, whereas abstracts may not contain them in explicit form. At the same time, abstracts are what is usually publicly available, so automatic keyword selection from them would significantly facilitate searching for papers. Moreover, automatic keyword selection would be useful even for papers whose keywords were already specified by the authors. During the study, we found that authors often use unique keywords for their papers, which complicates systematizing papers by topic. To visualize the results, we created a web resource, keyphrases.mca.nsu.ru, where beginning scholars can compile an approximate list of keywords for their first research paper.
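The abstract does not name the algorithms that were adapted, so the following is only an illustrative sketch and not the authors' method. It assumes a plain TF-IDF ranking as a stand-in for a "well-known" keyword selection baseline, and adds a hypothetical helper, `split_by_presence`, that checks which author-assigned keywords appear verbatim in an abstract (the difficulty the abstract describes). All function names and the toy data are assumptions. A realistic pipeline for Russian would likely also need lemmatization, since inflected word forms will not match by exact string comparison.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (Latin or Cyrillic letters)."""
    return re.findall(r"[a-zа-яё]+", text.lower())


def tfidf_keywords(abstracts, index, top_k=3):
    """Rank the words of abstracts[index] by TF-IDF computed over all abstracts."""
    docs = [tokenize(a) for a in abstracts]
    target = docs[index]
    if not target:
        return []
    # Document frequency: the number of abstracts in which each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    tf = Counter(target)
    scores = {
        word: (count / len(target)) * math.log(len(docs) / df[word])
        for word, count in tf.items()
    }
    # Sort by descending score, then alphabetically so ties are ordered stably.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


def split_by_presence(abstract, author_keywords):
    """Split author keywords into those found verbatim in the abstract and those absent."""
    text = " " + " ".join(tokenize(abstract)) + " "
    present, absent = [], []
    for keyword in author_keywords:
        phrase = " " + " ".join(tokenize(keyword)) + " "
        (present if phrase in text else absent).append(keyword)
    return present, absent


# Toy English data, used only to keep the example readable.
toy_abstracts = [
    "graph coloring algorithm for sparse graph",
    "neural network for text classification",
    "algorithm for text search",
]
top_words = tfidf_keywords(toy_abstracts, 0, top_k=2)
present, absent = split_by_presence(
    toy_abstracts[0], ["graph coloring", "combinatorial optimization"]
)
```

Words that occur in every abstract (here "for") get a score of zero, which is why such a baseline favors topic-specific terms. Keywords that land in `absent` cannot be recovered by any purely extractive method, which is one reason short abstracts are a harder input than full texts.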
