Keyphrase Generation for Abstracts of the Russian-Language Scientific Articles

Dmitry Morozov, Anna Glazkova, M. A. Tyutyulnikov, B. Iomdin
Journal: NSU Vestnik. Series: Linguistics and Intercultural Communication
Publication date: 2023-05-30
DOI: https://doi.org/10.25205/1818-7935-2023-21-1-54-66

Abstract

In this paper, we attempted to adapt several well-known keyword selection algorithms to a highly specific text corpus: abstracts of Russian academic papers in mathematics and computer science. We faced several challenges, including the lack of research on keyword extraction for Russian, the absence of large text corpora of academic abstracts, and the short length of the abstracts. Keywords often occur in the full text of a paper and can simply be highlighted, whereas an abstract may not contain them in explicit form. At the same time, abstracts are usually the part of a paper that is publicly available, so automatic keyword selection from abstracts would significantly facilitate the search for papers. Moreover, automatic keyword selection would be useful even for papers whose keywords were already specified by the authors. During the study, we found that authors often use keywords that are unique to their own papers, which complicates systematizing papers by topic. To visualize the results, we created a web resource, keyphrases.mca.nsu.ru, where early-career scholars can compile an approximate list of keywords for their first research paper.
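
Two observations in the abstract can be made concrete with a small measurement: how many author keywords actually appear verbatim in the abstract, and how many keywords are used by only one paper. The sketch below is illustrative only and is not the authors' method. The records, the function names `present_share` and `unique_share`, and the exact-substring matching rule are all assumptions introduced here.

```python
from collections import Counter

# Hypothetical toy records: neither the texts nor the keywords come from
# the paper's corpus. They only illustrate the two measurements.
papers = [
    {"abstract": "We study a graph coloring algorithm for sparse graphs.",
     "keywords": ["graph coloring", "sparse graphs", "combinatorial optimization"]},
    {"abstract": "A neural network is trained to classify short texts.",
     "keywords": ["neural network", "text classification"]},
    {"abstract": "We propose a neural network for graph coloring.",
     "keywords": ["neural network", "graph coloring", "deep learning"]},
]


def present_share(records):
    """Fraction of author keywords found verbatim (case-insensitive) in their own abstract."""
    total = 0
    found = 0
    for rec in records:
        text = rec["abstract"].lower()
        for kw in rec["keywords"]:
            total += 1
            if kw.lower() in text:
                found += 1
    return found / total if total else 0.0


def unique_share(records):
    """Fraction of distinct keywords that are assigned to exactly one paper."""
    counts = Counter()
    for rec in records:
        # Deduplicate within a paper so each paper counts a keyword once.
        counts.update({kw.lower() for kw in rec["keywords"]})
    if not counts:
        return 0.0
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(counts)


print(f"keywords present in abstract: {present_share(papers):.3f}")
print(f"keywords used by one paper:   {unique_share(papers):.3f}")
```

On this toy data, 5 of the 8 keywords occur in their abstract and 4 of the 6 distinct keywords belong to a single paper. Exact substring matching is a deliberate simplification: Russian is a highly inflected language, so a realistic version would need to lemmatize both the abstract and the keywords before comparing them.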