On the distribution of key terms in scientific fields

S. Vlasova, N. Kalenov, I. Sobolevskaya
Proceedings of 24th Scientific Conference “Scientific Services & Internet – 2022”
DOI: 10.20948/abrau-2022-24 (https://doi.org/10.20948/abrau-2022-24)

Abstract

One of the main components of the Common Digital Space of Scientific Knowledge (CDSSK) is the set of subject ontologies of individual thematic subspaces, which include the basic concepts of the corresponding scientific area. At the initial phase, constructing a subject ontology requires forming an array of key terms in the given scientific area and then establishing links between them. A similar task arises in compiling encyclopedias, where a list of articles (slots) that determines the content must be generated. One source for forming the array of key terms is the metadata of articles published in leading scientific journals, namely the authors' key terms ("keywords" in the terminology of journal editors) given in each article. To judge whether this approach is suitable for building subject ontologies, a preliminary analysis of the array of authors' key terms is needed, both in terms of how well the terms actually correspond to the main research areas of the given branch of science and in terms of the distribution of term occurrence frequencies. This article presents the results of an analysis of the occurrence frequency of authors' key terms in Russian and English, based on software processing of several thousand articles from leading Russian journals in mathematics, computer science and physics indexed in the MathNet database. The correspondence of the distributions of key terms (as phrases) and of individual words to Bradford's law was assessed, and the cores of key terms within each thematic direction were identified.
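
As an illustration only, the kind of frequency analysis described above can be sketched in Python. The abstract does not describe the authors' software, so everything here is an assumption: the function names `term_frequencies` and `bradford_zones`, the normalization (trimming and lowercasing), counting a term once per article, and the choice of three zones that each cover roughly one third of all term occurrences. Under Bradford's law, the number of terms per zone would be expected to grow roughly geometrically, with the first zone forming the core.

```python
from collections import Counter


def term_frequencies(keyword_lists):
    """Count in how many articles each author key term occurs.

    keyword_lists: iterable of lists of keyword strings, one list per article
    (a hypothetical input format, not the MathNet record layout).
    """
    counts = Counter()
    for keywords in keyword_lists:
        # Assumed normalization: trim and lowercase; a set counts each
        # term at most once per article.
        counts.update({kw.strip().lower() for kw in keywords if kw.strip()})
    return counts


def bradford_zones(counts, n_zones=3):
    """Split terms, ranked by descending frequency, into n_zones groups
    that each account for roughly 1/n_zones of all occurrences."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())
    zones = [[] for _ in range(n_zones)]
    cumulative = 0
    zone = 0
    for term, freq in ranked:
        # Move to the next zone once the occurrences seen so far reach
        # the current zone boundary (integer arithmetic avoids rounding).
        while zone < n_zones - 1 and cumulative * n_zones >= total * (zone + 1):
            zone += 1
        zones[zone].append(term)
        cumulative += freq
    return zones


# Toy data, invented purely for demonstration.
articles = [
    ["graph", "ontology"],
    ["Graph", "Bradford law"],
    ["graph", "ontology", "metadata"],
    ["graph", "keywords"],
    ["ontology", "semantic web"],
    ["graph", "thesaurus"],
]
counts = term_frequencies(articles)
zones = bradford_zones(counts)
for i, terms in enumerate(zones, start=1):
    print(f"zone {i}: {len(terms)} terms: {terms}")
```

On this toy input the first zone (the core) contains a single high-frequency term and the later zones contain progressively more low-frequency terms. Whether real keyword data follows such a pattern is exactly what the assessment against Bradford's law is meant to establish.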