Comparative Analysis of Scientific Journals Collections

Q3 Mathematics

SPIIRAS Proceedings Pub Date : 2019-06-05 DOI:10.15622/SP.2019.18.3.766-792

F. Krasnov, M. Shvartsman, A. Dimentov


Citations: 1

Abstract

The authors developed an approach to the comparative analysis of scientific journal collections based on analysis of the co-authorship graph and a text model. Time series of co-authorship graph metrics allowed the authors to analyze trends in the development of the journals' authorship. The text model was built using machine learning techniques. Journal content was classified with the text model to determine the degree of authenticity of different journals and of different issues of a single journal. The authors developed a metric, the Content Authenticity Ratio, which quantifies the authenticity of the journal collections being compared. Comparative thematic analysis of the journal collections was carried out using a topic model with additive regularization. Based on this topic model, the authors constructed thematic profiles of the journal archives in a single thematic basis. The approach was applied to the archives of two rheumatology journals for the period 2000–2018. Public data sets from the SNAP research laboratory at Stanford University were used as a benchmark for comparing the co-authorship metrics. As a result, the authors adapted existing examples of effectively functioning author collaborations in order to improve the work of journal editorial staff. A quantitative comparison of large volumes of texts and metadata of scientific articles was carried out. The experiment conducted with the developed methods showed that the content authenticity of the selected journals is 89% and that co-authorship in one of the journals has pronounced centrality, which is a distinctive feature of its editorial policy. The clarity and consistency of the results confirm the effectiveness of the proposed approach. The Python code developed in the course of the experiment can be used for comparative analysis of other collections of Russian-language journals.
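The abstract does not say which co-authorship graph metrics were tracked over time, so the following is only a minimal sketch of the general idea, assuming Freeman degree centralization as a stand-in metric. The function names (`coauthor_graph`, `degree_centralization`, `centralization_by_year`) and the toy `RECORDS` are hypothetical and are not taken from the authors' code or data.

```python
from collections import defaultdict
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected co-authorship graph as {author: set(co-authors)}."""
    graph = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for name in unique:
            graph[name]  # register single-author papers as isolated nodes
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def degree_centralization(graph):
    """Freeman degree centralization: 0 for a regular graph, 1 for a star."""
    n = len(graph)
    if n < 3:
        return 0.0
    degrees = [len(neighbors) for neighbors in graph.values()]
    max_degree = max(degrees)
    return sum(max_degree - d for d in degrees) / ((n - 1) * (n - 2))


def centralization_by_year(records):
    """Map each publication year to the centralization of that year's graph."""
    by_year = defaultdict(list)
    for year, authors in records:
        by_year[year].append(authors)
    return {
        year: degree_centralization(coauthor_graph(papers))
        for year, papers in sorted(by_year.items())
    }


# Hypothetical toy records: (year, author list). Not data from the paper.
RECORDS = [
    (2000, ["A", "B"]),
    (2000, ["C", "D"]),
    (2001, ["A", "B"]),
    (2001, ["A", "C"]),
    (2001, ["A", "D"]),
]

if __name__ == "__main__":
    for year, value in centralization_by_year(RECORDS).items():
        print(year, round(value, 3))
```

In this toy data the 2001 graph is a star around author "A", which is the kind of pronounced centrality the abstract attributes to one of the journals. A real analysis would read author lists from the journal metadata and would likely use a graph library rather than plain dictionaries.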




Source journal

SPIIRAS Proceedings Mathematics-Applied Mathematics

CiteScore: 1.90

Self-citation rate: 0.00%

Review time: 14 weeks

About the journal: SPIIRAS Proceedings publishes scientific, educational, and popular-science papers on computer science, automation, applied mathematics, and interdisciplinary research, as well as information technology, the theoretical foundations of computer science (both mathematical and those related to other scientific disciplines), information security and information protection, decision making and artificial intelligence, mathematical modeling, and informatization.