{"title":"Comparative Analysis of Scientific Journals Collections","authors":"F. Krasnov, M. Shvartsman, A. Dimentov","doi":"10.15622/SP.2019.18.3.766-792","DOIUrl":null,"url":null,"abstract":"The authors developed an approach to comparative analysis of scientific journals collections based on the analysis of co-authors graph and the text model. The use of time series of co-authorship graphs metrics allowed the authors to analyze trends in the development of journal authors. The text model was built using machine learning techniques. The journals content was classified to determine the authenticity degree of various journals and different issues of a single journal via a text model. The authors developed a metric of Content Authenticity Ratio, which allows quantifying the authenticity of journal collections in comparison. Comparative thematic analysis of journals collections was carried out using the thematic model with additive regularization. Based on the created thematic model, the authors constructed thematic profiles of the journals archives in a single thematic basis. The approach developed by the authors was applied to archives of two journals on the Rheumatology for the period 2000–2018. As a benchmark for comparing the co-author’s metrics, public data sets from the SNAP research laboratory at Stanford University were used. As a result, the authors adapted the existing examples of the effective functioning of the authors collaborations in order to improve the work of journals editorial staff. Quantitative comparison of large volumes of texts and metadata of scientific articles was carried out. As a result of the experiment conducted using the developed methods, it was shown that the content authenticity of the selected journals is 89%, co-authorships in one of the journals have a pronounced centrality, which is a distinctive feature of the policy editor. The clarity and consistency of the results confirm the effectiveness of the approach proposed by the authors. The code developed in the course of the experiment in the Python programming language can be used for comparative analysis of other collections of journals in the Russian language.","PeriodicalId":53447,"journal":{"name":"SPIIRAS Proceedings","volume":"29 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SPIIRAS Proceedings","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15622/SP.2019.18.3.766-792","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 1
Abstract
The authors developed an approach to comparative analysis of scientific journals collections based on the analysis of co-authors graph and the text model. The use of time series of co-authorship graphs metrics allowed the authors to analyze trends in the development of journal authors. The text model was built using machine learning techniques. The journals content was classified to determine the authenticity degree of various journals and different issues of a single journal via a text model. The authors developed a metric of Content Authenticity Ratio, which allows quantifying the authenticity of journal collections in comparison. Comparative thematic analysis of journals collections was carried out using the thematic model with additive regularization. Based on the created thematic model, the authors constructed thematic profiles of the journals archives in a single thematic basis. The approach developed by the authors was applied to archives of two journals on the Rheumatology for the period 2000–2018. As a benchmark for comparing the co-author’s metrics, public data sets from the SNAP research laboratory at Stanford University were used. As a result, the authors adapted the existing examples of the effective functioning of the authors collaborations in order to improve the work of journals editorial staff. Quantitative comparison of large volumes of texts and metadata of scientific articles was carried out. As a result of the experiment conducted using the developed methods, it was shown that the content authenticity of the selected journals is 89%, co-authorships in one of the journals have a pronounced centrality, which is a distinctive feature of the policy editor. The clarity and consistency of the results confirm the effectiveness of the approach proposed by the authors. The code developed in the course of the experiment in the Python programming language can be used for comparative analysis of other collections of journals in the Russian language.
期刊介绍:
The SPIIRAS Proceedings journal publishes scientific, scientific-educational, scientific-popular papers relating to computer science, automation, applied mathematics, interdisciplinary research, as well as information technology, the theoretical foundations of computer science (such as mathematical and related to other scientific disciplines), information security and information protection, decision making and artificial intelligence, mathematical modeling, informatization.