{"title":"A Method for Evaluating the Informative Value of Arguments of a Nonparametric Stochastic Dependence Model with Their Specific Values","authors":"A. V. Lapko, V. A. Lapko","doi":"10.3103/S0005105525700694","DOIUrl":"10.3103/S0005105525700694","url":null,"abstract":"<p>A method for evaluating the informative value of arguments for unambiguous stochastic dependence at their specific values under conditions of a priori uncertainty is described. Taking into account the asymptotic properties of a nonparametric collective, a consistent procedure for forming its structure is proposed. The considered collective, by contrast with traditional nonparametric regression, takes into account not only the information contained in the observations of the variables of the reconstructed dependence but also the relationships between them. The peculiarity of the nonparametric collective of linear approximations of the desired dependence is the possibility of its representation in a form sufficient to assess the informative value of arguments according to their specific values. From these positions, a criterion for ranking the arguments of the function being restored according to their significance is defined.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"252 - 255"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230263","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language Models for Texts Preprocessing in Machine Translation","authors":"A. V. Mylnikova, L. A. Mylnikov","doi":"10.3103/S0005105525700645","DOIUrl":"10.3103/S0005105525700645","url":null,"abstract":"<p>This paper examines a model for the use of syntactic parsing-based text skeleton structures for the preprocessing of text corpora before they are transferred to MT neural network models to enhance their performance quality. In the paper, a model is suggested for text corpora, which is based on parts-of-speech (POS) tagging and syntactic parsing; this model is implemented on BERT network-based language model and a set of rules. A limited POS tagging dataset is taken in this paper to describe how data are prepared for the training of the model and how its efficiency performance can be improved. POS tagging is used in the paper to obtain syntactic parsing and determine the type of a sentence and word order changes according to the predefined rules. The application of the model, suggested in the paper, together with the MT language models Google and Yandex, allowed MT quality metrics to be increased by 0.1–0.23 according to BLEU and TER for Russian–English and German–English language pairs.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"256 - 268"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Modeling in Quantum Optics (Correction of Errors)","authors":"N. V. Serov","doi":"10.3103/S0005105525700669","DOIUrl":"10.3103/S0005105525700669","url":null,"abstract":"<p>The author has presented possibilities for mathematical approximation and the dimension theory in computer science. The received formulas have led to a conclusion concerning existence of energetically-information equivalents thanks to which the theory of linearization of the nonlinear data and continuum digitization have received substantiation. The author has processed the obtained data from the perspective of the theory of dimension, and also has discussed hypotheses concerning an “error method” of chemical elements with negative mass.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"217 - 230"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tradition of Data-Intensive Use: The Example of Domain Thermophysics. Methods and Algorithms","authors":"A. O. Erkimbaev, V. Yu. Zitserman, G. A. Kobzev","doi":"10.3103/S0005105525700682","DOIUrl":"10.3103/S0005105525700682","url":null,"abstract":"<p>Using the example of thermophysics, the evolution of approaches to working with scientific data on substances and materials properties is traced. This paper shows that across all stages, thermophysics can be classified as a data-intensive science, characterized by a focus on working with data, including its storage, its organization, and the extraction of meaningful information. It presents improvements in processing methods that are associated with the use of new information technologies, including machine learning techniques. Their potential for the field of thermodynamics relative to traditional statistical methods is analyzed. In this context, the general issue of the relationship between statistics and data science which has generated extensive debate in literature and online is discussed. All the authors’ conclusions are based on an analysis of specific issues related to the prediction of the properties of substances and constructing the equation of state and thermodynamic models for multicomponent systems.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"231 - 251"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing an Approach for Automated Data Collection and Mining Using Web Scraping Techniques and Large Language Models: A Case Study on Extracting Technology Readiness Level Assessments","authors":"F. M. Grozovskiy, I. V. Loginova","doi":"10.3103/S0005105525700670","DOIUrl":"10.3103/S0005105525700670","url":null,"abstract":"<p>The paper proposes an approach to the automated extraction and structuring of information from text, combining web scraping for data collection from online sources with a large language model for subsequent data mining. As a case study, texts from news publications on technology readiness levels from the CNews website were chosen to test the developed methodology in a specific domain. The model’s accuracy in identifying technology readiness assessments was 84–85%, which is comparable to similar results in other, less specialized tasks.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"269 - 278"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Invariant-Framework Multilevel Model for Representation of Databases’ Subject Lists","authors":"A. N. Rodionov","doi":"10.3103/S0005105525700657","DOIUrl":"10.3103/S0005105525700657","url":null,"abstract":"<p>The subject domain, which can contain from one to several dozen or even hundreds (for large-scale systems) of subject lists, is a mandatory, core component of the vast majority of existing databases, around which all other components of the latter are built. The multilevel representation of objects in such databases is a standard practice aimed to simultaneously reflect both the numerous classifications that accompany subject lists and the key, basic relationship of materialization, which connects prototypes, models of subjects, and their instances. In spite of the existence of many methods for configuring subject schemes, the question of the invariance of such constructions is still relevant. In this paper, we show that the relationship of materialization is not the only constant in subject schemes. Multilevel models must include several standard relationships, such as relationships of variability and personification, which, together with the relationship of materialization, form an invariant framework of the subject scheme.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 4","pages":"201 - 216"},"PeriodicalIF":0.5,"publicationDate":"2025-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145230174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Media Rubricator as a Tool for Effective Information Retrieval","authors":"N. G. Inshakova, I. S. Shtyrnik","doi":"10.3103/S0005105525700578","DOIUrl":"10.3103/S0005105525700578","url":null,"abstract":"<p>This article examines the concept of the <i>rubricator</i> (taxonomy) and its functions in media practice. The main shortcomings of content categorization in online media for mass audiences are identified. The role of the rubricator as a tool of information retrieval in the digital environment is emphasized.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 3","pages":"137 - 144"},"PeriodicalIF":0.5,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144905184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some Aspects of Working with Incomplete Information in Intelligent Data Analysis Systems","authors":"S. M. Gusakova","doi":"10.3103/S000510552570061X","DOIUrl":"10.3103/S000510552570061X","url":null,"abstract":"<p>Some possibilities of eliminating the consequences of the incompleteness of information in data analysis systems are considered, including through the exchange of tools between systems based on different theories.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 3","pages":"160 - 165"},"PeriodicalIF":0.5,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144905170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical Methods for Assessing the Competence of Experts: Application in the Selection of Documents in the Library Fund","authors":"I. V. Timoshenko","doi":"10.3103/S0005105525700633","DOIUrl":"10.3103/S0005105525700633","url":null,"abstract":"<p>The advantages and disadvantages of experts’ assessments of other experts’ competence in the task of selecting documents for library funds are analyzed, and it is proposed to use the methods of decision theory to improve the selection’s objectivity and quality. Statistical methods of assessing the experts’ competence used in libraries are described, allowing the assessment of the level of experts’ qualifications and the tracking of the dynamics of their competence in the process of conducting a collective examination by means of ILS. The importance of monitoring and adjusting the values of experts’ competence is shown to ensure a high level of quality in document selection, which will necessarily contribute to increasing the efficiency of library collections and the formation of high-quality collections that meet the needs of users.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 3","pages":"194 - 199"},"PeriodicalIF":0.5,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144905171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Quantum Transition in Cryptography through the Lens of Bibliometrics","authors":"A. I. Terekhov","doi":"10.3103/S0005105525700591","DOIUrl":"10.3103/S0005105525700591","url":null,"abstract":"<p>Cryptography, which has a history going back many centuries, is currently experiencing a transition from the classical paradigm to a quantum one. Using bibliometrics, the dynamics of development and the ratio between three branches of cryptography—classical, postquantum, and quantum—in the period 1990–2020 are considered; the restructuring of the crypto-sphere in favor of the quantum paradigm since 2015 is noted. The world leaders in quantum cryptography research are industrialized countries, including the G7 group and China, and at the institutional level, leading universities, academic organizations, technology institutes, as well as large corporations, especially from Japan and the United States, and military research institutions, especially from the United States and China. The features of the quantum-cryptographic ecosystems of several countries are identified. It is shown that, in Russia, research is concentrated in the academic sector, universities and specialized organizations—the Academy of Cryptography of the Russian Federation, the Russian Quantum Center. By contrast with leading countries, the contribution of domestic corporations and small businesses is still limited. Geopolitical aspects of creating quantum-cryptographic security have been discussed. The Web of Science Core Collection database is used as a source for bibliometric analysis.</p>","PeriodicalId":42995,"journal":{"name":"AUTOMATIC DOCUMENTATION AND MATHEMATICAL LINGUISTICS","volume":"59 3","pages":"125 - 136"},"PeriodicalIF":0.5,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144905186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}