Doklady Mathematics: Latest Articles

Activations and Gradients Compression for Model-Parallel Training
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701314
M. I. Rudakov, A. N. Beznosikov, Ya. A. Kholodov, A. V. Gasnikov
Vol. 108, Suppl. 2, pp. S272-S281
Abstract: Large neural networks require enormous computational clusters of machines. Model-parallel training, in which the model architecture is partitioned sequentially between workers, is a popular approach for training modern models. Information compression can be applied to decrease workers' communication time, which is often a bottleneck in such systems. This work explores how simultaneous compression of activations and gradients in a model-parallel distributed training setup affects convergence. We analyze compression methods such as quantization and TopK compression, and also experiment with error compensation techniques. Moreover, we employ TopK with the AQ-SGD per-batch error feedback approach. We conduct experiments on image classification and language model fine-tuning tasks. Our findings demonstrate that gradients require milder compression rates than activations. We observe that K = 10% is the lowest TopK compression level that does not severely harm model convergence. Experiments also show that models trained with TopK perform well only when compression is also applied during inference. We find that error feedback techniques do not improve model-parallel training compared to plain compression, but allow model inference without compression with almost no quality drop. Finally, when applied with the AQ-SGD approach, TopK compression stronger than K = 30% worsens model performance significantly.
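To make the TopK and error compensation ideas in this abstract concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not the authors' implementation: the function names `top_k` and `top_k_with_error_feedback` are hypothetical, K is treated as the fraction of entries kept, and the residual scheme shown is the classic error feedback idea rather than the AQ-SGD per-batch variant the paper uses.

```python
from typing import List, Tuple


def top_k(vector: List[float], fraction: float) -> List[float]:
    """Keep the largest-magnitude `fraction` of entries and zero out the rest."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(fraction * len(vector)))
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def top_k_with_error_feedback(
    vector: List[float], residual: List[float], fraction: float
) -> Tuple[List[float], List[float]]:
    """Compress (vector + residual); return the sent values and the new residual.

    Assumption: classic error feedback, where whatever TopK dropped is
    carried over and added to the next vector before compression.
    """
    corrected = [v + r for v, r in zip(vector, residual)]
    compressed = top_k(corrected, fraction)
    new_residual = [c - s for c, s in zip(corrected, compressed)]
    return compressed, new_residual


activations = [0.1, -3.0, 0.5, 2.0]
sent, leftover = top_k_with_error_feedback(activations, [0.0] * 4, fraction=0.5)
```

In this sketch only `sent` would cross the worker boundary, while `leftover` stays local and is passed back in as `residual` on the next step.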
Citations: 0
Neural Network Approach to the Problem of Predicting Interest Rate Anomalies under the Influence of Correlated Noise
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701521
G. A. Zotov, P. P. Lukianchenko
Vol. 108, Suppl. 2, pp. S293-S299
Abstract: The aim of this study is to analyze bifurcation points in financial models using colored noise as a stochastic component. The research investigates the impact of colored noise on change-points and an approach to detecting them with neural networks. The paper presents a literature review on the use of colored noise in complex systems. The Vasicek stochastic model of interest rates is the object of the research. The research methodology involves approximating numerical solutions of the model using the Euler–Maruyama method, calibrating model parameters, and adjusting the integration step. Methods for detecting bifurcation points and their application to the data are discussed. The study results include the outcomes of an LSTM model trained to detect change-points for models with different types of noise. Results are provided for comparison with various change-point windows and forecast step sizes.
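As a rough illustration of the numerical step mentioned in this abstract, the sketch below applies the Euler–Maruyama scheme to the Vasicek model dr = a(b - r)dt + sigma dW. It is a simplified example under assumptions: the noise here is ordinary white Gaussian noise rather than the colored noise studied in the paper, and the function name `simulate_vasicek` and all parameter values are hypothetical.

```python
import math
import random
from typing import List


def simulate_vasicek(
    r0: float, a: float, b: float, sigma: float,
    dt: float, n_steps: int, seed: int = 0,
) -> List[float]:
    """Euler-Maruyama path of dr = a*(b - r)*dt + sigma*dW (white noise assumed)."""
    rng = random.Random(seed)
    path = [r0]
    for _ in range(n_steps):
        r = path[-1]
        # Brownian increment over one step: normal with variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(r + a * (b - r) * dt + sigma * dw)
    return path


# Hypothetical parameters: mean-reversion speed a, long-run level b, volatility sigma.
rates = simulate_vasicek(r0=0.03, a=0.5, b=0.05, sigma=0.01, dt=1 / 252, n_steps=252)
```

Shrinking `dt` corresponds to the "adjusting the integration step" part of the methodology: a smaller step reduces discretization error at the cost of more iterations.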
Citations: 0
Do we Benefit from the Categorization of the News Flow in the Stock Price Prediction Problem?
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701648
T. D. Kulikova, E. Yu. Kovtun, S. A. Budennyy
Vol. 108, Suppl. 2, pp. S503-S510
Abstract: The power of machine learning is widely leveraged in the task of company stock price prediction. It is essential to incorporate historical stock prices and relevant external world information to construct a more accurate predictive model. The sentiments of the financial news connected with the company can be such valuable knowledge. However, financial news covers different topics, such as Macro, Markets, or Product news. Adopting such a categorization is usually out of scope in market research. In this work, we aim to close this gap and explore the effect of capturing news topic differentiation in the stock price prediction problem. Initially, we classify the financial news stream into 20 pre-defined topics with a pre-trained model. Then, we obtain sentiments and explore sentiment labeling for news topic groups. Moreover, we conduct experiments with several well-established models for time series forecasting, including the Temporal Convolutional Network (TCN), the D-Linear, the Transformer, and the Temporal Fusion Transformer (TFT). Our results show that utilizing the information from separate topic groups contributes to better performance of deep learning models compared to the approach in which all news sentiments are considered without any division.
Citations: 0
Machine Learning As a Tool to Accelerate the Search for New Materials for Metal-Ion Batteries
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701612
V. T. Osipov, M. I. Gongola, Ye. A. Morkhova, A. P. Nemudryi, A. A. Kabanov
Vol. 108, Suppl. 2, pp. S476-S483
Abstract: The search for new solid ionic conductors is an important topic of materials science that requires significant resources, but can be accelerated using machine learning (ML) techniques. In this work, ML methods were applied to predict the migration energy of working ions. The training set is based on data on 225 lithium ion migration channels in 23 ion conductors. The descriptors were the parameters of free space in the crystal obtained by the Voronoi partitioning method. The accuracy of migration energy prediction was evaluated by comparison with the data obtained by the density functional theory method. Two ML methods were applied in the work: support vector regression and ordinal regression. It is shown that the parameters of free space in a crystal correlate with the migration energy, and that the best results are obtained by ordinal regression. The developed ML models can be used as an additional filter in the analysis of ionic conductivity in solids.
Citations: 0
Statistical Online Learning in Recurrent and Feedforward Quantum Neural Networks
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701557
S. V. Zuev
Vol. 108, Suppl. 2, pp. S317-S324
Abstract: For adaptive artificial intelligence systems, the question of whether online learning is possible is especially important, since such training provides adaptation. The purpose of the work is to consider methods of quantum machine online learning for the two most common architectures of quantum neural networks: feedforward and recurrent. The work uses the quantumz module available on PyPI to emulate quantum computing and create artificial quantum neural networks. In addition, the genser module is used to transform data dimensions, which provides reversible transformation of dimensions without loss of information. The data for the experiments are taken from open sources. The paper implements the machine learning method without optimization proposed earlier by the author. Online learning algorithms for recurrent and feedforward quantum neural networks are presented and experimentally confirmed. The proposed learning algorithms can be used as data science tools, as well as a part of adaptive intelligent control systems. The developed software can fully unleash its potential only on quantum computers, but, in the case of a small number of quantum registers, it can also be used in systems that emulate quantum computing, or in photonic computers.
Citations: 0
MTS Kion Implicit Contextualised Sequential Dataset for Movie Recommendation
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701594
I. Safilo, D. Tikhonovich, A. V. Petrov, D. I. Ignatov
Vol. 108, Suppl. 2, pp. S456-S464
Abstract: We present a new movie and TV show recommendation dataset collected from the real users of the MTS Kion video-on-demand platform. In contrast to other popular movie recommendation datasets, such as MovieLens or Netflix, our dataset is based on the implicit interactions registered at the watching time, rather than on explicit ratings. We also provide rich contextual and side information, including interaction characteristics (such as temporal information, watch duration and watch percentage), user demographics and rich movie meta-information. In addition, we describe the MTS Kion Challenge, an online recommender systems challenge that was based on this dataset, and provide an overview of the best performing solutions of the winners. We keep the competition sandbox open, so researchers are welcome to try their own recommendation algorithms and measure the quality on the private part of the dataset.
Citations: 0
Optimal Data Splitting in Distributed Optimization for Machine Learning
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701600
D. Medyakov, G. Molodtsov, A. Beznosikov, A. Gasnikov
Vol. 108, Suppl. 2, pp. S465-S475
Abstract: The distributed optimization problem has become increasingly relevant recently. It has many advantages, such as processing a large amount of data in less time compared to non-distributed methods. However, most distributed approaches suffer from a significant bottleneck: the cost of communications. Therefore, a large amount of research has recently been directed at solving this problem. One such approach uses local data similarity. In particular, there exists an algorithm that provably exploits the similarity property optimally. But this result, like results from other works, addresses the communication bottleneck by focusing only on the fact that communication is significantly more expensive than local computing, and does not take into account the varying capacities of network devices or the different relationships between communication time and local computing expenses. We consider this setup, and the objective of this study is to achieve an optimal ratio of distributed data between the server and local machines for any costs of communications and local computations. The running times of the network are compared between uniform and optimal distributions. The superior theoretical performance of our solutions is experimentally validated.
Citations: 0
1-Dimensional Topological Invariants to Estimate Loss Surface Non-Convexity
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701569
D. S. Voronkova, S. A. Barannikov, E. V. Burnaev
Vol. 108, Suppl. 2, pp. S325-S332
Abstract: We utilize the framework of topological data analysis to examine the geometry of the loss landscape. Using topology and Morse theory, we propose to analyse 1-dimensional topological invariants as a measure of loss function non-convexity up to arbitrary re-parametrization. The proposed approach uses optimization of 2-dimensional simplices in network weight space and allows both qualitative and quantitative evaluation of the loss landscape, giving insights into the behavior and optimization of neural networks. We provide a geometrical interpretation of the topological invariants and describe the algorithm for their computation. We expect that the proposed approach can complement the existing tools for analysis of the loss landscape and shed light on unresolved issues in the field of deep learning.
Citations: 0
Safe Pretraining of Deep Language Models in a Synthetic Pseudo-Language
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701636
T. E. Gorbacheva, I. Y. Bondarenko
Vol. 108, Suppl. 2, pp. S494-S502
Abstract: This paper compares the pretraining of a transformer on natural language texts and on sentences of a synthetic pseudo-language. The artificial texts are automatically generated according to rules written as a context-free grammar. The results of fine-tuning on tasks of the RussianSuperGLUE project showed, with statistical reliability, that the models had the same scores. That is, the use of artificial texts facilitates AI safety, because the composition of the dataset can be completely controlled. In addition, at the pretraining stage of a RoBERTa-like model, it is enough to learn to recognize only the syntactic and morphological patterns of the language, which can be successfully created in a fairly simple way, such as with a context-free grammar.
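The generation step described in this abstract, producing sentences from context-free grammar rules, can be sketched as follows. This is a toy illustration only: the grammar `TOY_GRAMMAR`, its invented words, and the function `generate` are all hypothetical and are not taken from the paper.

```python
import random
from typing import Dict, List, Optional

Grammar = Dict[str, List[List[str]]]

# Hypothetical toy grammar: keys are non-terminals, and each value lists the
# alternative productions. Symbols that are not keys are terminal words.
TOY_GRAMMAR: Grammar = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["ta"], ["ko"]],
    "N": [["miru"], ["sabo"]],
    "V": [["lenti"], ["parda"]],
}


def generate(
    grammar: Grammar, symbol: str = "S", rng: Optional[random.Random] = None
) -> List[str]:
    """Expand `symbol` recursively by picking one production at random."""
    if rng is None:
        rng = random.Random()
    if symbol not in grammar:
        return [symbol]  # terminal word: emit as-is
    production = rng.choice(grammar[symbol])
    words: List[str] = []
    for part in production:
        words.extend(generate(grammar, part, rng))
    return words


sentence = generate(TOY_GRAMMAR, "S", random.Random(0))
text = " ".join(sentence)
```

Because every word comes from the grammar's terminal set, the full vocabulary and structure of such a corpus is known in advance, which is the controllability property the abstract ties to AI safety.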
Citations: 0
Optimal Analysis of Method with Batching for Monotone Stochastic Finite-Sum Variational Inequalities
IF 0.5, Zone 4, Mathematics
Doklady Mathematics Pub Date : 2024-03-25 DOI: 10.1134/S1064562423701582
A. Pichugin, M. Pechin, A. Beznosikov, A. Savchenko, A. Gasnikov
Vol. 108, Suppl. 2, pp. S348-S359
Abstract: Variational inequalities are a universal optimization paradigm that is interesting in itself, but also incorporates classical minimization and saddle point problems. Modern realities encourage the consideration of stochastic formulations of optimization problems. In this paper, we present an analysis of a method that gives optimal convergence estimates for monotone stochastic finite-sum variational inequalities. In contrast to previous works, our method supports batching and does not lose the oracle complexity optimality. The effectiveness of the algorithm, especially in the case of small but not single batches, is confirmed experimentally.
Citations: 0