{"title":"Solution of an Inverse Problem of Optical Spectroscopy Using Kolmogorov-Arnold Networks","authors":"G. Kupriyanov, I. Isaev, K. Laptinskiy, T. Dolenko, S. Dolenko","doi":"10.3103/S1060992X24700747","DOIUrl":"10.3103/S1060992X24700747","url":null,"abstract":"<p>Kolmogorov-Arnold Networks (KAN), introduced in May 2024, are a novel type of artificial neural network whose abilities and properties are now being actively investigated by the machine learning community. In this study, we test the application of KAN to an inverse problem arising in the development of multimodal carbon luminescent nanosensors of ions dissolved in water, including heavy metal cations. We compare the results of solving this problem with four different machine learning methods: random forest, gradient boosting over decision trees, multi-layer perceptron neural networks, and KAN. Advantages and disadvantages of KAN are discussed, and it is demonstrated that KAN has a high chance of becoming one of the algorithms most recommended for solving highly non-linear regression problems with a moderate number of input features.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S475 - S482"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mastering Long-Context Multi-Task Reasoning with Transformers and Recurrent Memory","authors":"A. Bulatov, Y. Kuratov, M. Burtsev","doi":"10.3103/S1060992X24700735","DOIUrl":"10.3103/S1060992X24700735","url":null,"abstract":"<p>Recent advancements have significantly improved the skills and performance of language models, but have also increased computational demands due to the increasing number of parameters and the quadratic complexity of the attention mechanism. As context sizes expand into millions of tokens, making long-context processing more accessible and efficient becomes a critical challenge. Furthermore, modern benchmarks such as BABILong [1] underscore the inefficiency of even the most powerful LLMs in long context reasoning. In this paper, we employ finetuning and multi-task learning to train a model capable of mastering multiple BABILong long-context reasoning skills. We demonstrate that even models with fewer than 140 million parameters can outperform much larger counterparts by learning multiple essential tasks simultaneously. By conditioning Recurrent Memory Transformer [2] on task description, we achieve state-of-the-art results on multi-task BABILong QA1–QA5 set for up to 32k tokens. The proposed model also shows generalization abilities to new lengths and tasks, along with increased robustness to input perturbations.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S466 - S474"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sea-SHINE: Semantic-Aware 3D Neural Mapping Using Implicit Representations","authors":"V. Bezuglyj, D. A. Yudin","doi":"10.3103/S1060992X24700711","DOIUrl":"10.3103/S1060992X24700711","url":null,"abstract":"<p>Semantic-aware mapping is crucial for advancing robotic navigation and interaction within complex environments. Traditional 3D mapping techniques primarily capture geometric details, missing the semantic richness necessary for autonomous systems to understand their surroundings comprehensively. This paper presents Sea-SHINE, a novel approach that integrates semantic information within a neural implicit mapping framework for large-scale environments. Our method enhances the utility and navigational relevance of 3D maps by embedding semantic awareness into the mapping process, allowing robots to recognize, understand, and reconstruct environments effectively. The proposed system leverages dual decoders and a semantic awareness module, which utilizes Feature-wise Linear Modulation (FiLM) to condition mapping on semantic labels. Extensive experiments on datasets such as SemanticKITTI, KITTI-360, and ITLP-Campus demonstrate significant improvements in map precision and recall, particularly in recognizing crucial objects like road signs. Our implementation bridges the gap between geometric accuracy and semantic understanding, fostering a deeper interaction between robots and their operational environments. The code is publicly available at https://github.com/VitalyyBezuglyj/Sea-SHINE.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S445 - S456"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Streptococci Recognition in Microscope Images Using Taxonomy-based Visual Features","authors":"A. Samarin, A. Savelev, A. Toropov, A. Nazarenko, A. Motyko, E. Kotenko, A. Dozorceva, A. Dzestelova, E. Mikhailova, V. Malykh","doi":"10.3103/S1060992X24700693","DOIUrl":"10.3103/S1060992X24700693","url":null,"abstract":"<p>This study explores the development of classifiers for microbial images, specifically focusing on streptococci captured via microscopy of live samples. Our approach uses AutoML-based techniques and automates the creation and analysis of feature spaces to produce optimal descriptors for classifying these microscopic images. This technique leverages interpretable taxonomic features based on the external geometric attributes of various microorganisms. We have released an annotated dataset we assembled to validate our solution, featuring microbial images from unfixed microscopic scenes. Additionally, we assessed the classification performance of our method against several classifiers, including those employing deep neural networks. Our approach outperformed all others tested, achieving the highest Precision (0.980), Recall (0.979), and F1-score (0.980).</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S424 - S434"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classifier of Motor EEG Images for Real Time BCI","authors":"L. A. Stankevich, S. A. Kolesov","doi":"10.3103/S1060992X24700772","DOIUrl":"10.3103/S1060992X24700772","url":null,"abstract":"<p>This work develops a classifier of motor activity patterns based on electroencephalograms (EEG) for a real-time brain-computer interface (BCI) that can be used in contactless control systems. Studies of various methods for classifying motor EEG images have shown that their effectiveness depends significantly on how the information-processing stages of the BCI are implemented. The most effective classification method turned out to be the support vector machine. However, its long operating time and insufficient accuracy make it difficult to use in a real-time BCI. Therefore, a classifier was developed using an ensemble of detectors, each trained to recognize its own motor EEG image. A new EEG analysis algorithm based on event functions was applied. A study of the classifier showed that a detection accuracy of 98.5% can be achieved with an interface delay of 230 ms.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S497 - S503"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Role of the Transcendental in the Life of Artificial Beings","authors":"V. B. Kotov, Z. B. Sokhova","doi":"10.3103/S1060992X24700760","DOIUrl":"10.3103/S1060992X24700760","url":null,"abstract":"<p>The paper considers the effect of transcendental factors on the behavior of artificial beings, agents, and robots. The boundary between the rational and the transcendental depends on the type of individual or, more specifically, on its sensory and intellectual abilities. For agents and simple robots, everything except tangible elements of the environment and manipulations with them is transcendental. Transcendental factors such as environmental changes and algorithm modifications determined by the programmer, supervisor, or operator have significant effects on agents and communities of agents. Hardware malfunctions (transcendental events from an agent’s point of view) can be crucial for agents. Agents can take advantage of transcendental effects if the programmer implements feedback. Generating a mental copy of an agent for making new agents allows continuity and social development. For intellectual robots, the boundaries of the transcendental move away because of their ability to adapt to new environments. However, in most cases the role of the transcendental even increases as robots improve, because of the emergence of consciousness and the growth of communication possibilities. Consciousness changes as a result of learning transcendental information, making robots change their behavior. A robot’s communication abilities enable transcendental (along with rational) information to be received from the database in any amount. For people living together with intelligent robots, this sort of communication can become a tool for introducing human culture into the community of robots. This in turn would result in the humanization of robots and the establishment of good relations between robots and human beings.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S490 - S496"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Robust Adversarial Model against Evasion Attacks on Intrusion Detection Systems","authors":"R. N. Anaedevha, A. G. Trofimov","doi":"10.3103/S1060992X24700681","DOIUrl":"10.3103/S1060992X24700681","url":null,"abstract":"<p>This research develops improved Robust Adversarial Models (RAM) to enhance the robustness of Intrusion Detection Systems (IDS) against evasion attacks. Malicious packets crafted using Scapy were injected into the open-source datasets NSL-KDD and CICIDS obtained from Kaggle. Experiments involved passing this traffic through a baseline IDS model, such as the free open-source IDS Snort, and through the improved RAM. Training employed perturbations generated with Generative Adversarial Networks (GAN), the Fast Gradient Sign Method (FGSM), and Projected Gradient Descent (PGD) against reinforcement learning of features and labels from the autoencoder model. The robust adversarial model showed 34.52% higher accuracy, 59.06% higher F1-score, and 85.26% higher recall than the baseline IDS Snort model across datasets. Comparative analysis demonstrated the improved RAM’s enhanced resilience, performance, and reliability in real-world scenarios, advancing the security posture of IDS models and network infrastructures.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S414 - S423"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Curriculum Learning: Optimizing Reinforcement Learning through Dynamic Task Sequencing","authors":"M. Nesterova, A. Skrynnik, A. Panov","doi":"10.3103/S1060992X2470070X","DOIUrl":"10.3103/S1060992X2470070X","url":null,"abstract":"<p>Curriculum learning in reinforcement learning utilizes a strategy that sequences simpler tasks in order to optimize the learning process for more complex problems. Typically, existing methods are categorized into two distinct approaches: one that develops a teacher (a curriculum strategy) policy concurrently with a student (a learning agent) policy, and another that utilizes selective sampling based on the student policy’s experiences across a task distribution. The main issue with the first approach is the substantial computational demand, as it requires simultaneous training of both the low-level (student) and high-level (teacher) reinforcement learning policies. On the other hand, methods based on selective sampling presuppose that the agent is capable of maximizing reward accumulation across all tasks, which may lead to complications when the primary mission is to master a specific target task. This makes those models less effective in scenarios requiring focused learning. Our research addresses a particular scenario where a teacher needs to train a new student in a new short episode. This constraint compels the teacher to rapidly master curriculum planning by identifying the most appropriate tasks. We evaluated our framework across several complex scenarios, including a partially observable grid-world navigation environment and the procedurally generated open-world environment Crafter.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S435 - S444"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpretable Sentiment Analysis and Text Segmentation for Chinese Language","authors":"Hou Zhenghao, A. Kolonin","doi":"10.3103/S1060992X24700759","DOIUrl":"10.3103/S1060992X24700759","url":null,"abstract":"<p>In this paper, we explored the performance of interpretable sentiment analysis models built from different component combinations for Chinese text in social media. We ran experiments to study how performance varies with the combination of segmentation strategy and dictionary of words or n-grams. We found that with a well-chosen combination of segmentation strategy and dictionary of words or n-grams, the results improve and can surpass the performance of ordinary sentiment analysis models for the Chinese language. This shows the importance of selecting the segmentation strategy and dictionary for sentiment analysis of Chinese text.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S483 - S489"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Learning for Single Photo 3D Reconstruction of Cultural Heritage","authors":"V. Kniaz, V. Knyaz, T. Skrypitsyna, P. Moshkantsev, A. Bordodymov","doi":"10.3103/S1060992X24700723","DOIUrl":"10.3103/S1060992X24700723","url":null,"abstract":"<p>In this paper, we propose a new single-photo 3D reconstruction model, <span>DiffuseVoxels</span>, focused on 3D inpainting of destroyed parts of a building. We use a frustum voxel model 3D reconstruction pipeline as the starting point for our research. Our main contribution is an iterative estimation of the destroyed parts from Gaussian noise, inspired by diffusion models. Our approach is twofold. First, we mask the destroyed region in the input 2D image with Gaussian noise. Second, we remove the noise over many iterations to improve the 3D reconstruction. The resulting model is represented as a semantic frustum voxel model, where each voxel represents the class of the reconstructed scene. Unlike classical voxel models, where each unit represents a cube, frustum voxel models divide the scene space into trapezium-shaped units. This approach allows us to keep a direct contour correspondence between the input 2D image, the input 3D feature maps, and the output 3D frustum voxel model.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S457 - S465"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}