R. Sreemathy, Param Chordiya, Soumya Khurana, Mousami Turuk
{"title":"Sign Language Video Generation from Text Using Generative Adversarial Networks","authors":"R. Sreemathy, Param Chordiya, Soumya Khurana, Mousami Turuk","doi":"10.3103/S1060992X24700851","DOIUrl":"10.3103/S1060992X24700851","url":null,"abstract":"<p>This work presents a technique developed by utilizing Generative Adversarial Networks (GANs) to generate Sign Language videos. Sign Language is the main mode of communication for people in the hearing impaired community. The process of teaching sign language is difficult as there are not a lot of tools available for this purpose. Generative artificial intelligence can be very helpful for this task as it is able to learn from the limited data and is able to generate various images and videos. In this work, Conditional GANs (cGANs) were employed to generate videos for Indian Sign Language (ISL) based on a text input. It is found that the results obtained from cGANs exhibit superior quality and control based on the performance metrics such as SSIM, FID and MSE values. The effectiveness of the cGANs in generating accurate and visually appealing sign language videos highlights their potential for teaching sign language and improving sign language communication systems.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"466 - 476"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advanced Attention-Based Pre-Trained Transfer Learning Model for Accurate Brain Tumor Detection and Classification from MRI Images","authors":"A. Priya, V. Vasudevan","doi":"10.3103/S1060992X24700863","DOIUrl":"10.3103/S1060992X24700863","url":null,"abstract":"<p>Brain tumor identification using MRI images involves the detailed examination of brain tissues to detect and characterize tumors. Conventional ML and DL algorithms sometimes encounter difficulties due to a lack of labelled data, resulting in inferior performance and poor generalization. To address these issues, this study introduces an Advanced Attention-based Pre-trained Transfer Learning (TL) model that enhances accuracy and resilience in identifying and categorizing brain tumors using MRI images. The methodology starts with pre-processing, which includes image scaling and noise reduction with an adaptive median filter. After pre-processing, the images are fed into a CNN-based framework called Pre-trained Attention-fused Image SpectraNet. This framework comprises five convolutional layers, after which Rectified Linear Unit (ReLU) activation and pooling layers are added to learn progressively more complex features. A novel self-attention layer is implemented to capture deep features that reveal aberrant tissue patterns, hence increasing model interpretability and accuracy. A global average pooling layer is employed to reduce computational complexity, and it is accompanied by a fully connected layer with batch normalization to ensure stability and convergence during training. The last layer uses softmax to classify images as normal, pituitary, glioma, or meningioma. Utilizing the Adam optimizer, the suggested approach enhances performance, yielding excellent metrics such as 98.33% accuracy, 98.35% precision, 98.28% recall, and a 98.31% F1-score. These measures show considerable increases over existing ML and DL methods, demonstrating the system’s ability to improve brain tumor detection accuracy. The advancement of these treatments has significant implications for medical professionals who specialize in the timely identification of brain tumors.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"477 - 491"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hariharan Ramesh, Faridoddin Shariaty, Sanjiban Sekhar Roy
{"title":"IFDRF: Advancing Anomaly Detection with a Hybrid Machine Learning Model","authors":"Hariharan Ramesh, Faridoddin Shariaty, Sanjiban Sekhar Roy","doi":"10.3103/S1060992X24700474","DOIUrl":"10.3103/S1060992X24700474","url":null,"abstract":"<p>Anomaly detection is the identification of aberrations in the dataset using statistical methods or machine learning algorithms. It is widely performed using unsupervised learning algorithms because labelling the data manually can be expensive. While unsupervised anomaly detection is sufficient for data cleaning, this is not the case in real-world applications, where accuracy is of the utmost importance. For example, it would be unacceptable to misdiagnose someone as not having breast cancer and not provide them with treatment because our model failed to recognize it as an anomaly. In this paper, we propose an optimized model, IFDRF (Isolation Forest, DBSCAN, and Random Forest), that incorporates feedback (corrections) into the unsupervised detection model. IFDRF is a novel hybrid model combining an unsupervised learning model at the first layer followed by a clustering model at the second layer and a supervised learning model at the end. The proposed model tunes the unsupervised learning model followed by a model fitting with the help of the feedback mechanism. It obviates the need to label the entire dataset and thus increases the scope of anomaly detection applications. We have compared our proposed model to the existing state-of-the-art anomaly detection baseline models to show its efficacy. The proposed model performed significantly (<span><i>P</i>-value &lt; 2.2 × 10<sup>−16</sup></span>) better than the other algorithms, with an AUC score of 0.875.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"385 - 400"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huafeng Chen, A. Krytsky, Shiping Ye, Rykhard Bohush, S. Ablameyko
{"title":"Tracking and Computation of Characteristics of the Movement of People in Groups on Video Using Convolutional Neural Networks","authors":"Huafeng Chen, A. Krytsky, Shiping Ye, Rykhard Bohush, S. Ablameyko","doi":"10.3103/S1060992X24700802","DOIUrl":"10.3103/S1060992X24700802","url":null,"abstract":"<p>This paper proposes an approach for tracking the behavior of people in a group on video by using convolutional neural networks. At the beginning, definitions of group movement of people are given, and features for accompaniment are defined that can be used to analyze people’s behavior. Next, an algorithm is proposed for calculating the distance between people in video, which includes three stages: detection and tracking of objects, coordinate transformation, calculation of the distance between people and detection of distance violations. The results of experimental studies and comparison with known algorithms are presented, which confirms the effectiveness of the algorithm.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"373 - 384"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid Network Model for Cardiac Image Segmentation Using MRI Images","authors":"A. Rasmi","doi":"10.3103/S1060992X24700498","DOIUrl":"10.3103/S1060992X24700498","url":null,"abstract":"<p>Cardiac magnetic resonance imaging (MRI) commonly yields numerous images per scan, and manually delineating structures from these images is a laborious and time-intensive task. The automation of this process is highly desirable as it would enable the generation of crucial clinical measurements like ejection fraction and stroke volume. However, due to variations in scanning settings and patient characteristics, automated segmentation faces several challenges that lead to a high degree of variability in image statistics and quality. Our study presents a neural network approach that utilizes the UNet and ResNet-50 architectures to efficiently segment the left and right ventricles' endocardial and epicardial boundaries. The Dice metric is used as the loss function in our strategy to optimize the trainable parameters in the network. Additionally, we applied a post-processing step to the neural network’s predicted binary image that keeps only the largest connected component of each segmentation label. The proposed method was trained on datasets from the Multi-Vendor & Multi-Disease Cardiac Image Segmentation Challenge. The challenge organizers evaluated the approach on the reserved test set of 160.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"447 - 454"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Abnormal Sound Event Detection Method Based on Time-Spectrum Information Fusion","authors":"Changgeng Yu, Chaowen He, Dashi Lin","doi":"10.3103/S1060992X24700814","DOIUrl":"10.3103/S1060992X24700814","url":null,"abstract":"<p>In this paper, we propose an abnormal sound event detection method based on a Time-Frequency Spectral Information Fusion Neural Network (TFSIFNN), addressing the problem that the time structure and frequency information of sound events in real environments vary widely, resulting in poor performance of abnormal sound event detection. First, we construct a TCN-BiLSTM network based on Temporal Convolutional Networks (TCN) and Bidirectional Long Short-Term Memory (BiLSTM) networks to extract the temporal context information from sound events. Next, we enhance the feature learning capability of the MobileNetV3 network through Efficient Channel Attention (ECA), culminating in the design of an ECA-MobileNetV3 network to capture the spectral information within sound events. Finally, a TFSIFNN model is established based on TCN-BiLSTM and ECA-MobileNetV3 to improve the performance of abnormal sound event detection. The experimental results, conducted on the Urbansound8K and TUT Rare Sound Events 2017 datasets, demonstrate that our TFSIFNN model achieved notable performance improvements. Specifically, it reached an accuracy of 93.93% and an <i>F</i>1<i>-Score</i> of 94.15% on the Urbansound8K dataset. On the TUT Rare Sound Events 2017 dataset, compared to the baseline method, the error rate on the evaluation set decreased by 0.55, and the <i>F</i>1<i>-Score</i> improved by 29.69%.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"411 - 421"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computer Analysis of EPR Spectra of 31P Atom Quantum Pair Embedded in Spinless Isotope 28Si Substrate","authors":"S. N. Dobryakov, V. V. Privezentsev","doi":"10.3103/S1060992X24700826","DOIUrl":"10.3103/S1060992X24700826","url":null,"abstract":"<p>In this paper we use EPR spectra to explore interactions between the elements of a quantum pair <sup>31</sup>P–<sup>31</sup>P embedded in a <sup>28</sup>Si isotope substrate, supposing that several silicon atoms separate the phosphorus isotopes. The EPR method allows us to identify, at a quantum level, the mechanisms of interaction between the phosphorus atoms and to analyze the influence of the silicon substrate on the spin-spin interaction between <sup>31</sup>P atoms in the quantum pairs. We also examine possibilities for controlling these interactions. In the simulations, we take into account scalar and vector exchange interactions as well as a dipole interaction between unpaired electrons of <sup>31</sup>P atoms. We suppose that an indirect dipole-dipole interaction is carried out via a system of conjugated 3<i>d</i>-orbits and by means of a polarization of the medium (the <sup>28</sup>Si isotope substrate). The exchange interaction between the spins (the magnetic moments) of electrons of the two phosphorus atoms is also carried out via the polarized medium. We discuss the simulated EPR spectra obtained.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 4","pages":"422 - 428"},"PeriodicalIF":1.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Korsakov, V. Ivanova, A. Demcheva, R. Eidelman, I. Fomin, A. Bakhshiev
{"title":"Development and Implementation of Neuromorphic Elements of the Information and Control System of a Mobile Robot","authors":"A. Korsakov, V. Ivanova, A. Demcheva, R. Eidelman, I. Fomin, A. Bakhshiev","doi":"10.3103/S1060992X24700784","DOIUrl":"10.3103/S1060992X24700784","url":null,"abstract":"<p>The task of developing and applying neuromorphic elements of an information control system for mobile robots is considered. The description of the compartmental spiking neuron model used in the work and the algorithm of its structural learning is given. The elements of the information control system used in the work are described: a neuromorphic emergency detector, a neuromorphic extrapolator, and a neuromorphic model for the formation of associative connections. Based on these elements, a scheme for the formation of a conditioned reflex reaction with negative reinforcement is proposed. In addition, a scheme is considered that allows a mobile robot to move at a given distance from the wall. The first of these schemes was tested on a real mobile robotics platform. The conclusion is made about the possibility of constructing neuromorphic information control systems from the presented elements and the prospects for the development of this approach.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S504 - S512"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combined Use of Dynamic Inversion and Reinforcement Learning for Motion Control of an Supersonic Transport Aircraft","authors":"Gaurav Dhiman, Yu. V. Tiumentsev, R. A. Tskhai","doi":"10.3103/S1060992X2470067X","DOIUrl":"10.3103/S1060992X2470067X","url":null,"abstract":"<p>The task of aircraft motion control has to be solved under conditions of numerous heterogeneous uncertainties both in the aircraft motion model and in the environment in which the aircraft is flying. These uncertainties arise, in particular, because various kinds of abnormal situations can occur in flight, caused by failures of aircraft equipment and systems or by damage to the airframe and propulsion system. Some of these failures and damages have a direct impact on the dynamic characteristics of the aircraft as a control object. The problem therefore arises of adjusting the aircraft control algorithms so that they can adapt to the changed dynamics of the aircraft. It is extremely difficult, and in some cases impossible, to foresee in advance all possible damages, failures and their combinations. Hence, it is necessary to implement adaptive flight control algorithms that are able to adjust to the changing situation. One of the effective tools for solving such problems is reinforcement learning in the Approximate Dynamic Programming (ADP) variant, in combination with artificial neural networks. In the last decade, a family of methods known as Adaptive Critic Design (ACD) has been actively developed within the ADP approach to control the behavior of complex dynamic systems. In our paper we consider the application of one of the variants of the ACD approach, namely SNAC (Single Network Adaptive Critic), and its development through joint use with the method of dynamic inversion. The effectiveness of this approach is demonstrated on the example of longitudinal motion control of a supersonic transport airplane.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S399 - S413"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143108934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. V. Demidovskij, I. G. Salnikov, A. M. Tugaryov, A. I. Trutnev, I. A. Novikova
{"title":"Comprehensive Weight Decomposition Analysis of Modern Parameter-Efficient Methods","authors":"A. V. Demidovskij, I. G. Salnikov, A. M. Tugaryov, A. I. Trutnev, I. A. Novikova","doi":"10.3103/S1060992X24700796","DOIUrl":"10.3103/S1060992X24700796","url":null,"abstract":"<p>Fine-tuning of Large Language Models is an essential part of modern artificial intelligence systems that solve numerous tasks, such as natural language processing and computer vision. Among the various fine-tuning strategies, the most prominent approach for Large Language Model fine-tuning is Parameter-Efficient Fine-Tuning (PEFT), as it achieves state-of-the-art performance on multiple tasks while minimizing computational resources and training time. Recently, an increasing number of PEFT methodologies have been developed, each asserting superiority based on performance metrics. However, a critical evaluation of how these methods align with the tuning dynamics of full fine-tuning (FT) remains largely unexplored. This study focuses on bridging this gap by analyzing the learning behavior of PEFT approaches such as LoRA, LoRA+, AdaLoRA, DoRA, VeRA, PiSSA, LoKr and LoHa in comparison to FT. This work provides a comprehensive comparative analysis aimed at identifying which PEFT methods diverge significantly in weight-update dynamics from the FT standard. The findings reveal insights into the underlying causes of these discrepancies, offering a deeper understanding of each method’s behavior and efficacy.</p>","PeriodicalId":721,"journal":{"name":"Optical Memory and Neural Networks","volume":"33 3 supplement","pages":"S513 - S522"},"PeriodicalIF":1.0,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143109182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}