Neurocomputing, Pub Date: 2025-02-03, DOI: 10.1016/j.neucom.2025.129531
Hao Fang, Huanyu Liu, Jiazheng Wen, Zhonglin Yang, Junbao Li, Qi Han
{"title":"Automatic visual enhancement of PTZ camera based on reinforcement learning","authors":"Hao Fang , Huanyu Liu , Jiazheng Wen , Zhonglin Yang , Junbao Li , Qi Han","doi":"10.1016/j.neucom.2025.129531","DOIUrl":"10.1016/j.neucom.2025.129531","url":null,"abstract":"<div><div>Video surveillance systems have become indispensable for enhancing border security and effectively addressing threats such as illegal border crossings. Object detection plays a crucial role in real-time monitoring and event response in these systems. However, challenges arise in border scenarios, where targets exhibit small sizes, low resolutions, and limited extractable features, resulting in lower detection confidence. To overcome these limitations, we propose an advanced pan–tilt–zoom camera control method that does not require intrinsic camera parameters. The objective of this study is to accomplish visual enhancement tasks for low-confidence targets. The proposed method employs deep reinforcement learning techniques, integrating both discrete and continuous action spaces to enhance the generalization capability of agent decisions across diverse scenarios, thereby achieving optimal target monitoring. In addition, the introduction of the cutout feature fusion filter enables the agent to focus equally on each target in multitarget scenarios. In our experiments, we compared the proposed method with other approaches. 
The results demonstrate the superiority of the proposed method in various scenarios and object detection algorithms.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129531"},"PeriodicalIF":5.5,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143377895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-03, DOI: 10.1016/j.neucom.2025.129549
Ziyi Wu, Yanduo Zhang, Tao Lu, Kanghui Zhao, Jiaming Wang
{"title":"Contour-texture preservation transformer for face super-resolution","authors":"Ziyi Wu, Yanduo Zhang, Tao Lu, Kanghui Zhao, Jiaming Wang","doi":"10.1016/j.neucom.2025.129549","DOIUrl":"10.1016/j.neucom.2025.129549","url":null,"abstract":"<div><div>In the field of computer vision, face super-resolution (FSR) technology is an important tool for enhancing the performance of basic tasks such as face recognition and video surveillance. However, when faced with complex face images, existing FSR methods often rely on the Transformer model to improve image quality through its powerful global modeling capabilities. Yet, they tend to be slightly insufficient in local feature extraction due to a lack of adequate local detail capture capabilities. To alleviate these problems, we propose a novel contour-texture preservation Transformer (CTP) method for FSR. This method consists of two key components: the multi-scale attention enhancement block (MSAEB), which captures and fuses image features of different scales to improve the detail level of feature representation, and provides high-quality input for the contour-texture Transformer enhancement block (CTTEB). Additionally, CTTEB integrates convolution operations to enhance local feature extraction and improve feature expression. The feed-forward network (FFN) we introduced ensures the full fusion of global and local information. By combining MSAEB and CTTEB into a residual progressive attention group (RPAG), the network gradually extracts and fuses multi-scale features, ultimately achieving dual preservation of contour structure and texture details. Experiments show that our method achieves the best results on LFW, FFHQ, CelebA, and Helen, with a 0.32 dB increase in PSNR, a 0.0126 increase in SSIM, and a 0.0056 increase in FSIM over the second-best model. 
Experiments on real-world datasets SCface and Chokepoint confirm that the CTP method excels in both FSR reconstruction and face recognition, verifying its effectiveness.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129549"},"PeriodicalIF":5.5,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143314261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-03, DOI: 10.1016/j.neucom.2025.129538
Ximin Li, An Yan, Shengqi Zhu, Dengxiu Yu, C.L. Philip Chen
{"title":"Neural-network-based adaptive fixed-time control for stochastic multi-agent systems","authors":"Ximin Li , An Yan , Shengqi Zhu , Dengxiu Yu , C.L. Philip Chen","doi":"10.1016/j.neucom.2025.129538","DOIUrl":"10.1016/j.neucom.2025.129538","url":null,"abstract":"<div><div>This article deals with the fixed-time control design issue for stochastic multi-agent systems (MASs). First of all, a new practical fixed-time stability criterion in probability is proposed. Compared with existing works, the settling time is exclusively determined by design parameters, signifying that it can be calculated with precision. Utilizing this stability criterion, a fixed-time control strategy for stochastic MASs is designed, principally leveraging the backstepping control techniques and the radial basis function neural networks (RBF NNs). Additionally, the singularity problem in the control scheme is avoided by exploiting L’Hôpital’s rule. With the designed control strategy, the stochastic MASs achieve practical fixed-time stability. Furthermore, the tracking errors converge to an adjustable range near zero. The effectiveness of the proposed control strategy is verified by a series of numerical simulations.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"627 ","pages":"Article 129538"},"PeriodicalIF":5.5,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143430115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bilinear Self-Representation for Unsupervised Feature Selection with Structure Learning","authors":"Hossein Nasser Assadi , Faranges Kyanfar , Farid Saberi-Movahed , Abbas Salemi","doi":"10.1016/j.neucom.2025.129557","DOIUrl":"10.1016/j.neucom.2025.129557","url":null,"abstract":"<div><div>Current feature selection approaches based on self-representation focus solely on representing either the sample space or the feature space. Consequently, a common challenge in such methods arises from the limited interaction between the feature and sample spaces, which can potentially result in incomplete discovery of hidden data information. To address this challenge, this paper introduces a novel mixture-level self-representation method called Bilinear Self-representation for Unsupervised Feature Selection with Structure Learning (BSUFSL). The newly proposed method, BSUFSL, distinguishes itself from existing feature selection approaches based on self-representation by utilizing the concept of bilinearity. This allows BSUFSL to capture self-representation in both the feature and sample spaces, integrating information from both domains and establishing an effective interaction between them to identify the most significant features. Meanwhile, BSUFSL incorporates local preservation to maintain geometric structure among features and global preservation to maximize data variance, both of which are essential aspects of structure learning. An efficient iterative optimization algorithm with theoretically guaranteed convergence is also developed to solve the BSUFSL model. Experimental results on eight benchmark datasets demonstrate the superior effectiveness of BSUFSL compared to state-of-the-art methods. Specifically, the numerical results reveal a significant improvement in clustering performance across several datasets. 
Furthermore, results from experiments confirm that BSUFSL enhances the clarity and accuracy of clustering by selecting discriminative features, thanks to its bilinear structure, which integrates feature and sample self-representation into a unified framework.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"625 ","pages":"Article 129557"},"PeriodicalIF":5.5,"publicationDate":"2025-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143152027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-02, DOI: 10.1016/j.neucom.2025.129550
Yiran Pang, Zhen Ni, Xiangnan Zhong
{"title":"A fast federated reinforcement learning approach with phased weight-adjustment technique","authors":"Yiran Pang, Zhen Ni, Xiangnan Zhong","doi":"10.1016/j.neucom.2025.129550","DOIUrl":"10.1016/j.neucom.2025.129550","url":null,"abstract":"<div><div>Federated reinforcement learning (FRL) enables multiple agents to learn collaboratively without directly sharing their local data. This method addresses data privacy concerns in distributed systems. However, FRL faces challenges such as high communication costs, since it requires extensive interactions to achieve satisfactory performance. Therefore, this paper develops a fast FRL method with a dynamic aggregation coefficient to reduce the communication load during the learning process. Diverging from traditional FRL techniques, which rely on static averaging, our approach begins by setting the initial aggregation coefficient to the logarithm of the number of participating agents. This elevated coefficient can enhance the early integration of updates from distributed agents and facilitate a rapid initial learning phase. As communications progress, the aggregation coefficient linearly decreases, transitioning to an average aggregation by the end of the specified interval. This gradual reduction aligns individual learning updates more closely over time, shifting towards a unified global learning model. Furthermore, we implement a value-clipping strategy to constrain global updates within a predefined safe range, thus safeguarding against potential overflow issues. The aggregation coefficient stabilizes after the initial aggressive integration phase to ensure training stability. The boundedness analysis of the model aggregation confirms that, despite the high initial coefficient, the parameters of the global model remain within manageable limits on the FRL server. This strategy is applicable to both tabular and deep learning methods. 
We validate the designed algorithm on navigation and control tasks, including heterogeneous environments where distinct state transitions and dynamics are designed for each agent. The experimental results demonstrate that our proposed approach achieves faster convergence across various environments.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129550"},"PeriodicalIF":5.5,"publicationDate":"2025-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143355502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
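The phased weight adjustment described in the abstract above can be illustrated with a minimal sketch. It is one reading of the abstract, not the authors' implementation. The function names, the natural logarithm, the choice of 1.0 as the coefficient that corresponds to plain averaging, and the symmetric clipping bound are all assumptions made for illustration. The sketch also assumes at least three agents, so that the starting coefficient exceeds 1.0 and the schedule really decreases.

```python
import math


def aggregation_coefficient(round_idx, num_agents, decay_rounds):
    """Hypothetical schedule: start at ln(num_agents), decay linearly to 1.0.

    A coefficient of 1.0 is taken here to mean plain averaging (an assumption).
    After decay_rounds communication rounds the coefficient stays at 1.0.
    """
    start = math.log(num_agents)
    if round_idx >= decay_rounds:
        return 1.0
    frac = round_idx / decay_rounds
    return start + (1.0 - start) * frac


def aggregate(global_params, local_params, coeff, clip):
    """Move the global parameters by coeff times the mean local update,
    then clip every entry to [-clip, clip] (assumed form of value clipping)."""
    n = len(local_params)
    updated = []
    for j, g in enumerate(global_params):
        mean_delta = sum(p[j] - g for p in local_params) / n
        updated.append(min(clip, max(-clip, g + coeff * mean_delta)))
    return updated


if __name__ == "__main__":
    global_params = [0.0, 0.0]
    local_params = [[1.0, -1.0], [3.0, -3.0], [2.0, -2.0]]
    for r in range(4):
        c = aggregation_coefficient(r, num_agents=len(local_params), decay_rounds=3)
        print(r, round(c, 4), aggregate(global_params, local_params, c, clip=5.0))
```

With a coefficient of 1.0 the update reduces to the ordinary mean of the local parameters, which is how the sketch models the transition to average aggregation.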
Neurocomputing, Pub Date: 2025-02-01, DOI: 10.1016/j.neucom.2025.129593
Shoucheng Yan, Yang Chen, Wenfei Cao, Huibin Li
{"title":"Enhancing U-Net with low-rank attention skip block for 3D point cloud segmentation","authors":"Shoucheng Yan , Yang Chen , Wenfei Cao , Huibin Li","doi":"10.1016/j.neucom.2025.129593","DOIUrl":"10.1016/j.neucom.2025.129593","url":null,"abstract":"<div><div>The U-Net framework has been widely applied in the field of 3D point cloud segmentation. Most existing methods primarily focus on employing more powerful encoders to enhance the understanding of point cloud data. However, some inherent operations in traditional encoders, such as down-sampling and local neighborhood feature aggregation, restrict the effective receptive field of the model to a relatively small region. To address this issue, we introduce a global Attention Skip Block (ASBlock) to improve the skip connection in the U-Net framework, facilitating feature fusion between each point and the global context. Moreover, inspired by the dense-connection idea exploited by some well-known works such as DenseNet and 3D DenseNet, we further extend the proposed ASBlock to a new version with dense connection that can integrate more global contextual information for better segmentation performance. Additionally, to reduce computational complexity, we compress the global attention model using approximate low-rank matrix decomposition, leading to the development of the Low-rank Attention Skip Block (LrASBlock). This module can be efficiently applied to large-scale datasets and seamlessly integrated into existing U-Net segmentation networks as a universal plug-and-play tool. Finally, extensive experimental results on multiple datasets demonstrate that the integration of LrASBlock can significantly improve segmentation performance of several typical U-Net-based methods. 
Code is available at https://github.com/Ysc156/LrASBlock.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129593"},"PeriodicalIF":5.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143314461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-01, DOI: 10.1016/j.neucom.2025.129542
Tingming Bai, Zhiyu Xiang, Xijun Zhao, Peng Xu, Tianyu Pu, Jingyun Fu
{"title":"LiDAR semantic segmentation with local consistency constrained KPConv LSTM","authors":"Tingming Bai , Zhiyu Xiang , Xijun Zhao , Peng Xu , Tianyu Pu , Jingyun Fu","doi":"10.1016/j.neucom.2025.129542","DOIUrl":"10.1016/j.neucom.2025.129542","url":null,"abstract":"<div><div>As a fundamental task for autonomous driving, LiDAR point cloud semantic segmentation has been intensively studied in recent years. Despite the great progress, achieving satisfactory semantic segmentation is still very challenging due to the sparsity of LiDAR points and the shape diversity of the classes in the open world. In this paper, we propose a local consistency constrained KPConv-LSTM module to benefit the existing methods. It enhances point-based LSTM with several KPConvs to strengthen and align the previous hidden features, thus improving the temporal feature propagation. A temporal weighting block is designed within the module to further reduce the error caused by the misalignment of moving objects. In addition, a special local consistency loss is proposed to encourage the local smoothness of the features, thereby providing more consistent features for temporal propagation in LSTM. We apply our method to various existing LiDAR semantic segmentation models. 
The experimental results on multiple datasets show that our method can produce notable improvements on all of them, validating the effectiveness and generality of the method.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129542"},"PeriodicalIF":5.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143348775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-01, DOI: 10.1016/j.neucom.2025.129607
Xiaoying Mao, Ye Tian, Tairan Jin, Bo Di
{"title":"Enhancing music audio signal recognition through CNN-BiLSTM fusion with De-noising autoencoder for improved performance","authors":"Xiaoying Mao , Ye Tian , Tairan Jin , Bo Di","doi":"10.1016/j.neucom.2025.129607","DOIUrl":"10.1016/j.neucom.2025.129607","url":null,"abstract":"<div><div>This study presents an advanced framework for music audio signal recognition that combines Convolutional Neural Networks (CNNs), Bidirectional Long Short-Term Memory (BiLSTM) networks, and Noise Reduction Auto-encoder models to significantly improve accuracy and robustness. The core innovation is a novel noise reduction auto-encoder that integrates CNN and BiLSTM architectures, enabling superior recognition performance under varying noise levels and environmental conditions. The proposed framework, validated on several datasets including the Zhvoice, Common Voice, and LibriSpeech, demonstrates higher accuracy compared to existing methods. In addition, an optimized CNN architecture called Faster Region-based CNN with Multi-scale Information (FRCNN-MSI) is developed for efficient speech feature extraction, which shows significant improvements in noisy environments. The BiLSTM model is further enhanced with an attention mechanism that improves sequence modeling and contextual relationship capture. 
Together, these advances establish our approach as a robust solution to real-world speech recognition challenges, with potential implications for improving speech recognition systems in diverse applications.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"625 ","pages":"Article 129607"},"PeriodicalIF":5.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143210691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-01, DOI: 10.1016/j.neucom.2025.129583
Manuel Jesús Jiménez-Navarro, Jose Miguel Riquelme-Dominguez, Manuel Carranza-García, Francisco M. González-Longatt
{"title":"A real-time machine learning-based methodology for short-term frequency nadir prediction in low-inertia power systems","authors":"Manuel Jesús Jiménez-Navarro , Jose Miguel Riquelme-Dominguez , Manuel Carranza-García , Francisco M. González-Longatt","doi":"10.1016/j.neucom.2025.129583","DOIUrl":"10.1016/j.neucom.2025.129583","url":null,"abstract":"<div><div>In the modern era, electricity is vital for societal advancement, driving economic growth and essential functions. However, the landscape of power systems is swiftly changing due to the integration of renewable energy sources and the decline of traditional synchronous generation, which reduces the total rotational inertia of the systems. This reduction in inertia leads to more frequent and severe frequency deviations, directly impacting power system behavior. Therefore, there is a pressing need to anticipate frequency grid disturbances to maintain stability and prevent disruptions. A machine learning approach is proposed to address this issue, providing accurate and responsive frequency forecasting in power systems. This paper introduces a novel methodology that leverages machine learning for short-term minimum frequency prediction, emphasizing efficiency and rapid response. A comprehensive experimentation process was conducted using several popular machine learning models, with their hyperparameters optimized through a Bayesian algorithm and evaluated via cross-validation. Results highlight the effectiveness of Decision Trees, offering a balance between efficiency and efficacy. 
Validation was conducted using the SCADA of a Typhoon HIL real-time simulator, verifying that the proposed methodology is suitable for real-time applications.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129583"},"PeriodicalIF":5.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143348777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing, Pub Date: 2025-02-01, DOI: 10.1016/j.neucom.2025.129539
Chenyun Yu, Junfeng Zhao, Xuan Wu, Yingle Luo, Yan Xiao
{"title":"TPGRec: Text-enhanced and popularity-smoothing graph collaborative filtering for long-tail item recommendation","authors":"Chenyun Yu, Junfeng Zhao, Xuan Wu, Yingle Luo, Yan Xiao","doi":"10.1016/j.neucom.2025.129539","DOIUrl":"10.1016/j.neucom.2025.129539","url":null,"abstract":"<div><div>GNN-based graph collaborative filtering methods have shown significant potential in recommendation systems, but they are often challenged by the long-tail effect due to exposure bias. While existing methods utilize techniques like contrastive learning, data augmentation and resampling as countermeasures, their reliance on ID-based embeddings can result in less informative representations and limit the model’s grasp of intricate neighbor relationships. Recent studies have attempted to improve overall recommendation performance by incorporating text information for items, but they usually rely on extra graph structures or complex calculations, which increases computational costs, and they give inadequate consideration to long-tail items. In this paper, we propose TPGRec, a novel graph collaborative filtering method that combines the text enhancement and popularity smoothing perspectives and simultaneously improves both overall and long-tail recommendation performance. Initially, we introduce a balancing mechanism applied to the graph structure and training set to reduce the influence of popular items. Building on this, a structural-level contrastive learning technique is proposed for graph representation learning, which captures complex structural relationships without introducing excessive noise to node representations. Furthermore, we develop a semantic-level contrastive learning strategy that effectively and economically integrates ID embeddings with textual data, establishing implicit semantic relationships and deepening the model’s understanding of items. 
Finally, we develop a popularity-balanced BPR optimization module to facilitate fair recommendation opportunities for items of varying popularity and promote the model’s discriminative power over hard negative samples. Comprehensive experiments on four real-world datasets have demonstrated the superiority of TPGRec compared with the state-of-the-art baselines. Our code and datasets are available on GitHub: https://github.com/ycy89/MyTPGRec.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"626 ","pages":"Article 129539"},"PeriodicalIF":5.5,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143314264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}