Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128954
Lina Xia , Qing Li , Ruizhuo Song
{"title":"Fully distributed adaptive optimization event-triggered/self-triggered synchronization for multi-agent systems","authors":"Lina Xia , Qing Li , Ruizhuo Song","doi":"10.1016/j.neucom.2024.128954","DOIUrl":"10.1016/j.neucom.2024.128954","url":null,"abstract":"<div><div>This paper investigates the distributed adaptive optimization synchronization problem of multi-agent systems (MASs) with general linear dynamics on undirected graphs. The goal is to achieve synchronization among agents and cooperatively optimize the team cost function formed by a family of local convex functions. The tracking servo signal is first generated by sampling the implicit state of the auxiliary system, and its sampling events are governed by triggering mechanism I. Meanwhile, the disagreement vector is sampled if triggering mechanism II is violated. An adaptive event-triggered scheme is then constructed from the gradient term, whose input is the tracking servo signal, and the relative sampling information among followers; this scheme achieves the desired synchronization and minimizes the team cost function. It is proved that Zeno behavior is excluded under triggering mechanisms I and II, respectively. Moreover, a self-triggered strategy is leveraged that depends only on the partial derivative of the local cost function at the sampled implicit state and the relative sampling information of the agent itself and its neighbors; thus, continuous monitoring of neighbors' information is avoided. It is noted that the proposed scheme incorporates adaptive event-triggered control, which makes fully distributed control possible. 
The efficacy and advantage of the presented theoretical results are finally demonstrated using a non-trivial simulation example.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128954"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
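The event-triggered idea in the abstract above can be illustrated with a much simpler setting than the paper's. The following is a minimal sketch, assuming single-integrator agents, a fixed (non-adaptive) triggering threshold, and no cost-function optimization; the function name `simulate_event_triggered_consensus` and all parameters are hypothetical and are not taken from the paper. Each agent rebroadcasts its state only when the gap between its true state and its last broadcast value exceeds the threshold, and the control law uses broadcast values only.

```python
def simulate_event_triggered_consensus(x0, edges, threshold=0.01, dt=0.01, steps=2000):
    """Euler simulation of event-triggered consensus on an undirected graph.

    Assumptions (not from the paper): scalar single-integrator agents,
    a static triggering threshold, and unit edge weights.
    Returns the final states and the total number of triggering events.
    """
    n = len(x0)
    x = list(x0)        # true agent states
    x_hat = list(x0)    # last broadcast (sampled) states
    neighbors = [[] for _ in range(n)]
    for i, j in edges:  # undirected graph: store both directions
        neighbors[i].append(j)
        neighbors[j].append(i)

    events = 0
    for _ in range(steps):
        # Triggering rule: resample when the measurement error is too large.
        for i in range(n):
            if abs(x_hat[i] - x[i]) > threshold:
                x_hat[i] = x[i]
                events += 1
        # The control input uses only sampled neighbor information.
        u = [-sum(x_hat[i] - x_hat[j] for j in neighbors[i]) for i in range(n)]
        x = [x[i] + dt * u[i] for i in range(n)]
    return x, events


if __name__ == "__main__":
    final, events = simulate_event_triggered_consensus(
        [1.0, 0.0, -1.0], [(0, 1), (1, 2)]
    )
    print("final states:", final)
    print("triggering events:", events, "of", 3 * 2000, "possible samples")
```

With a fixed threshold the agents converge only to a neighborhood of consensus. Adaptive or state-dependent thresholds, as the abstract describes, are the usual way to tighten this.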
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128955
Lei Chen , Qingbo Xiong , Wei Zhang , Xiaoli Liang , Zhihua Gan , Liqiang Li , Xin He
{"title":"Multi-modal degradation feature learning for unified image restoration based on contrastive learning","authors":"Lei Chen , Qingbo Xiong , Wei Zhang , Xiaoli Liang , Zhihua Gan , Liqiang Li , Xin He","doi":"10.1016/j.neucom.2024.128955","DOIUrl":"10.1016/j.neucom.2024.128955","url":null,"abstract":"<div><div>In this paper, we address the unified image restoration challenge by reframing it as a contrastive learning-based classification problem. Despite the significant strides made by deep learning methods in enhancing image restoration quality, their limited capacity to generalize across diverse degradation types and intensities necessitates the training of separate models for each specific degradation scenario. We propose an all-encompassing approach that can restore images from various unknown corruption types and levels. We devise a method that learns representations of the latent sharp image’s degradation and accompanying textual features (such as dataset categories and image content descriptions), converting these into prompts which are then embedded within a reconstruction network model to enhance cross-database restoration performance. This culminates in a unified image reconstruction framework. The study involves two stages. In the first stage, we design a MultiContentNet that learns multi-modal features (MMFs) of the latent sharp image. This network encodes the visual degradation expressions and contextual text features into latent variables, thereby exerting a guided classification effect. Specifically, MultiContentNet is trained as an auxiliary controller that takes the degraded input image and, through contrastive learning, extracts MMFs of the latent target image. This effectively generates natural classifiers tailored for different degradation types. The second stage integrates the learned MMFs into an image restoration network via cross-attention mechanisms. This guides the restoration model to learn high-fidelity image recovery. 
Experiments conducted on six blind image restoration tasks demonstrate that the proposed method achieves state-of-the-art performance, highlighting the potential significance of large-scale pretrained vision-language models’ MMFs in advancing high-quality unified image reconstruction.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128955"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based context learning network for infrared small target detection","authors":"Yiwei Shen , Qingwu Li , Chang Xu , Chenkai Chang , Qiyun Yin","doi":"10.1016/j.neucom.2024.128949","DOIUrl":"10.1016/j.neucom.2024.128949","url":null,"abstract":"<div><div>Convolutional neural networks (CNNs) have shown remarkable performance in the field of infrared small target detection. However, due to the limitation of the local receptive field, existing methods find it challenging to effectively model the contextual information associated with small targets. In this paper, we propose a Graph-based Context Learning Network (GCLNet) that addresses this issue by integrating global graph reasoning with local feature learning to perceive context information across multiple scales. Specifically, local feature learning blocks are embedded into the encoder to extract detailed textures that are crucial for small target detection. At deep layers, the multiple graph reasoning module leverages a multi-graph interaction structure to promote significant information transfer, allowing for the optimization of global context learning. Moreover, the patch-based graph reasoning module divides the low-level features into multiple patches where the context information is explored to capture the saliency of small targets. The experimental results demonstrate that the proposed method outperforms the state-of-the-art methods, achieving an intersection over union (IoU) of 80.26% and 94.84% on the NUAA-SIRST and NUDT-SIRST datasets, respectively. 
Our code will be available at https://github.com/studymonster0/GCLNet.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128949"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
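The intersection over union (IoU) metric reported in the abstract above is the ratio of overlapping to combined foreground pixels between a predicted mask and a ground-truth mask. The sketch below shows the standard definition on binary masks stored as nested lists; the function name `mask_iou` and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper or its repository.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1).

    Assumption: when both masks are empty the IoU is defined as 1.0;
    other evaluation scripts may choose a different convention.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1   # pixel is foreground in both masks
            if p or t:
                union += 1          # pixel is foreground in at least one mask
    return 1.0 if union == 0 else intersection / union


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    target = [[1, 0], [1, 0]]
    print(mask_iou(pred, target))  # one shared pixel out of three foreground pixels
```

For small-target detection, a shift of one or two pixels can change this ratio substantially because the union itself contains only a handful of pixels.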
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128950
Yudong Li, Yunlin Lei, Xu Yang
{"title":"Rethinking residual connection in training large-scale spiking neural networks","authors":"Yudong Li, Yunlin Lei, Xu Yang","doi":"10.1016/j.neucom.2024.128950","DOIUrl":"10.1016/j.neucom.2024.128950","url":null,"abstract":"<div><div>Spiking Neural Network (SNN) is known as the most famous brain-inspired model, but the non-differentiable spiking mechanism makes it hard to train large-scale SNNs. To facilitate the training of large-scale SNNs, many training methods are borrowed from Artificial Neural Networks (ANNs), among which deep residual learning is the most commonly used. However, the unique features of SNNs mean that prior intuition built upon ANNs does not carry over to SNNs. Although a few studies have made pioneering attempts on the topology of Spiking ResNet, the advantages of different connections remain unclear. To tackle this issue, we analyze the merits and limitations of various residual connections and empirically demonstrate our ideas with extensive experiments. Then, based on our observations, we abstract the best-performing connections into the densely additive (DA) connection, extend this concept to other topologies, and propose four architectures for training large-scale SNNs, termed DANet, which bring up to 13.24% accuracy gain on ImageNet. Besides, in order to present a detailed methodology for designing the topology of large-scale SNNs, we further conduct in-depth discussions on their applicable scenarios in terms of their performance on various scales of datasets and demonstrate their advantages over prior architectures. At a low training expense, our best-performing ResNet-50/101/152 obtain 73.71%/76.13%/77.22% top-1 accuracy on ImageNet with 4 time steps. 
We believe that this work shall give more insights for future works to design the topology of their networks and promote the development of large-scale SNNs. The code will be publicly available.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128950"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human key point detection method based on enhanced receptive field and transformer","authors":"Hongyu Liang , Jianfeng Yang , Wenjuan Xie , Jinsheng Xiao","doi":"10.1016/j.neucom.2024.128894","DOIUrl":"10.1016/j.neucom.2024.128894","url":null,"abstract":"<div><div>Existing key point detection network models have complex structures, so they are difficult to deploy on edge devices. Meanwhile, convolution is local and limited by the size of the convolution kernels, so it cannot effectively capture long-range dependencies. To address these problems, this paper introduces a lightweight Convolutional Transformer network (LHFormer Net) for human pose estimation. Considering that the sampling area of convolution kernels for different dimensional feature maps is fixed and the contextual information is singular, the enhanced receptive field block is designed to extract richer feature information and reduce information loss in feature maps. Based on the global modeling features of the Transformer encoder, convolutional position encoding and multi-head self-attention are used to capture the spatial constraint relationship between key points in the deep feature extraction. Finally, a lightweight deconvolution module is used to generate higher resolution features to achieve multi-resolution supervision, which can effectively solve the problem of scale variation in pose estimation, more accurately locate key points of small and medium-sized people, and further improve the detection accuracy of the network. 
Compared with other networks, the experimental results on the open access datasets COCO2017 and MPII show that the proposed network achieves a good balance between model complexity and detection performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128894"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128965
Giang H. Le , Anh Q. Nguyen , Byeongkeun Kang , Yeejin Lee
{"title":"Content-aware preserving image generation","authors":"Giang H. Le , Anh Q. Nguyen , Byeongkeun Kang , Yeejin Lee","doi":"10.1016/j.neucom.2024.128965","DOIUrl":"10.1016/j.neucom.2024.128965","url":null,"abstract":"<div><div>Remarkable progress has been achieved in image generation with the introduction of generative models. However, precisely controlling the content in generated images remains a challenging task due to their fundamental training objective. This paper addresses this challenge by proposing a novel image generation framework explicitly designed to incorporate desired content in output images. The framework utilizes advanced encoding techniques, integrating subnetworks called content fusion and frequency encoding modules. The frequency encoding module first captures features and structures of reference images by exclusively focusing on selected frequency components. Subsequently, the content fusion module generates a content-guiding vector that encapsulates desired content features. During the image generation process, content-guiding vectors from real images are fused with projected noise vectors. This ensures the production of generated images that not only maintain consistent content from guiding images but also exhibit diverse stylistic variations. 
To validate the effectiveness of the proposed framework in preserving content attributes, extensive experiments are conducted on widely used benchmark datasets, including Flickr-Faces-High Quality, Animal Faces High Quality, and Large-scale Scene Understanding datasets.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128965"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128974
Kai Shang , Mingwen Shao , Chao Wang , Yuanjian Qiao , Yecong Wan
{"title":"Training-free prior guided diffusion model for zero-reference low-light image enhancement","authors":"Kai Shang , Mingwen Shao , Chao Wang , Yuanjian Qiao , Yecong Wan","doi":"10.1016/j.neucom.2024.128974","DOIUrl":"10.1016/j.neucom.2024.128974","url":null,"abstract":"<div><div>Images captured under poor illumination not only struggle to provide satisfactory visual information but also adversely affect high-level visual tasks. Therefore, we delve into low-light image enhancement. We mainly focus on two practical challenges: (1) previous methods predominantly require supervised training with paired data and tend to learn mappings specific to the training data, which limits their generalization ability on unseen images; (2) existing unsupervised methods usually yield sub-optimal image quality due to the insufficient utilization of image priors. To address these challenges, we propose a training-free Prior Guided Diffusion model, namely <strong>PGDiff</strong>, for zero-reference low-light image enhancement. Specifically, to leverage the implicit information within the degraded image, we propose a frequency-guided mechanism that obtains low-frequency features through the bright channel prior, which is combined with the generative prior of the pre-trained diffusion model to recover high-frequency details. To improve the quality of generated images, we further introduce gradient guidance based on image exposure and color priors. Benefiting from this dual-guided mechanism, PGDiff can produce high-quality restoration results without requiring tedious training or paired reference images. 
Extensive experiments on paired and unpaired datasets show that our training-free method achieves competitive performance against existing learning-based methods, surpassing the state-of-the-art method QuadPrior by 0.25 dB in PSNR on the LOL dataset.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128974"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
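The bright channel prior mentioned in the abstract above is commonly defined as the maximum intensity over all color channels within a local patch around each pixel. The sketch below implements that common definition on a small image stored as nested lists of RGB tuples; the function name `bright_channel`, the square patch shape, and the `radius` parameter are assumptions for illustration, and how PGDiff actually uses the prior inside the diffusion process is not shown here.

```python
def bright_channel(image, radius=1):
    """Bright channel of an RGB image.

    image: H x W nested list of (r, g, b) tuples with values in [0, 1].
    radius: half-width of the square patch (hypothetical parameter).
    Returns an H x W nested list holding, for each pixel, the maximum
    channel value found in the patch centered on that pixel.
    """
    height = len(image)
    width = len(image[0])
    # Step 1: per-pixel maximum across the color channels.
    pixel_max = [[max(px) for px in row] for row in image]
    # Step 2: maximum of those values over the local patch (clipped at borders).
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            result[y][x] = max(
                pixel_max[j][i] for j in range(y0, y1) for i in range(x0, x1)
            )
    return result


if __name__ == "__main__":
    dark = (0.1, 0.1, 0.1)
    img = [[dark, dark, dark],
           [dark, (0.2, 0.9, 0.3), dark],
           [dark, dark, dark]]
    for row in bright_channel(img, radius=1):
        print(row)
```

A larger `radius` yields a smoother, lower-frequency map, which is consistent with the abstract's use of the prior to obtain low-frequency features.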
{"title":"10-minute forest early wildfire detection: Fusing multi-type and multi-source information via recursive transformer","authors":"Qiang Zhang , Jian Zhu , Yushuai Dong , Enyu Zhao , Meiping Song , Qiangqiang Yuan","doi":"10.1016/j.neucom.2024.128963","DOIUrl":"10.1016/j.neucom.2024.128963","url":null,"abstract":"<div><div>Forest wildfires have great impacts on both nature and human society. Besides disrupting ecosystems, wildfires lead to significant economic loss and pose a threat to local communities. To detect forest wildfires, remote sensing technology has become an essential and powerful tool. Compared with polar-orbiting satellites, the new generation of geostationary satellites provides higher temporal resolution and faster response capability. In this study, we utilize the near real-time data of the Himawari-8/9 satellites to achieve 10-min forest early wildfire detection. A recursive transformer model is proposed in this work. It fuses multi-type and multi-source information for the Himawari-8/9 satellites. By leveraging the spectral, temporal and spatial features of fire pixels and considering land cover information, the proposed method reduces interference factors such as cloud and terrain, resulting in minute-level, near real-time detection of forest wildfires. In 21 ground truth forest wildfire scenarios and a MODIS-based cross-validation dataset, the proposed method achieves better results than the JAXA wildfire product in terms of overall fire detection accuracy, early fire detection rate, omission rate, and real-time performance. 
Furthermore, the proposed framework effectively lowers the emergency response time for early forest wildfire detection, thereby reducing the loss caused by forest wildfire.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128963"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128948
Jialong Qian, Shiqi Zhang, Yuzhuang Pian, Xinyi Chen, Yonghong Liu
{"title":"Spatiotemporal subspace variational autoencoder with repair mechanism for traffic data imputation","authors":"Jialong Qian, Shiqi Zhang, Yuzhuang Pian, Xinyi Chen, Yonghong Liu","doi":"10.1016/j.neucom.2024.128948","DOIUrl":"10.1016/j.neucom.2024.128948","url":null,"abstract":"<div><div>High-quality spatial–temporal traffic data is crucial for the functioning of modern smart transportation systems. However, the collection and storage of traffic data in real-world scenarios are often hindered by many factors, causing data loss that greatly affects decision-making. Different modes of data absence result in varying degrees of information loss, which introduces considerable challenges to the precise imputation of traffic data. Many existing studies concentrate on two main aspects: the examination of data distribution and the extraction of spatiotemporal relationships. On the one hand, methods that focus on distribution fitting do not require a large volume of observational data but often fail to capture spatial–temporal relationships, leading to overly smooth results. On the other hand, methods that aim to identify spatial–temporal relationships, while offering higher accuracy in fitting, demand a substantial amount of high-quality historical data. Taking into account the merits and demerits of both paradigms, we developed a novel unsupervised two-stage model that simultaneously takes into account the spatiotemporal distribution and relationships, termed Spatiotemporal Subspace Variational Autoencoder with Repair Mechanism (SVAE-R). In stage one, we introduced the concept of spatiotemporal subspace, which not only mitigates the noise impact caused by data sparsity but also reduces the cost for the model to find the distribution. In stage two, we designed a simple repair structure to capture spatial–temporal relationships among data through a graph convolution network (GCN) and gated recurrent units (GRU), revising the details of the data. 
We have evaluated our model on two authentic datasets, and it has exhibited a high degree of robustness, maintaining effective performance even under extreme data loss conditions.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128948"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-11-22 DOI: 10.1016/j.neucom.2024.128971
Arwa Amaira , Hend Koubaa , Faouzi Zarai
{"title":"DRL for handover in 6G-vehicular networks: A survey","authors":"Arwa Amaira , Hend Koubaa , Faouzi Zarai","doi":"10.1016/j.neucom.2024.128971","DOIUrl":"10.1016/j.neucom.2024.128971","url":null,"abstract":"<div><div>3GPP is working on technology improvements related to sixth-generation (6G) wireless communication networks to keep pace. 6G networks are being developed as the next phase forward. Compared to their predecessor wireless technologies, 6G networks are predicted to offer better coverage and flexibility by supporting higher throughput, faster velocities, lower latency, and higher capacity. 6G targets the health, education, industry, and transport sectors. The transport field is undergoing rapid change. 6G and artificial intelligence (AI) will have an essential impact in this area, bringing users new services and functionalities. Within this field, the handover (HO) mechanism remains a concern that researchers must consider for achieving excellent communication quality since HO technology is crucial for ensuring seamless connectivity during user transfers between cells. Numerous proposed Machine Learning (ML) approaches, including Deep Reinforcement Learning (DRL), were discussed to solve HO issues. Recently, DRL methods have garnered significant interest in prospective wireless networks. They can surmount the escalating obstacles of the wireless environment and the constraints of conventional approaches. Moreover, DRL is crucial in wireless networks because of its capability to overcome the specific threats and dynamic nature of wireless settings. Elaborating on a comprehensive survey of these approaches related to HO and DRL can provide a unified analysis of the current advancements. This overview will help to improve understanding of this topic. This survey provides an overview of requirements and usage scenarios for 6G. It highlights the impact of this new wireless technology on the transportation field or the Vehicle-to-everything (V2X). 
In addition, we provide a deep study of HO management in 6G networks and elaborate on the various DRL literature solutions for HO in mobile and vehicular networks.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128971"},"PeriodicalIF":5.5,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142745434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}