Neurocomputing | Pub Date: 2024-11-20 | DOI: 10.1016/j.neucom.2024.128966
Yangchuan Wang, Lianhong Ding, Peng Shi, Juntao Li, Ruiping Yuan
{"title":"Improving generalization performance of adaptive gradient method via bounded step sizes","authors":"Yangchuan Wang , Lianhong Ding , Peng Shi , Juntao Li , Ruiping Yuan","doi":"10.1016/j.neucom.2024.128966","DOIUrl":"10.1016/j.neucom.2024.128966","url":null,"abstract":"<div><div>While adaptive gradient methods such as Adam have been widely used in the training of deep neural networks, a recent study has provided a synthetic function that shows the non-convergence problem of Adam. This issue stems from the existence of extreme gradients and the mismatch between the first and second moments. Several adaptive optimizers have since been developed. However, designing a fast optimizer with excellent generalization capability is still challenging. We propose an adaptive method with bounded step sizes, named AdaBS, which removes extreme step sizes and appropriately adjusts adaptive step sizes to mitigate the over-adaptation of step sizes in Adam. In particular, AdaBS effectively clips step sizes that are too large or too small by using two static bounds with a predetermined boundary to control updates. When determining the step size, static bound clipping is used if the preconditioner is outside the modest boundary, and vanilla Adam is used if the preconditioner is inside the boundary. AdaBS establishes a trust region around the basic step size and obtains the benefits of both Adam and SGD, i.e., fast convergence and better generalization. Finally, we conduct extensive experiments on a variety of practical tasks with benchmark datasets, including image classification and language modeling tasks. 
Empirical results demonstrate AdaBS’s promising performance with remarkably fast convergence, superior generalization, and robustness.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"617 ","pages":"Article 128966"},"PeriodicalIF":5.5,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
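The clipping rule described in the AdaBS abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bounded_adam_step`, the bound values `lower` and `upper`, and the choice to clamp the per-coordinate step size `lr / (sqrt(v_hat) + eps)` are all assumptions made for illustration. Inside the bounds the update is the vanilla Adam update; outside them the step size is replaced by the nearest static bound.

```python
import math

def bounded_adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                      eps=1e-8, lower=1e-4, upper=1e-2):
    """One Adam-like update with per-coordinate step sizes clamped to [lower, upper].

    `lower` and `upper` are hypothetical static bounds, not values from the paper.
    `t` is the 1-based iteration counter used for bias correction.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square (as in Adam).
        m_i = beta1 * m_i + (1 - beta1) * g
        v_i = beta2 * v_i + (1 - beta2) * g * g
        # Bias-corrected moment estimates.
        m_hat = m_i / (1 - beta1 ** t)
        v_hat = v_i / (1 - beta2 ** t)
        # Vanilla Adam step size for this coordinate.
        step = lr / (math.sqrt(v_hat) + eps)
        # Static clipping: unchanged inside the bounds, clamped outside them.
        step = min(max(step, lower), upper)
        new_params.append(p - step * m_hat)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v

# Three coordinates with a tiny, a moderate, and a huge gradient.
params, m, v = bounded_adam_step([0.0, 0.0, 0.0], [1e-6, 1.0, 100.0],
                                 [0.0] * 3, [0.0] * 3, t=1)
print(params)
```

In this toy call the tiny gradient would give a very large Adam step size, so it is clamped to `upper`; the huge gradient would give a very small one, so it is clamped to `lower`; the moderate gradient falls inside the bounds and is updated exactly as Adam would.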
{"title":"Dynamic synchronous graph transformer network for region-level air-quality forecasting","authors":"Hanzhong Xia , Xiaoxia Chen , Binjie Chen , Yue Hu","doi":"10.1016/j.neucom.2024.128924","DOIUrl":"10.1016/j.neucom.2024.128924","url":null,"abstract":"<div><div>Accurate forecasting of air quality aids in mitigating air pollution, enhancing the well-being of residents, and supporting the city’s sustainable growth. Recent works have used graph neural networks for spatial dependency modeling in air-quality forecasting. However, many existing methods rely on separate components to individually capture temporal and spatial correlations, which makes it difficult to synchronously capture the multiscale spatiotemporal correlations (MSTCs) from the spatiotemporal graph. This paper proposes a dynamic synchronous graph transformer (DSGT) based on the Encoder-Decoder structure to forecast the air quality of urban regions. It captures time-varying observed station readings through dynamic graph convolution operations and can learn the influence of auxiliary features. We design a multiscale dynamic synchronous graph construction method whose graphs effectively encode the MSTCs. DSGT includes a multiscale spatiotemporal synchronous graph convolution component that extracts multiscale spatiotemporal representations from the constructed graphs. A synchronous graph attention mechanism and a temporal attention mechanism are integrated into the Encoder-Decoder structure to focus on the long-term influence of auxiliary features and the short-term influence of the multiscale spatiotemporal representation. 
Via extensive experiments on two real-world datasets, it is demonstrated that the proposed model outperforms existing methods in both short- and long-term forecasting.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128924"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-training: A survey","authors":"Massih-Reza Amini , Vasilii Feofanov , Loïc Pauletto , Liès Hadjadj , Émilie Devijver , Yury Maximov","doi":"10.1016/j.neucom.2024.128904","DOIUrl":"10.1016/j.neucom.2024.128904","url":null,"abstract":"<div><div>Self-training methods have gained significant attention in recent years due to their effectiveness in leveraging small labeled datasets and large unlabeled observations for prediction tasks. These models identify decision boundaries in low-density regions without additional assumptions about data distribution, using the confidence scores of a learned classifier. The core principle of self-training involves iteratively assigning pseudo-labels to unlabeled samples with confidence scores above a certain threshold, enriching the labeled dataset and retraining the classifier. This paper presents self-training methods for binary and multi-class classification, along with variants and related approaches such as consistency-based methods and transductive learning. We also briefly describe self-supervised learning and reinforced self-training. Furthermore, we highlight popular applications of self-training and discuss the importance of dynamic thresholding and reducing pseudo-label noise for performance improvement.</div><div>To the best of our knowledge, this is the first thorough and complete survey on self-training.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128904"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
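The iterative pseudo-labeling loop described in the self-training abstract above can be sketched as follows. This is a toy illustration under stated assumptions, not a method from the survey: the 1-D nearest-centroid classifier, the softmax-over-negative-distance confidence score, the function names, and the threshold value are all hypothetical choices.

```python
import math

def fit_centroids(xs, ys):
    """Fit a toy 1-D classifier: one centroid (mean) per class label."""
    centroids = {}
    for label in set(ys):
        points = [x for x, y in zip(xs, ys) if y == label]
        centroids[label] = sum(points) / len(points)
    return centroids

def predict_with_confidence(centroids, x):
    """Return (label, confidence) using a softmax over negative distances."""
    scores = {label: math.exp(-abs(x - c)) for label, c in centroids.items()}
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total

def self_train(xs, ys, unlabeled, threshold=0.8, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled points and retrain."""
    xs, ys, pool = list(xs), list(ys), list(unlabeled)
    for _ in range(max_rounds):
        model = fit_centroids(xs, ys)
        remaining, added = [], False
        for x in pool:
            label, confidence = predict_with_confidence(model, x)
            if confidence >= threshold:
                # Confident prediction: move the point into the labeled set.
                xs.append(x)
                ys.append(label)
                added = True
            else:
                remaining.append(x)
        pool = remaining
        if not added:
            break  # No point cleared the threshold, so retraining would change nothing.
    return fit_centroids(xs, ys), pool

model, leftover = self_train([0.0, 4.0], [0, 1], [0.5, 3.5, 2.0])
print(model, leftover)
```

In this toy run the points 0.5 and 3.5 are pseudo-labeled in the first round, while 2.0 sits exactly between the two centroids, never reaches the threshold, and stays unlabeled. A fixed threshold is the simplest choice; the abstract notes that dynamic thresholding and pseudo-label noise reduction matter in practice.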
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128825
Xuewei Cheng, Ke Huang, Shujie Ma
{"title":"Generalization and risk bounds for recurrent neural networks","authors":"Xuewei Cheng , Ke Huang , Shujie Ma","doi":"10.1016/j.neucom.2024.128825","DOIUrl":"10.1016/j.neucom.2024.128825","url":null,"abstract":"<div><div>Recurrent Neural Networks (RNNs) have achieved great success in the prediction of sequential data. However, their theoretical studies are still lagging behind because of their complex interconnected structures. In this paper, we establish a new generalization error bound for vanilla RNNs, and provide a unified framework to calculate the Rademacher complexity that can be applied to a variety of loss functions. When the ramp loss is used, we show that our bound is tighter than the existing bounds based on the same assumptions on the Frobenius and spectral norms of the weight matrices and a few mild conditions. Our numerical results show that our new generalization bound is the tightest among all existing bounds in three public datasets. Our bound improves the second tightest one by an average percentage of 13.80% and 3.01% when the <span><math><mo>tanh</mo></math></span> and ReLU activation functions are used, respectively. Moreover, we derive a sharp estimation error bound for RNN-based estimators obtained through empirical risk minimization (ERM) in multi-class classification problems when the loss function satisfies a Bernstein condition.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128825"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128922
Xiaojun Zhou, Zheng Wang, Tingwen Huang
{"title":"A fast optimization approach for seeking Nash equilibrium based on Nikaido–Isoda function, state transition algorithm and Gauss–Seidel technique","authors":"Xiaojun Zhou , Zheng Wang , Tingwen Huang","doi":"10.1016/j.neucom.2024.128922","DOIUrl":"10.1016/j.neucom.2024.128922","url":null,"abstract":"<div><div>This paper proposes a fast optimization approach for non-cooperative games with complicated payoff functions (non-smooth, non-concave, etc.). The Nikaido–Isoda function is employed to convert knotty Nash equilibrium problems (NEPs) into large-scale optimization problems with complex objective functions. To efficiently seek Nash equilibrium, the resulting optimization problems are decomposed into many subproblems where each player tries to maximize its payoff when observing others’ current strategies. All players’ strategies are updated iteratively until reaching Nash equilibrium. Specifically, a dynamic state transition algorithm (STA) is proposed to seek global optima of subproblems at each iteration, and the sequential quadratic programming (SQP) is embedded into dynamic STA for convergence acceleration. A Gauss–Seidel technique is utilized for players’ strategy updates to improve computational efficiency further. 
Numerical examples drawn from multidisciplinary contexts validate that the proposed approach can effectively find Nash equilibria while substantially reducing computation time.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128922"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
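The Gauss-Seidel style of strategy update mentioned in the abstract above can be illustrated on a toy game. This sketch makes several assumptions that are not from the paper: the game is a two-player Cournot duopoly with inverse demand `a - (q1 + q2)` and unit cost `c`, and each player's subproblem is solved by a closed-form best response rather than by the state transition algorithm or SQP. The names `best_response` and `gauss_seidel_nash` are hypothetical.

```python
def best_response(other_q, a=10.0, c=1.0):
    """Quantity that maximizes q * (a - q - other_q) - c * q for a fixed rival quantity."""
    return max(0.0, (a - c - other_q) / 2.0)

def gauss_seidel_nash(q, tol=1e-10, max_iter=1000):
    """Sweep over the two players, letting each respond to the latest strategy of the other."""
    q = list(q)
    for _ in range(max_iter):
        largest_change = 0.0
        for i in range(2):
            # Gauss-Seidel: player i sees the rival's value from this sweep
            # if it was already updated, not the value from the previous sweep.
            new_q = best_response(q[1 - i])
            largest_change = max(largest_change, abs(new_q - q[i]))
            q[i] = new_q
        if largest_change < tol:
            break  # No player wants to deviate (within tolerance).
    return q

equilibrium = gauss_seidel_nash([0.0, 0.0])
print(equilibrium)
```

For these assumed parameters the analytical equilibrium is `(a - c) / 3 = 3` for each player, and the sweep converges to it because each update uses the most recent strategy of the other player instead of waiting for a full simultaneous round.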
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128908
Jianming Zhang, Jing Yang, Zikang Liu, Jin Wang
{"title":"RGBT tracking via frequency-aware feature enhancement and unidirectional mixed attention","authors":"Jianming Zhang , Jing Yang , Zikang Liu , Jin Wang","doi":"10.1016/j.neucom.2024.128908","DOIUrl":"10.1016/j.neucom.2024.128908","url":null,"abstract":"<div><div>RGBT object tracking is widely used due to the complementary nature of RGB and TIR modalities. However, RGBT trackers based on Transformer or CNN face significant challenges in effectively enhancing and extracting features from one modality and fusing them into another modality. To achieve effective regional feature representation and adequate information fusion, we propose a novel tracking method that employs frequency-aware feature enhancement and bidirectional multistage feature fusion. Firstly, we propose an Early Region Feature Enhancement (ERFE) module, which is comprised of the Frequency-aware Self-region Feature Enhancement (FSFE) block and the Cross-attention Cross-region Feature Enhancement (CCFE) block. The FFT-based FSFE block can enhance the feature of the template or search region separately, while the CCFE block can improve feature representation by considering the template and search region jointly. Secondly, we propose a Bidirectional Multistage Feature Fusion (BMFF) module, with the Complementary Feature Extraction Attention (CFEA) module as its core component. The CFEA module including the Unidirectional Mixed Attention (UMA) block and the Context Focused Attention (CFA) block, can extract information from one modality. When RGB is the primary modality, TIR is the auxiliary modality, and vice versa. The auxiliary modal features processed by CFEA are added to the primary modal features. This information fusion process is bidirectional and multistage. 
Thirdly, extensive experiments on three benchmark datasets (RGBT234, LaSHeR, and GTOT) demonstrate that our tracker outperforms advanced RGBT tracking methods.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128908"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128842
Jian-Qiao Wang, Jin-Liang Wang, Xueming Dong
{"title":"Observer-based fully distributed bipartite consensus of multiagent systems with disturbance rejection","authors":"Jian-Qiao Wang , Jin-Liang Wang , Xueming Dong","doi":"10.1016/j.neucom.2024.128842","DOIUrl":"10.1016/j.neucom.2024.128842","url":null,"abstract":"<div><div>An observer-based output feedback control method is used to address the bipartite consensus problem of multiagent systems (MASs) subject to deterministic disturbances. Based on the leaderless and leader-follower methods, two fully distributed observer-based output feedback controllers are devised to guarantee the bipartite consensus of MASs. Moreover, due to the limited bandwidth of communication channels in practical systems, an event-triggered output feedback controller for the bipartite consensus of MASs is also developed and can guarantee that Zeno behavior does not occur. Finally, the effectiveness and advantages of the control protocols are verified via illustrative examples.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128842"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128905
Chaoguang Luo, Liuying Wen, Yong Qin, Philip S. Yu, Liangwei Yang, Zhineng Hu
{"title":"Diversified recommendation with weighted hypergraph embedding: Case study in music","authors":"Chaoguang Luo , Liuying Wen , Yong Qin , Philip S. Yu , Liangwei Yang , Zhineng Hu","doi":"10.1016/j.neucom.2024.128905","DOIUrl":"10.1016/j.neucom.2024.128905","url":null,"abstract":"<div><div>Recommender systems serve a dual purpose for users: sifting out inappropriate or mismatched information while accurately identifying items that align with their preferences. Numerous recommendation algorithms rely on rich feature data to deliver personalized suggestions. However, in scenarios without explicit features, balancing accuracy and diversity in recommendations is a pressing concern. To address this challenge, exemplified by music recommendation, we introduce the Diversified Weighted Hypergraph Recommendation algorithm (DWHRec). In DWHRec, the initial connections between users and items are modeled using a weighted hypergraph, where additional entities linked to users and items, such as artists, albums, and tags, are simultaneously integrated into the hypergraph structure. To capture users’ latent preferences, a random-walk embedding method is applied to the hypergraph. Accuracy is measured by the match between users and items, and diversity is gauged by the variety of recommended item types. Extensive experiments conducted on two real-world music datasets show that DWHRec substantially outperforms eight state-of-the-art algorithms in terms of accuracy and diversity. Beyond music recommendation, DWHRec is a versatile framework that can be applied to other domains with similar data structures. 
The algorithm code is available on GitHub.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128905"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128910
Shenghao Li, Zhanpeng Wang, Zhongxuan Luo, Na Lei
{"title":"An optimal transport-guided diffusion framework with mitigating mode mixture","authors":"Shenghao Li, Zhanpeng Wang, Zhongxuan Luo, Na Lei","doi":"10.1016/j.neucom.2024.128910","DOIUrl":"10.1016/j.neucom.2024.128910","url":null,"abstract":"<div><div>Diffusion probability models (DPMs) have achieved excellent results in image generation; however, their inference process is slow and tends to produce more mixed images. The autoencoder optimal transport (OT) model addresses the mode collapse/mixture problem from the OT perspective but produces low-quality images. Therefore, to generate high-quality images and mitigate mode mixture, we propose an innovative OT-guided diffusion framework. The key is to find the optimal truncation step <span><math><mi>M</mi></math></span> to ensure that the class boundaries of the original data do not intersect during the forward process, ensuring that the generated image belongs to the same class as the initial point in the reverse process. The value of <span><math><mi>M</mi></math></span> is determined by evaluating the Peak Signal-to-Noise Ratio, enabling us to mitigate the generation of mixed images. Specifically, our approach first involves embedding the images’ manifold into the latent space through an encoder. The images are subsequently decoded using latent codes, which are generated through an OT map from the Gaussian distribution to the empirical latent distribution. Finally, the trained <span><math><mi>M</mi></math></span>-step DPM is utilized to refine the image generated by the decoder. Experimental results demonstrate that our method not only improves image quality but also alleviates mode mixture in diffusion models. 
Additionally, it enhances sampling efficiency and reduces training cost compared to classical diffusion models.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128910"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing | Pub Date: 2024-11-19 | DOI: 10.1016/j.neucom.2024.128953
Xuepeng Zhang, Jinrui Wang, Xue Jiang, Zongzhen Zhang, Baokun Han, Huaiqian Bao, Xingxing Jiang
{"title":"Working condition decoupling adversarial network: A novel method for multi-target domain fault diagnosis","authors":"Xuepeng Zhang , Jinrui Wang , Xue Jiang , Zongzhen Zhang , Baokun Han , Huaiqian Bao , Xingxing Jiang","doi":"10.1016/j.neucom.2024.128953","DOIUrl":"10.1016/j.neucom.2024.128953","url":null,"abstract":"<div><div>In the practical application of rotating machinery, working conditions change to meet different manufacturing requirements. When fault diagnosis is performed on monitoring data collected under different working conditions, the change of data distribution introduces interference information that is highly related to working conditions, as well as inconsistent matching problems, in the process of multi-target domain transfer. To solve these problems, a working condition decoupling adversarial network (WCDAN) is proposed for multi-target domain fault diagnosis. Specifically, the prototype discrepancy alignment module is constructed following a weight-shared wavelet convolution feature extractor to ensure a clear prototype representation boundary. Then, the adaptive domain discriminator weight, along with the acquired multi-domain discrepancy, is used to decouple the working conditions. This process filters out interference information that is highly associated with the source domain working conditions while preserving the inherent fault characteristics. Furthermore, the strategy of multi-domain hybrid alignment aims to minimize the disparity between different domains and solve the inconsistent matching issue. 
Based on two gearbox fault datasets under stable and unstable conditions, the comparative experimental results show that the WCDAN can be generalized from a single source domain to multiple target domains at the same time and achieve excellent fault diagnosis performance.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":"616 ","pages":"Article 128953"},"PeriodicalIF":5.5,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142743061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}