Neural Networks | Pub Date: 2024-11-22 | DOI: 10.1016/j.neunet.2024.106931
Title: Graph Batch Coarsening framework for scalable graph neural networks
Authors: Shengzhong Zhang, Yimin Zhang, Bisheng Li, Wenjie Yang, Min Zhou, Zengfeng Huang
Abstract: Due to the neighborhood explosion phenomenon, scaling up graph neural networks to large graphs remains a major challenge. Various sampling-based mini-batch approaches, such as node-wise, layer-wise, and subgraph sampling, have been proposed to alleviate this issue. However, intensive random sampling incurs additional overhead during training and often fails to deliver good performance consistently. To overcome these limitations, we propose Graph Batch Coarsening (GBC), a simple and general graph batching framework designed to facilitate scalable training of arbitrary GNN models. GBC preprocesses the input graph and generates a set of much smaller subgraphs to be used as mini-batches; any GNN model can then be trained only on those small graphs. This framework avoids random sampling completely and requires no changes to the backbone GNN model, including its hyperparameters. To implement the framework, we present a graph decomposition method based on label propagation and a novel graph coarsening algorithm designed for GNN training. Empirically, GBC demonstrates superior accuracy, training time, and memory usage on graphs ranging from small to large scale.
Neural Networks, Volume 183, Article 106931.
Neural Networks | Pub Date: 2024-11-21 | DOI: 10.1016/j.neunet.2024.106798
Title: Static pinning synchronization control of self-triggered coupling dynamical networks
Authors: Lingzhong Zhang, Shengyuan Xu
Abstract: In this paper, a new static pinning intermittent control scheme based on resource-aware triggering is proposed, and a multi-layer control technique is used to synchronize coupled neural networks. First, a hierarchical network structure consisting of pinned and interaction layers is induced by each pinning strategy. Second, using the average rate method for aperiodic intermittent control (AIC) and constructing an auxiliary function, a new lemma is established for the pinning intermittent synchronization of coupled networks, in which the upper/lower bound restrictions on each control width for AIC are relaxed. Third, to obtain the desired synchronization behavior, a self-triggering mechanism (STM) is proposed to execute the AIC of the pinned and interaction layers. Moreover, the proposed STM is also effective for actuating static pinning impulsive control, and the static pinning method modifies single-pinning and switching-pinning impulsive control. Finally, the proposed results are applied to Chua's circuits, oscillators, and small-world networks. Experimental results show that the proposed STM reduces the number of control updates by 34.58% compared with a periodically intermittent event-triggered scheme. Furthermore, for large-scale coupled networks, the N-dimensional Laplacian matrix L_V can be decomposed by the hierarchical method into blocks of dimension s_p × s_p and (N - s_p) × (N - s_p), thus reducing the computational complexity.
Neural Networks, Volume 182, Article 106798.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106907
Title: Enhancing Open-Set Domain Adaptation through Optimal Transport and Adversarial Learning
Authors: Qing Tian, Yi Zhao, Keyang Cheng, Tinghuai Ma
Abstract: Open-Set Domain Adaptation (OSDA) is designed to facilitate the transfer of knowledge from a source domain to a target domain, where the class space of the source is a subset of that of the target. The primary challenge in OSDA is identifying shared samples in the target domain to achieve domain alignment while effectively segregating private samples in the target domain. To address this challenge, many existing methods leverage weighted classifiers to mitigate the negative transfer induced by private classes in the target domain and treat all such samples as a single unknown class. However, this strategy may result in inadequate acquisition of discriminative information within the target domain and unclear decision boundaries. To overcome these limitations, we propose a novel framework termed Optimal Transport and Adversarial Learning (OTAL). Our approach introduces Optimal Transport (OT) with a similarity matrix for feature-to-prototype mapping in clustering, enabling the model to learn discriminative information and capture the intrinsic structure of the target domain. Furthermore, we introduce a three-way domain discriminator to aid in constructing the decision boundary between known and unknown classes, while simultaneously aligning the distribution of known samples. Experimental results on three image classification datasets (Office-31, Office-Home, and VisDA-2017) demonstrate the superior performance of OTAL compared with existing state-of-the-art methods.
Neural Networks, Volume 182, Article 106907.
{"title":"Learning extreme expected shortfall and conditional tail moments with neural networks. Application to cryptocurrency data","authors":"Michaël Allouche , Stéphane Girard , Emmanuel Gobet","doi":"10.1016/j.neunet.2024.106903","DOIUrl":"10.1016/j.neunet.2024.106903","url":null,"abstract":"<div><div>We propose a neural networks method to estimate extreme Expected Shortfall, and even more generally, extreme conditional tail moments as functions of confidence levels, in heavy-tailed settings. The convergence rate of the uniform error between the log-conditional tail moment and its neural network approximation is established leveraging extreme-value theory (in particular the high-order condition on the distribution tails) and using critically two activation functions (eLU and ReLU) for neural networks. The finite sample performance of the neural network estimator is compared to bias-reduced extreme-value competitors using synthetic heavy-tailed data. The experiments reveal that our method largely outperforms others. In addition, the selection of the anchor point appears to be much easier and stabler than for other methods. Finally, the neural network estimator is tested on real data related to extreme loss returns in cryptocurrencies: here again, the accuracy obtained by cross-validation is excellent, and is much better compared with competitors.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"182 ","pages":"Article 106903"},"PeriodicalIF":6.0,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142724092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106915
Title: Quantum-inspired neural network with hierarchical entanglement embedding for matching
Authors: Chenchen Zhang, Zhan Su, Qiuchi Li, Dawei Song, Prayag Tiwari
Abstract: Quantum-inspired neural networks (QNNs) have shown potential in capturing various non-classical phenomena in language understanding, e.g., the emergent meaning of concept combinations, and represent a leap beyond conventional models in cognitive science. However, existing QNNs still have two limitations: (1) storing and invoking the complex-valued embeddings may incur prohibitively expensive costs in memory consumption and storage space; (2) the use of entangled states, described by the tensor product with its powerful compression ability, can fully capture certain non-classical phenomena and shares many commonalities with the process of word formation from morphemes, but this connection has not been further exploited in existing work. To mitigate these two limitations, we introduce a Quantum-inspired neural network with Hierarchical Entanglement Embedding (QHEE) based on finer-grained morphemes. Our model leverages intra-word and inter-word entanglement embeddings to learn a multi-grained semantic representation. The intra-word entanglement embedding is employed to aggregate the constituent morphemes from multiple perspectives, while the inter-word entanglement embedding is utilized to combine different words via unitary transformations to reveal their non-classical correlations. Both the number of morphemes and the dimensionality of the morpheme embedding vectors are far smaller than those of words, which compresses the embedding parameters efficiently. Experimental results on four benchmark datasets covering different downstream tasks show that our model outperforms strong quantum-inspired baselines in terms of effectiveness and compression ability.
Neural Networks, Volume 182, Article 106915.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106924
Title: VC dimension of Graph Neural Networks with Pfaffian activation functions
Authors: Giuseppe Alessio D’Inverno, Monica Bianchini, Franco Scarselli
Abstract: Graph Neural Networks (GNNs) have emerged in recent years as a powerful tool to learn tasks across a wide range of graph domains in a data-driven fashion. Based on a message passing mechanism, GNNs have gained increasing popularity due to their intuitive formulation, closely linked to the Weisfeiler–Lehman (WL) test for graph isomorphism, to which they have been shown to be equivalent (Morris et al., 2019; Xu et al., 2019). From a theoretical point of view, GNNs have been shown to be universal approximators, and their generalization capability, related to the Vapnik–Chervonenkis (VC) dimension (Scarselli et al., 2018), has recently been investigated for GNNs with piecewise polynomial activation functions (Morris et al., 2023). The aim of our work is to extend this analysis of the VC dimension of GNNs to other commonly used activation functions, such as the sigmoid and hyperbolic tangent, using the framework of Pfaffian function theory. Bounds are provided with respect to the architecture parameters (depth, number of neurons, input size) as well as with respect to the number of colors resulting from the 1-WL test applied on the graph domain. The theoretical analysis is supported by a preliminary experimental study.
Neural Networks, Volume 182, Article 106924.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106917
Title: Rethinking density ratio estimation based hyper-parameter optimization
Authors: Zi-En Fan, Feng Lian, Xin-Ran Li
Abstract: Hyper-parameter optimization (HPO) aims to improve the performance of machine learning algorithms by identifying appropriate hyper-parameters. By converting the computation of expected improvement into a density-ratio estimation problem, existing works use binary classifiers to estimate this ratio and determine the next point by maximizing the class posterior probability. However, these methods tend to treat different points equally and ignore some important regions, because binary classifiers cannot capture finer-grained information about the search space or highlight important regions. In this work, we propose a hyper-parameter optimization method that estimates density ratios and selects the next point using multi-class classifiers. First, we divide all samples into multiple classes and train multi-class classifiers. The decision boundaries of the trained classifiers allow for a finer partitioning of the search space, offering richer insight into the distribution of hyper-parameters within it. We then define an acquisition function as a weighted sum of the multi-class classifiers' outputs, with the weights determined by the samples in each class. By assigning different weights to each class posterior probability in the acquisition function, points within the search space are no longer treated equally. Experimental results on three representative tasks demonstrate that our method achieves a significant improvement in immediate regret and convergence speed.
Neural Networks, Volume 182, Article 106917.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106920
Title: Continual learning in the presence of repetition
Authors: Hamed Hemati, Lorenzo Pellegrini, Xiaotian Duan, Zixuan Zhao, Fangfang Xia, Marc Masana, Benedikt Tscheschner, Eduardo Veas, Yuxiang Zheng, Shiji Zhao, Shao-Yuan Li, Sheng-Jun Huang, Vincenzo Lomonaco, Gido M. van de Ven
Abstract: Continual learning (CL) provides a framework for training models in ever-evolving environments. Although the re-occurrence of previously seen objects or tasks is common in real-world problems, the concept of repetition in the data stream is not often considered in standard benchmarks for CL. Unlike the rehearsal mechanism in buffer-based strategies, where sample repetition is controlled by the strategy, repetition in the data stream stems naturally from the environment. This report provides a summary of the CLVision challenge at CVPR 2023, which focused on the topic of repetition in class-incremental learning. The report first outlines the challenge objective and then describes three solutions proposed by finalist teams that aim to effectively exploit repetition in the stream to learn continually. The experimental results from the challenge highlight the effectiveness of ensemble-based solutions that employ multiple versions of similar modules, each trained on different but overlapping subsets of classes. This report underscores the transformative potential of taking a different perspective in CL by employing repetition in the data stream to foster innovative strategy design.
Neural Networks, Volume 183, Article 106920.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106921
Title: Deep temporal representation learning for language identification
Authors: Chen Chen, Yong Chen, Weiwei Li, Deyun Chen
Abstract: Language identification (LID) is a key component in downstream tasks. Recently, the self-supervised speech representation learned by Wav2Vec 2.0 (W2V2) has been shown to be very effective for various speech-related tasks. In LID, it is commonly used as a feature extractor for frame-level features. However, there is currently no effective method for extracting temporal information from frame-level features to enhance the performance of LID systems. To address this issue, we propose an LID framework based on deep temporal representation (DTR) learning. First, the W2V2 model is used as a front-end feature extractor; it captures contextual representations from continuous raw audio in which temporal dependencies are embedded. Then, a temporal network responsible for learning temporal dependencies is proposed to process the output of W2V2. This temporal network comprises a temporal representation extractor for obtaining utterance-level representations and a temporal regularization term that imposes constraints on the temporal dynamics. Finally, the temporal dependencies are used as utterance-level representations for the subsequent classification. The proposed DTR method is evaluated on the OLR2020 database and compared with other state-of-the-art methods. The results show that the proposed method achieves decent performance on all three tasks of the OLR2020 database.
Neural Networks, Volume 182, Article 106921.
Neural Networks | Pub Date: 2024-11-20 | DOI: 10.1016/j.neunet.2024.106922
Title: Approximation of functionals on Korobov spaces with Fourier Functional Networks
Authors: Peilin Liu, Yuqing Liu, Xiang Zhou, Ding-Xuan Zhou
Abstract: Learning from functional data with deep neural networks has become increasingly useful, and numerous neural network architectures have been developed to tackle the high-dimensional problems arising in practical domains. Despite these impressive practical achievements, the theoretical foundations underpinning the ability of neural networks to learn from functional data remain largely unexplored. In this paper, we investigate the approximation capacity of a functional neural network, called the Fourier Functional Network, consisting of Fourier neural operators and deep convolutional neural networks with a great reduction in parameters. We establish rates for the approximation by Fourier Functional Networks of nonlinear continuous functionals defined on Korobov spaces of periodic functions. Finally, our results demonstrate dimension-independent convergence rates, which overcome the curse of dimensionality.
Neural Networks, Volume 182, Article 106922.