Neurocomputing Pub Date: 2024-11-04 DOI: 10.1016/j.neucom.2024.128775
{"title":"An efficient re-parameterization feature pyramid network on YOLOv8 to the detection of steel surface defect","authors":"","doi":"10.1016/j.neucom.2024.128775","DOIUrl":"10.1016/j.neucom.2024.128775","url":null,"abstract":"<div><div>In the field of steel production, the detection of steel surface defects is one of the most important guarantees for the quality of steel production. In the process of defect detection, there are problems regarding the noise of the acquisition background, the scale of defects, and the detection speed. At present, in the face of complex steel surface defects, realizing efficient real-time steel surface defect detection has become a difficult problem. In this paper, we propose a lightweight and efficient real-time defect detection method, LDE-YOLO, based on YOLOv8. First, we propose a lightweight multi-scale feature extraction module, LighterMSMC, which not only achieves a lightweight backbone network, but also effectively guarantees the long-range dependence of the features, so as to realize multi-scale feature extraction more efficiently. Second, we propose a lightweight re-parameterized feature pyramid, DE-FPN, in which the sparse patterns of the overall features and the detailed features of the local features are efficiently captured by the DE-Block, and then efficiently fused by the PAN feature fusion structure. Finally, we propose Efficient Head, which lightens the model by group convolution while it improves the diagonal correlation of the feature maps on some specific datasets, thus enhancing the detection performance. Our proposed LDE-YOLO obtains 80.8 mAP and 75.5 FPS on NEU-DET, and 80.5 mAP and 75.5 FPS on GC10-DET. It obtains 2.5 mAP and 4.7 mAP enhancement compared to the baseline model, and the detection speed is also improved by 10.4 FPS, while the number of floating point operations and the number of parameters of the model are reduced by 60.2% and 49.1%, respectively, which is sufficient to illustrate its lightweight effectiveness and realize an efficient real-time steel surface defect detection model.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142592914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-11-01 DOI: 10.1016/j.neucom.2024.128738
{"title":"Multi-contrast image clustering via multi-resolution augmentation and momentum-output queues","authors":"","doi":"10.1016/j.neucom.2024.128738","DOIUrl":"10.1016/j.neucom.2024.128738","url":null,"abstract":"<div><div>Contrastive clustering has emerged as an efficacious technique in the domain of deep clustering, leveraging the interplay between paired samples and the learning capabilities of deep network architectures. However, the augmentation strategies employed in existing methods do not fully utilize the information in images, and the limited number of negative samples further degrades clustering performance. In this study, we propose a novel clustering approach that incorporates momentum-output queues and multi-resolution augmentation strategies to effectively address these limitations. Initially, we deploy a multi-resolution augmentation strategy, transforming conventional augmentations into distinct global and local perspectives across various resolutions. This approach comprehensively harnesses inter-image information to construct a multi-contrast model with multi-view inputs. Subsequently, we introduce momentum-output queues, which are designed to store a large number of negative samples without increasing the computational cost, thereby enhancing the clustering effect. Within our joint optimization framework, sample features are derived from both the original and momentum encoders for instance-level contrastive learning. Simultaneously, features produced exclusively by the original encoder within the same batch are employed for cluster-level contrastive learning. Our experimental results on five challenging datasets substantiate the superior performance of our method over existing state-of-the-art techniques.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142592863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-29 DOI: 10.1016/j.neucom.2024.128766
{"title":"Double machine learning for partially linear mediation models with high-dimensional confounders","authors":"","doi":"10.1016/j.neucom.2024.128766","DOIUrl":"10.1016/j.neucom.2024.128766","url":null,"abstract":"<div><div>To estimate and statistically infer the direct and indirect effects of exposure and mediator variables while accounting for high-dimensional confounding variables, we propose a partially linear mediation model to incorporate a flexible mechanism of confounders. To obtain asymptotically efficient estimators for the effects of interest under the influence of the nuisance functions with high-dimensional confounders, we construct two Neyman-orthogonal score functions to remove regularization bias. Flexible machine learning methods and data splitting with cross-fitting are employed to address the overfitting issue and estimate unknown nuisance functions efficiently. We rigorously investigate the asymptotic expressions of the proposed estimators for the direct, indirect and total effects and then derive their asymptotic normality properties. In addition, two Wald statistics are constructed to test the direct and indirect effects, respectively, and their limiting distributions are obtained. The satisfactory performance of our proposed estimators is demonstrated by simulation results and a genome-wide analysis of a blood DNA methylation dataset.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-29 DOI: 10.1016/j.neucom.2024.128783
{"title":"Bipartite containment tracking for nonlinear MASs under FDI attack based on model-free adaptive iterative learning control","authors":"","doi":"10.1016/j.neucom.2024.128783","DOIUrl":"10.1016/j.neucom.2024.128783","url":null,"abstract":"<div><div>The bipartite containment control problem for a class of heterogeneous multi-agent systems (MASs) under false data injection (FDI) attack is handled in this work by using the distributed model-free adaptive iterative learning control scheme with attack compensation. The unknown non-affine nonlinear dynamics of each agent are first transformed into an equivalent attack-related data model along the iteration axis using a compact form dynamic linearization method. Then, a distributed model-free adaptive iterative learning bipartite containment control (DMFAILBCC) scheme is constructed by employing I/O data from MASs, and the convergence is proved by rigorous mathematical analysis. In addition, the updated control method and the convergence analysis are extended to iteration switching topologies. Finally, the performance of the two proposed schemes is validated through numerical simulations and comparisons with different control schemes.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-29 DOI: 10.1016/j.neucom.2024.128795
{"title":"Augmented ELBO regularization for enhanced clustering in variational autoencoders","authors":"","doi":"10.1016/j.neucom.2024.128795","DOIUrl":"10.1016/j.neucom.2024.128795","url":null,"abstract":"<div><div>With significant advances in deep neural networks, various new algorithms have emerged that effectively model latent structures within data, surpassing traditional clustering methods. Each data point is expected to belong to a single cluster in a typical clustering algorithm. However, clustering based on variational autoencoders (VAEs) represents the expectation of the overall clusters, denoted as <span><math><mrow><mi>c</mi><mo>=</mo><mn>1</mn><mo>,</mo><mo>…</mo><mo>,</mo><mi>K</mi></mrow></math></span> in the KL divergence term. Consequently, the latent embedding <span><math><mi>z</mi></math></span> can be learned to exist across multiple clusters with relatively balanced probabilities, rather than being strongly associated with a specific cluster. This study introduces an additional regularizer to encourage the latent embedding <span><math><mi>z</mi></math></span> to have a strong affiliation with specific clusters. We introduce optimization methods to maximize the ELBO that includes the newly added regularization term and explore methods to eliminate computationally challenging terms. The positive impact of this regularization on clustering accuracy was verified by examining the variance of the final cluster probabilities. Furthermore, an enhancement in the clustering performance was observed when regularization was introduced.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-29 DOI: 10.1016/j.neucom.2024.128797
{"title":"Learning from different perspectives for regret reduction in reinforcement learning: A free energy approach","authors":"","doi":"10.1016/j.neucom.2024.128797","DOIUrl":"10.1016/j.neucom.2024.128797","url":null,"abstract":"<div><div>Reinforcement learning (RL) is the core method for interactive learning in living and artificial creatures. Nevertheless, in contrast to humans and animals, artificial RL agents are very slow in learning and suffer from the curse of dimensionality. This is partially due to using RL in isolation, i.e., a lack of social learning and social diversity. We introduce a free energy-based social RL for learning novel tasks. The society is formed by the learning agent and some diverse virtual ones. That diversity is in their perception, while all agents use the same interaction samples for learning and share the same action set. Individual differences in perception are mostly the cause of perceptual aliasing; however, they can result in virtual agents’ faster learning in early trials. Our free energy method provides a knowledge integration method for the main agent to benefit from that diversity to reduce its regret. It rests upon the Thompson sampling policy and the behavioral policy of the main and virtual agents. Therefore, it is applicable to a variety of tasks, discrete or continuous state space, model-free, and model-based tasks as well as to different reinforcement learning methods. Through a set of experiments, we show that this general framework highly improves learning speed and is clearly superior to existing methods. We also provide a convergence proof.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142578803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-28 DOI: 10.1016/j.neucom.2024.128785
{"title":"Multi-attention associate prediction network for visual tracking","authors":"","doi":"10.1016/j.neucom.2024.128785","DOIUrl":"10.1016/j.neucom.2024.128785","url":null,"abstract":"<div><div>Classification-regression prediction networks have realized impressive success in several modern deep trackers. However, there is an inherent difference between classification and regression tasks, so they have diverse, even opposite, demands for feature matching. Existing models always ignore this key issue and only employ a unified matching block in the two task branches, degrading the decision quality. Besides, these models also struggle with decision misalignment. In this paper, we propose a multi-attention associate prediction network (MAPNet) to tackle the above problems. Concretely, two novel matchers, i.e., a category-aware matcher and a spatial-aware matcher, are first designed for feature comparison by integrating self, cross, channel or spatial attentions organically. They are capable of fully capturing the category-related semantics for classification and the local spatial contexts for regression, respectively. Then, we present a dual alignment module to enhance the correspondences between the two branches, which is useful for finding the optimal tracking solution. Finally, we describe a Siamese tracker built upon the proposed prediction network, which achieves leading performance on five tracking benchmarks, namely LaSOT, TrackingNet, GOT-10k, TNL2k and UAV123, and surpasses other state-of-the-art approaches.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142573298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-28 DOI: 10.1016/j.neucom.2024.128764
{"title":"Diffusion model conditioning on Gaussian mixture model and negative Gaussian mixture gradient","authors":"","doi":"10.1016/j.neucom.2024.128764","DOIUrl":"10.1016/j.neucom.2024.128764","url":null,"abstract":"<div><div>Diffusion models (DMs) are a type of generative model that has had a significant impact on image synthesis and beyond. They can incorporate a wide variety of conditioning inputs, such as text or bounding boxes, to guide generation. In this work, we introduce a novel conditioning mechanism that applies Gaussian mixture models (GMMs) for feature conditioning, which helps steer the denoising process in DMs. Drawing on set theory, our comprehensive theoretical analysis reveals that the conditional latent distribution based on features differs markedly from that based on classes. Consequently, feature-based conditioning tends to generate fewer defects than class-based conditioning. Experimental results support our theoretical findings as well as the effectiveness of the proposed feature conditioning mechanism. Additionally, we propose a new gradient function named the Negative Gaussian Mixture Gradient (NGMG) and incorporate it into the training of diffusion models alongside an auxiliary classifier. We theoretically demonstrate that NGMG offers comparable advantages to the Wasserstein distance, serving as a more effective cost function when learning distributions supported by low-dimensional manifolds, especially in contrast to many likelihood-based cost functions, such as KL divergences.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142573192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-28 DOI: 10.1016/j.neucom.2024.128761
{"title":"A survey of graph neural networks and their industrial applications","authors":"","doi":"10.1016/j.neucom.2024.128761","DOIUrl":"10.1016/j.neucom.2024.128761","url":null,"abstract":"<div><div>Graph Neural Networks (GNNs) have emerged as a powerful tool for analyzing and modeling graph-structured data. In recent years, GNNs have gained significant attention in various domains. This review paper aims to provide an overview of the state-of-the-art graph neural network techniques and their industrial applications. First, we introduce the fundamental concepts and architectures of GNNs, highlighting their ability to capture complex relationships and dependencies in graph data. We then delve into the variants and evolution of graphs, including directed graphs, heterogeneous graphs, dynamic graphs, and hypergraphs. Next, we discuss the interpretability of GNNs and GNN theory, including graph augmentation, expressivity, and over-smoothing. Finally, we introduce the specific use cases of GNNs in industrial settings, including finance, biology, knowledge graphs, recommendation systems, Internet of Things (IoT), and knowledge distillation. This review paper highlights the immense potential of GNNs in solving real-world problems, while also addressing the challenges and opportunities for further advancement in this field.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142586922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neurocomputing Pub Date: 2024-10-28 DOI: 10.1016/j.neucom.2024.128758
{"title":"Simulation-based effective comparative analysis of neuron circuits for neuromorphic computation systems","authors":"","doi":"10.1016/j.neucom.2024.128758","DOIUrl":"10.1016/j.neucom.2024.128758","url":null,"abstract":"<div><div>Spiking neural networks (SNNs), which are inspired by the human brain, offer wide scope for application in the growth of neuromorphic computing systems due to their brain-level computational capabilities, reduced power consumption, and minimal data movement cost, among other advantages. Spike-based neurons and synapses are the essential building blocks of SNNs, and their efficient implementation is vital to their performance enhancement. In this regard, the design and implementation of spiking neurons have been a major focus among researchers. In this paper, the functioning of different leaky integrate-and-fire (LIF)-based spiking neuron circuits, such as frequency-adaptable CMOS-based LIF, resistor-capacitor-based (RC) LIF, and volatile memristor-based LIF, is compared. The work mainly focuses on analyzing the spike duration and amplitude, the number of spikes produced during the excitation period, threshold operation, field of application, and various other significant parameters of the aforementioned neuron circuits. Extensive simulations of these circuits are carried out in the Cadence Virtuoso simulation environment in order to validate their behavior. Further, a brief comparative analysis is carried out, taking into account attributes such as circuit complexity, supply voltage, firing rate, membrane capacitance, nature of input/output, refractory mechanism, and energy consumption per spike. This work seeks to assist researchers in selecting an appropriate LIF model to efficiently construct memristor- and/or non-memristor-based SNNs for a given application.</div></div>","PeriodicalId":19268,"journal":{"name":"Neurocomputing","volume":null,"pages":null},"PeriodicalIF":5.5,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142573219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}