Neurocomputing: Latest Publications

Sparse discriminant manifold projections for automatic depression recognition
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-11-04 | DOI: 10.1016/j.neucom.2024.128765
Lu Zhang, Jitao Zhong, Qinglin Zhao, Shi Qiao, Yushan Wu, Bin Hu, Sujie Ma, Hong Peng
Abstract: In recent years, depression has become an increasingly serious problem globally. Previous research has shown that EEG-based depression recognition is a promising technique for auxiliary diagnosis, providing assistance to clinicians. In clinical studies, the multichannel nature of EEG typically yields extracted features that are high-dimensional and contain much redundant information, so dimensionality reduction before classification is necessary to improve the performance of machine learning algorithms. However, existing dimensionality reduction techniques do not design their objective functions around the characteristics of the EEG signal and the goal of depression recognition, making them less suitable for reducing EEG features. To solve this problem, in this paper we propose a novel dimensionality reduction technique called sparse discriminant manifold projections (SDMP) for depression recognition. Specifically, using the ℓ2-norm instead of the squared ℓ2-norm as the similarity measure in the objective function reduces sensitivity to noise and outliers. Moreover, the local geometric structure and global discriminative properties of the data are integrated, which makes the extracted features more discriminative. Finally, ℓ2,1-norm regularization is introduced to achieve feature selection, and the formulation is extended to the ℓ2,p-norm regularization case, which tends to offer better sparsity when 0 < p < 1. Extensive experiments on EEG data show that SDMP achieves competitive performance compared with other state-of-the-art dimensionality reduction methods, and demonstrates the practical value of our method for detecting depression. (Neurocomputing, Vol. 614, Article 128765)
Citations: 0
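As a point of reference for the norms named in the abstract above, the following minimal numpy sketch (not taken from the paper; the projection matrix and data are made up) contrasts the plain and squared ℓ2 distances and computes the row-wise ℓ2,1 and ℓ2,p penalties that induce feature selection.

```python
import numpy as np

def l2_dist(x, y):
    # plain l2 distance: grows linearly with the residual, so a single outlier
    # contributes less than under the squared l2 distance
    return np.linalg.norm(x - y)

def squared_l2_dist(x, y):
    return np.sum((x - y) ** 2)

def l21_norm(W):
    # l2,1-norm: sum of the l2 norms of the rows of W; penalizing it drives
    # whole rows (i.e., original features) of the projection matrix to zero
    return np.sum(np.linalg.norm(W, axis=1))

def l2p_norm(W, p=0.5):
    # l2,p quasi-norm with 0 < p < 1: typically yields sparser rows than p = 1
    return np.sum(np.linalg.norm(W, axis=1) ** p)

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3))   # hypothetical projection: 10 features -> 3 dims
x, y = rng.standard_normal(10), rng.standard_normal(10)
print(l2_dist(x, y), squared_l2_dist(x, y))
print(l21_norm(W), l2p_norm(W, p=0.5))
```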
An efficient re-parameterization feature pyramid network on YOLOv8 to the detection of steel surface defect
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-11-04 | DOI: 10.1016/j.neucom.2024.128775
Weining Xie, Weifeng Ma, Xiaoyong Sun
Abstract: In steel production, the detection of steel surface defects is one of the most important guarantees of product quality. The detection process faces problems with background noise in image acquisition, the scale of defects, and detection speed, and achieving efficient real-time detection of complex steel surface defects remains difficult. In this paper, we propose a lightweight and efficient real-time defect detection method, LDE-YOLO, based on YOLOv8. First, we propose a lightweight multi-scale feature extraction module, LighterMSMC, which not only yields a lightweight backbone network but also effectively preserves long-range dependencies among features, enabling more efficient multi-scale feature extraction. Second, we propose a lightweight re-parameterized feature pyramid, DE-FPN, in which the sparse patterns of the overall features and the details of local features are efficiently captured by the DE-Block and then fused by the PAN feature fusion structure. Finally, we propose Efficient Head, which lightens the model through group convolution while improving the diagonal correlation of the feature maps on some specific datasets, thus enhancing detection performance. Our proposed LDE-YOLO obtains 80.8 mAP and 75.5 FPS on NEU-DET and 80.5 mAP and 75.5 FPS on GC10-DET. It gains 2.5 mAP and 4.7 mAP over the baseline model and improves detection speed by 10.4 FPS, while reducing the model's floating-point operations and parameter count by 60.2% and 49.1%, which demonstrates its lightweight effectiveness and realizes an efficient real-time steel surface defect detection model. (Neurocomputing, Vol. 614, Article 128775)
Citations: 0
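To make concrete the group-convolution trick mentioned for Efficient Head above, here is a small PyTorch sketch (an illustration with assumed channel counts, not the paper's head) comparing the parameter counts of a standard convolution and a grouped convolution that produce identically shaped outputs.

```python
import torch
import torch.nn as nn

in_ch, out_ch, k = 256, 256, 3

standard = nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=1)
grouped = nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=1, groups=8)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

x = torch.randn(1, in_ch, 40, 40)             # hypothetical feature map
print(standard(x).shape, grouped(x).shape)    # identical output shapes
print(n_params(standard), n_params(grouped))  # grouped uses roughly 1/8 of the weights
```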
Multi-contrast image clustering via multi-resolution augmentation and momentum-output queues
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-11-01 | DOI: 10.1016/j.neucom.2024.128738
Sheng Jin, Shuisheng Zhou, Dezheng Kong, Banghe Han
Abstract: Contrastive clustering has emerged as an efficacious technique in the domain of deep clustering, leveraging the interplay between paired samples and the learning capabilities of deep network architectures. However, the augmentation strategies employed by existing methods do not fully utilize the information in images, and the limited number of negative samples further hurts clustering performance. In this study, we propose a novel clustering approach that incorporates momentum-output queues and a multi-resolution augmentation strategy to effectively address these limitations. Initially, we deploy a multi-resolution augmentation strategy, transforming conventional augmentations into distinct global and local perspectives across various resolutions. This approach comprehensively harnesses inter-image information to construct a multi-contrast model with multi-view inputs. Subsequently, we introduce momentum-output queues, which are designed to store a large number of negative samples without increasing the computational cost, thereby enhancing the clustering effect. Within our joint optimization framework, sample features are derived from both the original and momentum encoders for instance-level contrastive learning, while features produced exclusively by the original encoder within the same batch are employed for cluster-level contrastive learning. Our experimental results on five challenging datasets substantiate the superior performance of our method over existing state-of-the-art techniques. (Neurocomputing, Vol. 614, Article 128738)
Citations: 0
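The momentum-output queue described above follows the same general pattern as momentum-encoder contrastive methods such as MoCo. The sketch below (a generic illustration with stand-in linear encoders, not the authors' code) shows a momentum update of encoder weights and a fixed-size queue of encoded negatives.

```python
import torch
import torch.nn as nn

dim, queue_size, m = 128, 4096, 0.999

encoder_q = nn.Linear(512, dim)                   # stand-ins for the real encoders
encoder_k = nn.Linear(512, dim)
encoder_k.load_state_dict(encoder_q.state_dict())
for p in encoder_k.parameters():
    p.requires_grad = False

queue = torch.randn(queue_size, dim)              # stores momentum-encoder outputs (negatives)

@torch.no_grad()
def momentum_update():
    for pq, pk in zip(encoder_q.parameters(), encoder_k.parameters()):
        pk.data = m * pk.data + (1.0 - m) * pq.data

@torch.no_grad()
def enqueue(keys):
    global queue
    queue = torch.cat([keys, queue], dim=0)[:queue_size]   # FIFO: newest kept, oldest dropped

batch = torch.randn(32, 512)
q = nn.functional.normalize(encoder_q(batch), dim=1)
with torch.no_grad():
    k = nn.functional.normalize(encoder_k(batch), dim=1)
logits_pos = (q * k).sum(dim=1, keepdim=True)   # positive similarities
logits_neg = q @ queue.t()                       # similarities to the queued negatives
momentum_update()
enqueue(k)
print(logits_pos.shape, logits_neg.shape)
```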
PIDNODEs: Neural ordinary differential equations inspired by a proportional–integral–derivative controller
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-30 | DOI: 10.1016/j.neucom.2024.128769
Pengkai Wang, Song Chen, Jiaxu Liu, Shengze Cai, Chao Xu
Abstract: Neural Ordinary Differential Equations (NODEs) are a novel family of infinite-depth neural-network models trained by solving ODEs and their adjoint equations. In this paper, we present a strategy to enhance the training and inference of NODEs by integrating a Proportional-Integral-Derivative (PID) controller into the framework of the Heavy Ball NODE, resulting in the proposed PIDNODEs and their generalized version, GPIDNODEs. By leveraging the advantages of control, PIDNODEs and GPIDNODEs can address stiff-ODE challenges by adjusting the parameters (i.e., Kp, Ki and Kd) in the PID module. The experiments confirm the superiority of PIDNODEs/GPIDNODEs over other NODE baselines on different computer vision and pattern recognition tasks, including image classification, point cloud separation and learning long-term dependencies from irregular time-series data of a physical dynamic system. These experiments demonstrate that the proposed models achieve higher accuracy with fewer function evaluations while alleviating the dilemma of exploding and vanishing gradients, particularly when learning long-term dependencies from a large amount of data. (Neurocomputing, Vol. 614, Article 128769)
Citations: 0
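For readers unfamiliar with the control component, the short sketch below (a generic discrete-time PID step with made-up gains and a toy plant, not the paper's Heavy Ball NODE integration) shows the three terms that the Kp, Ki and Kd parameters weight.

```python
# Minimal discrete-time PID step: u = Kp*e + Ki*sum(e)*dt + Kd*de/dt
def make_pid(kp, ki, kd, dt):
    state = {"integral": 0.0, "prev_err": 0.0}

    def step(error):
        state["integral"] += error * dt
        derivative = (error - state["prev_err"]) / dt
        state["prev_err"] = error
        return kp * error + ki * state["integral"] + kd * derivative

    return step

pid = make_pid(kp=1.2, ki=0.5, kd=0.05, dt=0.01)   # hypothetical gains
setpoint, y = 1.0, 0.0
for _ in range(5):
    u = pid(setpoint - y)
    y += 0.01 * u          # toy first-order plant, purely illustrative
    print(round(y, 4))
```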
Breaking the gap between label correlation and instance similarity via new multi-label contrastive learning
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-29 | DOI: 10.1016/j.neucom.2024.128719
Xin Wang, Wang Zhang, Yuhong Wu, Xingpeng Zhang, Chao Wang, Huayi Zhan
Abstract: Multi-label text classification (MLTC) is a fundamental yet challenging task in natural language processing. Existing MLTC models mostly learn text representations and label correlations separately, while instance-level correlations, which are crucial for classification, are ignored. To rectify this, we propose a new multi-label contrastive learning model that captures instance-level correlations for the MLTC task. Specifically, we first learn label representations by applying a Graph Convolutional Network (GCN) to label co-occurrence graphs. We next learn text representations that take label correlations into consideration. Through an attention mechanism, instance-level correlations can be established. To better utilize label correlations, we propose a new contrastive learning model, guided by a new learning objective, to further refine label representations. We finally implement a k-NN mechanism that identifies the k nearest neighbors of a given text for final prediction. Intensive experimental studies on benchmark multi-label datasets demonstrate the effectiveness of our approach. (Neurocomputing, Vol. 614, Article 128719)
Citations: 0
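As an illustration of the GCN step on a label co-occurrence graph described above (a generic sketch with a made-up 4-label adjacency matrix, not the paper's model), one propagation layer can be written as follows.

```python
import numpy as np

def gcn_layer(A, H, W):
    # symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2, then ReLU(A_hat H W)
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# hypothetical symmetric co-occurrence counts for 4 labels
A = np.array([[0, 3, 1, 0],
              [3, 0, 2, 0],
              [1, 2, 0, 4],
              [0, 0, 4, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 8))    # initial label embeddings
W = rng.standard_normal((8, 8))    # layer weights
print(gcn_layer(A, H, W).shape)    # refined label representations: (4, 8)
```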
Double machine learning for partially linear mediation models with high-dimensional confounders
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-29 | DOI: 10.1016/j.neucom.2024.128766
Jichen Yang, Yujing Shao, Jin Liu, Lei Wang
Abstract: To estimate and statistically infer the direct and indirect effects of exposure and mediator variables while accounting for high-dimensional confounding variables, we propose a partially linear mediation model that incorporates a flexible mechanism for confounders. To obtain asymptotically efficient estimators of the effects of interest under the influence of nuisance functions with high-dimensional confounders, we construct two Neyman-orthogonal score functions to remove regularization bias. Flexible machine learning methods and data splitting with cross-fitting are employed to address overfitting and estimate the unknown nuisance functions efficiently. We rigorously investigate the asymptotic expressions of the proposed estimators for the direct, indirect and total effects and then derive their asymptotic normality properties. In addition, two Wald statistics are constructed to test the direct and indirect effects, respectively, and their limiting distributions are obtained. The satisfactory performance of the proposed estimators is demonstrated by simulation results and a genome-wide analysis of a blood DNA methylation dataset. (Neurocomputing, Vol. 614, Article 128766)
Citations: 0
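The cross-fitting and orthogonalization workflow mentioned above follows the standard double machine learning recipe. The sketch below (a generic partially linear regression example on simulated data with random forests, not the paper's mediation estimator) residualizes the outcome and the exposure on high-dimensional confounders in held-out folds and then regresses residual on residual.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p, theta = 1000, 20, 0.5
X = rng.standard_normal((n, p))                 # high-dimensional confounders
d = X[:, 0] + rng.standard_normal(n)            # exposure depends on confounders
y = theta * d + X[:, 1] + rng.standard_normal(n)

res_y, res_d = np.zeros(n), np.zeros(n)
for train, test in KFold(n_splits=2, shuffle=True, random_state=0).split(X):
    # nuisance functions E[y|X] and E[d|X] fitted on one fold, predicted on the other
    res_y[test] = y[test] - RandomForestRegressor(random_state=0).fit(X[train], y[train]).predict(X[test])
    res_d[test] = d[test] - RandomForestRegressor(random_state=0).fit(X[train], d[train]).predict(X[test])

theta_hat = np.sum(res_d * res_y) / np.sum(res_d ** 2)   # orthogonalized estimate of theta
print(theta_hat)   # close to the true value 0.5
```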
Bipartite containment tracking for nonlinear MASs under FDI attack based on model-free adaptive iterative learning control
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-29 | DOI: 10.1016/j.neucom.2024.128783
Xinning He, Zhongsheng Hou
Abstract: The bipartite containment control problem for a class of heterogeneous multi-agent systems (MASs) under false data injection (FDI) attack is handled in this work using a distributed model-free adaptive iterative learning control scheme with attack compensation. The unknown non-affine nonlinear dynamics of each agent are first transformed into an equivalent attack-related data model along the iteration axis using a compact-form dynamic linearization method. Then, a distributed model-free adaptive iterative learning bipartite containment control (DMFAILBCC) scheme is constructed using I/O data from the MASs, and its convergence is proved by rigorous mathematical analysis. In addition, the control method and the convergence analysis are extended to iteration-switching topologies. Finally, the performance of the two proposed schemes is validated through numerical simulations and comparisons with different control schemes. (Neurocomputing, Vol. 614, Article 128783)
Citations: 0
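To give a feel for the compact-form dynamic linearization idea mentioned above, here is a single-agent model-free adaptive control sketch in the time domain (textbook-style update laws with made-up step sizes and a toy plant; the paper's distributed, iteration-domain bipartite containment scheme under attack is considerably more involved).

```python
import numpy as np

eta, mu, rho, lam = 1.0, 1.0, 0.6, 1.0    # hypothetical step sizes / weighting factors
T = 60
y = np.zeros(T + 1)
u = np.zeros(T + 1)
phi = np.ones(T + 1)                       # pseudo-partial-derivative estimate
y_ref = 1.0                                # constant reference, for illustration only

def plant(yk, uk):
    # unknown nonlinear plant standing in for an agent's dynamics
    return 0.6 * yk + 0.5 * uk + 0.1 * np.sin(yk)

for k in range(1, T):
    du, dy = u[k - 1] - u[k - 2], y[k] - y[k - 1]
    # pseudo-partial-derivative update (compact-form dynamic linearization)
    phi[k] = phi[k - 1] + eta * du / (mu + du ** 2) * (dy - phi[k - 1] * du)
    # control update driven only by measured I/O data
    u[k] = u[k - 1] + rho * phi[k] / (lam + phi[k] ** 2) * (y_ref - y[k])
    y[k + 1] = plant(y[k], u[k])

print(round(y[T], 3))   # output approaches the reference without any plant model
```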
Augmented ELBO regularization for enhanced clustering in variational autoencoders
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-29 | DOI: 10.1016/j.neucom.2024.128795
Kwangtek Na, Ju-Hong Lee, Eunchan Kim
Abstract: With significant advances in deep neural networks, various new algorithms have emerged that effectively model latent structures within data, surpassing traditional clustering methods. In a typical clustering algorithm, each data point is expected to belong to a single cluster. However, clustering based on variational autoencoders (VAEs) represents the expectation over all clusters, denoted c = 1, ..., K, in the KL divergence term. Consequently, the latent embedding z can be learned to span multiple clusters with relatively balanced probabilities rather than being strongly associated with a specific cluster. This study introduces an additional regularizer that encourages the latent embedding z to have a strong affiliation with specific clusters. We introduce optimization methods to maximize the ELBO that includes the newly added regularization term and explore ways to eliminate computationally challenging terms. The positive impact of this regularization on clustering accuracy was verified by examining the variance of the final cluster probabilities, and an improvement in clustering performance was observed when the regularization was introduced. (Neurocomputing, Vol. 614, Article 128795)
Citations: 0
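The paper's exact regularizer is not reproduced here, but the general pattern of adding a term to the ELBO that penalizes diffuse cluster responsibilities, so that each latent embedding commits to one cluster, can be sketched as follows; an entropy penalty is used purely as a stand-in.

```python
import torch

def cluster_commitment_penalty(q_c):
    # q_c: (batch, K) soft cluster responsibilities q(c|x), rows sum to 1.
    # Entropy is maximal for uniform rows and zero for one-hot rows, so
    # penalizing it pushes each embedding toward a single cluster.
    return -(q_c * torch.log(q_c + 1e-8)).sum(dim=1).mean()

q_diffuse = torch.full((4, 5), 0.2)             # balanced over K = 5 clusters
q_sharp = torch.eye(5)[:4] * 0.95 + 0.01        # nearly one-hot rows
print(cluster_commitment_penalty(q_diffuse).item())   # high penalty
print(cluster_commitment_penalty(q_sharp).item())     # low penalty

# In training, such a term would be added (with a weight) to the negative ELBO:
# loss = recon_loss + kl_loss + beta * cluster_commitment_penalty(q_c)
```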
Learning from different perspectives for regret reduction in reinforcement learning: A free energy approach
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-29 | DOI: 10.1016/j.neucom.2024.128797
Milad Ghorbani, Reshad Hosseini, Seyed Pooya Shariatpanahi, Majid Nili Ahmadabadi
Abstract: Reinforcement learning (RL) is the core method for interactive learning in living and artificial creatures. Nevertheless, in contrast to humans and animals, artificial RL agents are very slow at learning and suffer from the curse of dimensionality. This is partially due to using RL in isolation, i.e., a lack of social learning and social diversity. We introduce a free energy-based social RL approach for learning novel tasks. A society is formed by the learning agent and several diverse virtual ones. That diversity lies in their perception, while all agents use the same interaction samples for learning and share the same action set. Individual differences in perception are mostly a cause of perceptual aliasing; however, they can also result in virtual agents learning faster in early trials. Our free energy method provides a knowledge integration mechanism for the main agent to benefit from that diversity and reduce its regret. It rests upon a Thompson sampling policy and the behavioral policies of the main and virtual agents. It is therefore applicable to a variety of tasks with discrete or continuous state spaces, to model-free and model-based settings, and to different reinforcement learning methods. Through a set of experiments, we show that this general framework substantially improves learning speed and is clearly superior to previously existing methods. We also provide a convergence proof. (Neurocomputing, Vol. 614, Article 128797)
Citations: 0
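Purely to illustrate the Thompson sampling policy that the knowledge integration method above rests on (a standard Bernoulli bandit example, unrelated to the paper's social free-energy machinery), consider the following sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
true_p = np.array([0.2, 0.5, 0.7])          # unknown reward probabilities of 3 actions
alpha = np.ones(3)                          # Beta(1,1) prior over each action's reward rate
beta = np.ones(3)

for _ in range(2000):
    samples = rng.beta(alpha, beta)         # one posterior sample per action
    a = int(np.argmax(samples))             # act greedily w.r.t. the sampled values
    reward = rng.random() < true_p[a]
    alpha[a] += reward                      # conjugate posterior update
    beta[a] += 1 - reward

print(np.argmax(alpha / (alpha + beta)))    # best action identified: 2
```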
Diffusion model conditioning on Gaussian mixture model and negative Gaussian mixture gradient
IF 5.5 | Computer Science (CAS Zone 2)
Neurocomputing | Pub Date: 2024-10-28 | DOI: 10.1016/j.neucom.2024.128764
Weiguo Lu, Xuan Wu, Deng Ding, Jinqiao Duan, Jirong Zhuang, Gangnan Yuan
Abstract: Diffusion models (DMs) are a type of generative model that has had a significant impact on image synthesis and beyond. They can incorporate a wide variety of conditioning inputs, such as text or bounding boxes, to guide generation. In this work, we introduce a novel conditioning mechanism that applies Gaussian mixture models (GMMs) for feature conditioning, which helps steer the denoising process in DMs. Drawing on set theory, our comprehensive theoretical analysis reveals that the conditional latent distribution based on features differs markedly from that based on classes; consequently, feature-based conditioning tends to generate fewer defects than class-based conditioning. Experiments are designed and carried out, and the results support our theoretical findings as well as the effectiveness of the proposed feature-conditioning mechanism. Additionally, we propose a new gradient function named the Negative Gaussian Mixture Gradient (NGMG) and incorporate it into the training of diffusion models alongside an auxiliary classifier. We theoretically demonstrate that NGMG offers advantages comparable to the Wasserstein distance, serving as a more effective cost function when learning distributions supported on low-dimensional manifolds, especially in contrast to many likelihood-based cost functions such as KL divergences. (Neurocomputing, Vol. 614, Article 128764)
Citations: 0
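For context on the Gaussian-mixture gradients involved above, the sketch below evaluates the gradient of a GMM's log-density with automatic differentiation (a 1-D mixture with made-up parameters; the paper's Negative Gaussian Mixture Gradient is a specific construction not reproduced here).

```python
import torch

# A 1-D Gaussian mixture with fixed, made-up parameters.
weights = torch.tensor([0.3, 0.7])
means = torch.tensor([-2.0, 1.5])
stds = torch.tensor([0.5, 1.0])

def gmm_log_prob(x):
    # log sum_k w_k N(x; mu_k, sigma_k), evaluated stably with logsumexp
    comp = torch.distributions.Normal(means, stds).log_prob(x.unsqueeze(-1))
    return torch.logsumexp(comp + torch.log(weights), dim=-1)

x = torch.tensor([0.0, 2.0], requires_grad=True)
logp = gmm_log_prob(x).sum()
logp.backward()
print(x.grad)   # gradient of the mixture log-density at each evaluation point
```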