{"title":"EBM-WGF:用Wasserstein梯度流训练基于能量的模型","authors":"Ben Wan , Cong Geng , Tianyi Zheng , Jia Wang","doi":"10.1016/j.neunet.2025.107300","DOIUrl":null,"url":null,"abstract":"<div><div>Energy-based models (EBMs) show their efficiency in density estimation. However, MCMC sampling in traditional EBMs suffers from expensive computation. Although EBMs with minimax game avoid the above drawback, the energy estimation and generator’s optimization are not always stable. We find that the reason for this instability arises from the inaccuracy of minimizing KL divergence between generative and energy distribution along a vanilla gradient flow. In this paper, we leverage the Wasserstein gradient flow (WGF) of the KL divergence to correct the optimization direction of the generator in the minimax game. Different from existing WGF-based models, we pullback the WGF to parameter space and solve it with a variational scheme for bounded solution error. We propose a new EBM with WGF that overcomes the instability of the minimax game and avoids computational MCMC sampling in traditional methods, as we observe that the solution of WGF in our approach is equivalent to Langevin dynamic in EBMs with MCMC sampling. The empirical experiments on toy and natural datasets validate the effectiveness of our approach.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"187 ","pages":"Article 107300"},"PeriodicalIF":6.3000,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"EBM-WGF: Training energy-based models with Wasserstein gradient flow\",\"authors\":\"Ben Wan , Cong Geng , Tianyi Zheng , Jia Wang\",\"doi\":\"10.1016/j.neunet.2025.107300\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Energy-based models (EBMs) show their efficiency in density estimation. However, MCMC sampling in traditional EBMs suffers from expensive computation. Although EBMs with minimax game avoid the above drawback, the energy estimation and generator’s optimization are not always stable. We find that the reason for this instability arises from the inaccuracy of minimizing KL divergence between generative and energy distribution along a vanilla gradient flow. In this paper, we leverage the Wasserstein gradient flow (WGF) of the KL divergence to correct the optimization direction of the generator in the minimax game. Different from existing WGF-based models, we pullback the WGF to parameter space and solve it with a variational scheme for bounded solution error. We propose a new EBM with WGF that overcomes the instability of the minimax game and avoids computational MCMC sampling in traditional methods, as we observe that the solution of WGF in our approach is equivalent to Langevin dynamic in EBMs with MCMC sampling. 
The empirical experiments on toy and natural datasets validate the effectiveness of our approach.</div></div>\",\"PeriodicalId\":49763,\"journal\":{\"name\":\"Neural Networks\",\"volume\":\"187 \",\"pages\":\"Article 107300\"},\"PeriodicalIF\":6.3000,\"publicationDate\":\"2025-03-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neural Networks\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0893608025001790\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0893608025001790","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
EBM-WGF: Training energy-based models with Wasserstein gradient flow
Energy-based models (EBMs) are effective for density estimation. However, MCMC sampling in traditional EBMs is computationally expensive. Although EBMs trained through a minimax game avoid this drawback, the energy estimation and the generator's optimization are not always stable. We find that this instability arises from the inaccuracy of minimizing the KL divergence between the generative and energy distributions along a vanilla gradient flow. In this paper, we leverage the Wasserstein gradient flow (WGF) of the KL divergence to correct the optimization direction of the generator in the minimax game. Unlike existing WGF-based models, we pull the WGF back to parameter space and solve it with a variational scheme that bounds the solution error. We propose a new EBM with WGF that overcomes the instability of the minimax game and avoids the expensive MCMC sampling of traditional methods, since we observe that the solution of the WGF in our approach is equivalent to the Langevin dynamics used in EBMs with MCMC sampling. Empirical experiments on toy and natural datasets validate the effectiveness of our approach.
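For context on the equivalence invoked above (a standard result in the WGF literature, not a claim specific to this paper): the Wasserstein gradient flow of the KL divergence KL(ρ‖p) is the Fokker–Planck equation

\[ \partial_t \rho_t = \nabla \cdot \left( \rho_t \, \nabla \log \frac{\rho_t}{p} \right), \]

whose particle-level realization, for an energy model p(x) ∝ exp(-E(x)), is the Langevin SDE

\[ \mathrm{d}x_t = -\nabla E(x_t)\, \mathrm{d}t + \sqrt{2}\, \mathrm{d}W_t , \]

i.e., exactly the dynamics that MCMC-based EBMs simulate to draw samples. Below is a minimal sketch of the discretized sampler, assuming a generic PyTorch energy network; `energy` and `langevin_sample` are hypothetical illustrations, not the authors' implementation:

    import torch

    def langevin_sample(energy, x, n_steps=100, step_size=1e-2):
        # Euler–Maruyama discretization of dx = -∇E(x) dt + √2 dW:
        #   x <- x - η ∇E(x) + √(2η) ε,  with ε ~ N(0, I)
        for _ in range(n_steps):
            x = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(energy(x).sum(), x)[0]  # ∇_x E(x)
            x = x - step_size * grad + (2 * step_size) ** 0.5 * torch.randn_like(x)
        return x.detach()

The paper's approach sidesteps this per-sample simulation by pulling the same flow back to the generator's parameter space.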
Journal introduction:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically inspired artificial intelligence.