{"title":"A method for quantifying the generalization capabilities of generative models for solving Ising models","authors":"Qunlong Ma, Zhi Ma, Ming Gao","doi":"10.1088/2632-2153/ad3710","DOIUrl":null,"url":null,"abstract":"\n For Ising models with complex energy landscapes, whether the ground state can be found by neural networks depends heavily on the Hamming distance between the training datasets and the ground state. Despite the fact that various recently proposed generative models have shown good performance in solving Ising models, there is no adequate discussion on how to quantify their generalization capabilities. Here we design a Hamming distance regularizer in the framework of a class of generative models, variational autoregressive networks (VAN), to quantify the generalization capabilities of various network architectures combined with VAN. The regularizer can control the size of the overlaps between the ground state and the training datasets generated by networks, which, together with the success rates of finding the ground state, form a quantitative metric to quantify their generalization capabilities. We conduct numerical experiments on several prototypical network architectures combined with VAN, including feed-forward neural networks, recurrent neural networks, and graph neural networks, to quantify their generalization capabilities when solving Ising models. Moreover, considering the fact that the quantification of the generalization capabilities of networks on small-scale problems can be used to predict their relative performance on large-scale problems, our method is of great significance for assisting in the Neural Architecture Search field of searching for the optimal network architectures when solving large-scale Ising models.","PeriodicalId":503691,"journal":{"name":"Machine Learning: Science and Technology","volume":" 4","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning: Science and Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2632-2153/ad3710","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
For Ising models with complex energy landscapes, whether a neural network can find the ground state depends heavily on the Hamming distance between the training datasets and the ground state. Although various recently proposed generative models have performed well in solving Ising models, how to quantify their generalization capabilities has not been adequately discussed. Here we design a Hamming distance regularizer within the framework of a class of generative models, variational autoregressive networks (VAN), to quantify the generalization capabilities of various network architectures combined with VAN. The regularizer controls the overlap between the ground state and the training datasets generated by the networks; this overlap, together with the success rate of finding the ground state, forms a quantitative metric of generalization capability. We conduct numerical experiments on several prototypical network architectures combined with VAN, including feed-forward neural networks, recurrent neural networks, and graph neural networks, to quantify their generalization capabilities when solving Ising models. Moreover, since quantifying the generalization capabilities of networks on small-scale problems can be used to predict their relative performance on large-scale problems, our method is of great significance to the Neural Architecture Search field in searching for optimal network architectures for solving large-scale Ising models.
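The abstract describes the method only at a high level. As a hedged illustration, the Hamming-distance regularizer can be read as an extra penalty term added to the VAN's REINFORCE-style variational free-energy objective, with a coefficient that controls how strongly sampled configurations are pushed toward or away from the known ground state. The Python/PyTorch sketch below is a reconstruction under that assumption, not the authors' implementation; hamming_distance, regularized_van_loss, energy_fn, and alpha are hypothetical names.

import torch

def hamming_distance(samples, ground_state):
    # Per-sample Hamming distance between +/-1 spin configurations:
    # count the sites where a sample disagrees with the ground state.
    return (samples != ground_state).float().sum(dim=-1)

def regularized_van_loss(samples, log_q, energy_fn, ground_state, beta, alpha):
    # samples: (batch, n) spin configurations drawn autoregressively from the VAN
    # log_q:   (batch,) log-probabilities of those samples (requires grad)
    # The standard VAN objective is the variational free energy
    # F = E_q[E(s) + log q(s) / beta]; here a Hamming-distance term with
    # coefficient alpha is added so that the overlap with the ground state
    # can be controlled (alpha < 0 pulls samples toward it, alpha > 0 away).
    with torch.no_grad():
        f_loc = (energy_fn(samples)
                 + log_q / beta
                 + alpha * hamming_distance(samples, ground_state))
        baseline = f_loc.mean()  # variance-reduction baseline
    # Score-function (REINFORCE) estimator of the gradient of the
    # regularized free energy; gradients flow only through log_q.
    return ((f_loc - baseline) * log_q).mean()

A training step would sample a batch from the network, evaluate this loss, and backpropagate; sweeping alpha then traces out training datasets at different Hamming distances from the ground state, which, together with the resulting success rate of finding the ground state, yields the quantitative metric described above.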