{"title":"A method for quantifying the generalization capabilities of generative models for solving Ising models","authors":"Qunlong Ma, Zhi Ma, Ming Gao","doi":"arxiv-2405.03435","DOIUrl":null,"url":null,"abstract":"For Ising models with complex energy landscapes, whether the ground state can\nbe found by neural networks depends heavily on the Hamming distance between the\ntraining datasets and the ground state. Despite the fact that various recently\nproposed generative models have shown good performance in solving Ising models,\nthere is no adequate discussion on how to quantify their generalization\ncapabilities. Here we design a Hamming distance regularizer in the framework of\na class of generative models, variational autoregressive networks (VAN), to\nquantify the generalization capabilities of various network architectures\ncombined with VAN. The regularizer can control the size of the overlaps between\nthe ground state and the training datasets generated by networks, which,\ntogether with the success rates of finding the ground state, form a\nquantitative metric to quantify their generalization capabilities. We conduct\nnumerical experiments on several prototypical network architectures combined\nwith VAN, including feed-forward neural networks, recurrent neural networks,\nand graph neural networks, to quantify their generalization capabilities when\nsolving Ising models. Moreover, considering the fact that the quantification of\nthe generalization capabilities of networks on small-scale problems can be used\nto predict their relative performance on large-scale problems, our method is of\ngreat significance for assisting in the Neural Architecture Search field of\nsearching for the optimal network architectures when solving large-scale Ising\nmodels.","PeriodicalId":501066,"journal":{"name":"arXiv - PHYS - Disordered Systems and Neural Networks","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - PHYS - Disordered Systems and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.03435","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
For Ising models with complex energy landscapes, whether a neural network can find the ground state depends heavily on the Hamming distance between the training datasets and the ground state. Although various recently proposed generative models have shown good performance in solving Ising models, how to quantify their generalization capabilities has not been adequately discussed. Here we design a Hamming distance regularizer within the framework of a class of generative models, variational autoregressive networks (VAN), to quantify the generalization capabilities of various network architectures combined with VAN. The regularizer controls the size of the overlap between the ground state and the training datasets generated by the networks; this overlap, together with the success rate of finding the ground state, forms a quantitative metric of generalization capability. We conduct numerical experiments on several prototypical network architectures combined with VAN, including feed-forward neural networks, recurrent neural networks, and graph neural networks, to quantify their generalization capabilities when solving Ising models. Moreover, because the generalization capabilities measured on small-scale problems can be used to predict relative performance on large-scale problems, our method can assist the Neural Architecture Search field in identifying optimal network architectures for solving large-scale Ising models.
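
The abstract specifies only that the regularizer constrains the Hamming distance (overlap) between network-generated samples and the ground state, not its functional form. The sketch below is a minimal illustration, assuming a quadratic penalty that pushes the mean Hamming distance toward a chosen target, added to the standard VAN REINFORCE objective; the names `hamming_distance`, `regularized_loss`, `alpha`, and `d_target` are hypothetical and the paper's actual regularizer may differ.

```python
import torch

def hamming_distance(samples, reference):
    # samples: (B, N) spin configurations in {-1, +1}
    # reference: (N,) target configuration (e.g., a known ground state)
    return (samples != reference).float().sum(dim=1)

def regularized_loss(samples, log_q, energies, reference, beta, alpha, d_target):
    # Standard VAN objective: variational free energy F_q = E_q[E(s) + (1/beta) log q(s)],
    # minimized with a score-function (REINFORCE) gradient estimator.
    with torch.no_grad():
        free_energy = energies + log_q / beta
        # Hypothetical Hamming regularizer (assumption): a quadratic penalty
        # steering the mean Hamming distance to the reference toward d_target,
        # which controls the overlap between generated samples and the ground state.
        penalty = alpha * (hamming_distance(samples, reference) - d_target) ** 2
        reward = free_energy + penalty
        baseline = reward.mean()  # variance-reduction baseline
    # Gradient flows only through log_q; reward is treated as a constant.
    return ((reward - baseline) * log_q).mean()
```

Under this reading, sweeping `d_target` varies the overlap between the generated training data and the ground state, while the success rate of recovering the ground state at each setting supplies the second axis of the proposed quantitative metric.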