{"title":"伊辛临界附近的自回归模型路径依赖性","authors":"Yi Hong Teoh, Roger G. Melko","doi":"arxiv-2408.15715","DOIUrl":null,"url":null,"abstract":"Autoregressive models are a class of generative model that probabilistically\npredict the next output of a sequence based on previous inputs. The\nautoregressive sequence is by definition one-dimensional (1D), which is natural\nfor language tasks and hence an important component of modern architectures\nlike recurrent neural networks (RNNs) and transformers. However, when language\nmodels are used to predict outputs on physical systems that are not\nintrinsically 1D, the question arises of which choice of autoregressive\nsequence -- if any -- is optimal. In this paper, we study the reconstruction of\ncritical correlations in the two-dimensional (2D) Ising model, using RNNs and\ntransformers trained on binary spin data obtained near the thermal phase\ntransition. We compare the training performance for a number of different 1D\nautoregressive sequences imposed on finite-size 2D lattices. We find that paths\nwith long 1D segments are more efficient at training the autoregressive models\ncompared to space-filling curves that better preserve the 2D locality. Our\nresults illustrate the potential importance in choosing the optimal\nautoregressive sequence ordering when training modern language models for tasks\nin physics.","PeriodicalId":501066,"journal":{"name":"arXiv - PHYS - Disordered Systems and Neural Networks","volume":"9 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Autoregressive model path dependence near Ising criticality\",\"authors\":\"Yi Hong Teoh, Roger G. Melko\",\"doi\":\"arxiv-2408.15715\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Autoregressive models are a class of generative model that probabilistically\\npredict the next output of a sequence based on previous inputs. The\\nautoregressive sequence is by definition one-dimensional (1D), which is natural\\nfor language tasks and hence an important component of modern architectures\\nlike recurrent neural networks (RNNs) and transformers. However, when language\\nmodels are used to predict outputs on physical systems that are not\\nintrinsically 1D, the question arises of which choice of autoregressive\\nsequence -- if any -- is optimal. In this paper, we study the reconstruction of\\ncritical correlations in the two-dimensional (2D) Ising model, using RNNs and\\ntransformers trained on binary spin data obtained near the thermal phase\\ntransition. We compare the training performance for a number of different 1D\\nautoregressive sequences imposed on finite-size 2D lattices. We find that paths\\nwith long 1D segments are more efficient at training the autoregressive models\\ncompared to space-filling curves that better preserve the 2D locality. 
Our\\nresults illustrate the potential importance in choosing the optimal\\nautoregressive sequence ordering when training modern language models for tasks\\nin physics.\",\"PeriodicalId\":501066,\"journal\":{\"name\":\"arXiv - PHYS - Disordered Systems and Neural Networks\",\"volume\":\"9 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - PHYS - Disordered Systems and Neural Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.15715\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - PHYS - Disordered Systems and Neural Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.15715","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Autoregressive model path dependence near Ising criticality
Autoregressive models are a class of generative models that probabilistically predict the next output of a sequence based on previous inputs. The autoregressive sequence is by definition one-dimensional (1D), which is natural for language tasks and hence an important component of modern architectures like recurrent neural networks (RNNs) and transformers. However, when language models are used to predict outputs for physical systems that are not intrinsically 1D, the question arises of which choice of autoregressive sequence -- if any -- is optimal. In this paper, we study the reconstruction of critical correlations in the two-dimensional (2D) Ising model, using RNNs and transformers trained on binary spin data obtained near the thermal phase transition. We compare the training performance of a number of different 1D autoregressive sequences imposed on finite-size 2D lattices. We find that paths with long 1D segments train the autoregressive models more efficiently than space-filling curves that better preserve 2D locality. Our results illustrate the potential importance of choosing the optimal autoregressive sequence ordering when training modern language models for tasks in physics.
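
For context, the role of the ordering can be made explicit through the standard autoregressive factorization (the usual textbook definition, not a formula quoted from the paper): given a path \pi that visits every site of an L x L lattice exactly once, the joint distribution over spins \sigma is modelled as

    p(\sigma) = \prod_{k=1}^{N} p\left(\sigma_{\pi(k)} \,\middle|\, \sigma_{\pi(1)}, \dots, \sigma_{\pi(k-1)}\right), \qquad N = L^2,

so different choices of \pi impose different conditional structures on the same physical configuration, which is what makes the ordering a meaningful hyperparameter.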
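
A minimal Python sketch (illustrative only; the function names are hypothetical and the paper's actual orderings and implementation may differ) of how a 2D spin configuration can be flattened into a 1D sequence along different paths, e.g. a raster or snake path with long 1D segments versus a Hilbert space-filling curve that better preserves 2D locality:

    import numpy as np

    def raster_path(L):
        # Row-major path: long 1D segments, one full row after another.
        return [(i, j) for i in range(L) for j in range(L)]

    def snake_path(L):
        # Boustrophedon path: alternate row direction so consecutive
        # sites along the sequence are always nearest neighbours.
        return [(i, j if i % 2 == 0 else L - 1 - j)
                for i in range(L) for j in range(L)]

    def hilbert_path(L):
        # Hilbert space-filling curve (L must be a power of 2), which
        # preserves 2D locality better than raster or snake paths.
        def xy2d(n, x, y):
            d, s = 0, n // 2
            while s > 0:
                rx = 1 if (x & s) else 0
                ry = 1 if (y & s) else 0
                d += s * s * ((3 * rx) ^ ry)
                if ry == 0:            # rotate/reflect quadrant
                    if rx == 1:
                        x, y = n - 1 - x, n - 1 - y
                    x, y = y, x
                s //= 2
            return d
        sites = [(i, j) for i in range(L) for j in range(L)]
        return sorted(sites, key=lambda p: xy2d(L, p[0], p[1]))

    def to_sequence(spins, path):
        # 1D sequence of spins, in the order the autoregressive model sees them.
        return np.array([spins[i, j] for (i, j) in path])

    L = 8
    spins = np.random.choice([-1, 1], size=(L, L))  # stand-in for sampled Ising configurations
    seq_snake = to_sequence(spins, snake_path(L))
    seq_hilbert = to_sequence(spins, hilbert_path(L))

The abstract's finding corresponds, in this sketch, to sequences like the raster or snake path being more efficient to train on than the Hilbert ordering, despite the latter's better preservation of 2D locality.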