Dissociating model architectures from inference computations
Noor Sajid, Johan Medrano
Cognitive Neuroscience, pp. 1-3. Published online: 2025-07-17.
DOI: 10.1080/17588928.2025.2532604 (https://doi.org/10.1080/17588928.2025.2532604)
Impact factor: 2.0 · JCR: Q3 (Neurosciences) · CAS: Region 4 (Medicine)
Citations: 0
Abstract
Parr et al. (2025) examine how autoregressive and deep temporal models differ in their treatment of non-Markovian sequence modelling. Building on this, we highlight the need to dissociate model architectures, i.e., how the predictive distribution factorises, from the computations invoked at inference. We demonstrate that autoregressive models can mimic deep temporal computations by structuring context access during iterative inference. Using a transformer trained on next-token prediction, we show that inducing a hierarchical temporal factorisation during iterative inference maintains predictive capacity while instantiating fewer computations. This emphasises that the processes for constructing and refining predictions are not necessarily bound to their underlying model architectures.
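The abstract's central idea, that an autoregressive model can approximate a deep temporal (hierarchical) factorisation simply by restricting how it accesses its context at inference, can be caricatured with a toy sketch. All names here (`full_context`, `hierarchical_context`, the averaging summariser, and the window/chunk sizes) are illustrative assumptions, not the authors' actual method: the point is only that compressing the distant past into coarse summaries reduces the number of elements touched per prediction step.

```python
# Toy illustration (not the paper's implementation): contrast full
# autoregressive context access with a hierarchical scheme in which the
# distant past is compressed into coarse chunk summaries while recent
# tokens are kept verbatim, mimicking a slow/fast temporal factorisation.

def full_context(seq, t):
    """Full autoregressive access: predicting position t reads all t past tokens."""
    return list(seq[:t])

def hierarchical_context(seq, t, window=4, chunk=4):
    """Keep the last `window` tokens verbatim; replace the distant past with
    one average per `chunk` tokens, so each step touches fewer elements."""
    recent = list(seq[max(0, t - window):t])
    distant = seq[:max(0, t - window)]
    summaries = [
        sum(distant[i:i + chunk]) / len(distant[i:i + chunk])
        for i in range(0, len(distant), chunk)
    ]
    return summaries + recent

seq = list(range(16))
print(len(full_context(seq, 16)))          # 16 elements read at step 16
print(len(hierarchical_context(seq, 16)))  # 7: 3 chunk summaries + 4 recent tokens
```

Both schemes condition on the whole history, but the hierarchical variant instantiates fewer computations per step, which is the dissociation the abstract emphasises: the factorisation is imposed at inference time, not baked into the model architecture.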
About the journal:
Cognitive Neuroscience publishes high-quality discussion papers and empirical papers on any topic in the field of cognitive neuroscience, including perception, attention, memory, language, action, social cognition, and executive function. The journal covers findings based on a variety of techniques such as fMRI, ERPs, MEG, TMS, and focal lesion studies. Contributions that employ or discuss multiple techniques to shed light on the spatio-temporal brain mechanisms underlying a cognitive process are encouraged.