A spatiotemporal style transfer algorithm for dynamic visual stimulus generation

Impact factor: 12.0 · JCR Q1, Computer Science, Interdisciplinary Applications
Antonino Greco, Markus Siegel
{"title":"A spatiotemporal style transfer algorithm for dynamic visual stimulus generation","authors":"Antonino Greco, Markus Siegel","doi":"10.1038/s43588-024-00746-w","DOIUrl":null,"url":null,"abstract":"Understanding how visual information is encoded in biological and artificial systems often requires the generation of appropriate stimuli to test specific hypotheses, but available methods for video generation are scarce. Here we introduce the spatiotemporal style transfer (STST) algorithm, a dynamic visual stimulus generation framework that allows the manipulation and synthesis of video stimuli for vision research. We show how stimuli can be generated that match the low-level spatiotemporal features of their natural counterparts, but lack their high-level semantic features, providing a useful tool to study object recognition. We used these stimuli to probe PredNet, a predictive coding deep network, and found that its next-frame predictions were not disrupted by the omission of high-level information, with human observers also confirming the preservation of low-level features and lack of high-level information in the generated stimuli. We also introduce a procedure for the independent spatiotemporal factorization of dynamic stimuli. Testing such factorized stimuli on humans and deep vision models suggests a spatial bias in how humans and deep vision models encode dynamic visual information. These results showcase potential applications of the STST algorithm as a versatile tool for dynamic stimulus generation in vision science. The spatiotemporal style transfer (STST) algorithm enables video generation by selectively manipulating the spatial and temporal features of natural videos, fostering vision science research in both biological and artificial systems.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"155-169"},"PeriodicalIF":12.0000,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s43588-024-00746-w.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nature computational science","FirstCategoryId":"1085","ListUrlMain":"https://www.nature.com/articles/s43588-024-00746-w","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

Abstract

Understanding how visual information is encoded in biological and artificial systems often requires the generation of appropriate stimuli to test specific hypotheses, but available methods for video generation are scarce. Here we introduce the spatiotemporal style transfer (STST) algorithm, a dynamic visual stimulus generation framework that allows the manipulation and synthesis of video stimuli for vision research. We show how stimuli can be generated that match the low-level spatiotemporal features of their natural counterparts, but lack their high-level semantic features, providing a useful tool to study object recognition. We used these stimuli to probe PredNet, a predictive coding deep network, and found that its next-frame predictions were not disrupted by the omission of high-level information, with human observers also confirming the preservation of low-level features and lack of high-level information in the generated stimuli. We also introduce a procedure for the independent spatiotemporal factorization of dynamic stimuli. Testing such factorized stimuli on humans and deep vision models suggests a spatial bias in how humans and deep vision models encode dynamic visual information. These results showcase potential applications of the STST algorithm as a versatile tool for dynamic stimulus generation in vision science.

The spatiotemporal style transfer (STST) algorithm enables video generation by selectively manipulating the spatial and temporal features of natural videos, fostering vision science research in both biological and artificial systems.
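Although the paper's full method is more involved, the core idea of STST, matching a video's low-level spatiotemporal feature statistics while discarding its high-level semantic content, can be sketched as a style-transfer-style optimization. The sketch below is a minimal, hypothetical illustration assuming a Gatys-style Gram-matrix loss over early VGG-19 features for the spatial term and a frame-difference variant of the same loss as a crude temporal proxy; the layer choice, loss terms, and all names here are assumptions for illustration, not the authors' published implementation.

```python
# Minimal sketch of a spatiotemporal style-matching objective, inspired by
# Gatys-style image style transfer. All layer choices, loss terms and names
# are illustrative assumptions, not the authors' published STST code.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

device = "cuda" if torch.cuda.is_available() else "cpu"

# Early VGG-19 layers as a stand-in for "low-level" spatial features
# (ImageNet input normalization is omitted for brevity).
features = vgg19(weights=VGG19_Weights.DEFAULT).features[:9].to(device).eval()
for p in features.parameters():
    p.requires_grad_(False)

def gram(x):
    # Gram matrix of feature maps: (T, C, H, W) -> (T, C, C).
    t, c, h, w = x.shape
    f = x.reshape(t, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def spatial_loss(video, target):
    # Match per-frame low-level feature statistics (spatial style).
    return F.mse_loss(gram(features(video)), gram(features(target)))

def temporal_loss(video, target):
    # Match the same statistics on frame differences, a crude motion proxy.
    dv, dt = video[1:] - video[:-1], target[1:] - target[:-1]
    return F.mse_loss(gram(features(dv)), gram(features(dt)))

# Toy usage: optimize 8 frames of noise toward a natural video's statistics.
natural = torch.rand(8, 3, 128, 128, device=device)  # placeholder target video
synth = torch.rand(8, 3, 128, 128, device=device, requires_grad=True)
opt = torch.optim.Adam([synth], lr=0.02)

for step in range(200):
    opt.zero_grad()
    loss = spatial_loss(synth, natural) + temporal_loss(synth, natural)
    loss.backward()
    opt.step()
    with torch.no_grad():
        synth.clamp_(0.0, 1.0)  # keep the synthesized frames in image range
```

Because the spatial and temporal terms are separate, optimizing against only one of them would preserve spatial or temporal statistics independently, which loosely mirrors the spirit of the factorized stimuli described in the abstract.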
