st-DenseViT: A Weakly Supervised Spatiotemporal Vision Transformer for Dense Prediction of Dynamic Brain Networks

IF 3.3 | Zone 2, Medicine | Q1 NEUROIMAGING

Human Brain Mapping, Volume 46, Issue 14. Pub Date: 2025-09-27. DOI: 10.1002/hbm.70364

Behnam Kazemivash, Pranav Suresh, Dong Hye Ye, Armin Iraji, Jingyu Liu, Sergey Plis, Peter Kochunov, David C. Zhu, Vince D. Calhoun


Citations: 0

Abstract

Modeling dynamic neuronal activity within brain networks enables the precise tracking of rapid temporal fluctuations across different brain regions. However, current approaches in computational neuroscience fall short of capturing and representing the spatiotemporal dynamics within each brain network. We developed a novel weakly supervised spatiotemporal dense prediction model that generates personalized 4D dynamic brain networks from fMRI data, providing a more granular representation of brain activity over time. The model uses a vision transformer (ViT) backbone and jointly encodes spatial and temporal information from fMRI inputs in one of two configurations: a space–time encoder or a sequential encoder. It generates 4D brain network maps that evolve over time, capturing dynamic changes in both spatial and temporal dimensions. In the absence of ground-truth data, we used spatially constrained windowed independent component analysis (ICA) components derived from the fMRI data as weak supervision to guide training.

The model was evaluated on large-scale resting-state fMRI datasets, and statistical analyses using various metrics were conducted to assess the generated dynamic maps. The model produced 4D brain maps that captured both inter-subject and temporal variations, offering a dynamic representation of evolving brain networks. Notably, it produced smooth maps from noisy priors, effectively denoising the resulting brain dynamics. Additionally, statistically significant differences between patients with schizophrenia and healthy controls were observed in the temporally averaged brain maps, as well as in the summation of absolute temporal gradient maps. For example, within the Default Mode Network (DMN), significant differences emerged in the temporally averaged maps from the space–time configuration, particularly in the thalamus, where healthy controls exhibited higher activity levels than subjects with schizophrenia. These findings highlight the model's potential for differentiating between clinical populations.

The proposed spatiotemporal dense prediction model offers an effective approach for generating dynamic brain maps by capturing significant spatiotemporal variations in brain activity. Weak supervision through ICA components lets the model learn dynamic patterns without direct ground-truth data, making it a robust and efficient tool for brain mapping.

Significance: This work presents an important new approach for dynamic brain mapping, potentially opening up new opportunities for studying brain dynamics within specific networks. By framing the problem as a spatiotemporal dense prediction task in computer vision, we leverage the spatiotemporal ViT architecture combined with weakly supervised learning techniques to estimate these maps efficiently and effectively.
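The two per-voxel summaries named in the abstract, the temporally averaged map and the summation of absolute temporal gradients, can be illustrated with a minimal sketch. The abstract does not say how the temporal gradient is computed, so the forward difference between consecutive time points used here is an assumption, as are the function names and the flattened-voxel layout (one list of voxel values per time point). Plain Python lists keep the sketch dependency-free; it is not the authors' implementation.

```python
from typing import List, Sequence

# One time point of a network map, with voxels flattened to 1D (assumed layout).
Frame = Sequence[float]


def temporal_mean(frames: Sequence[Frame]) -> List[float]:
    """Average each voxel's value over all time points."""
    n_timepoints = len(frames)
    # zip(*frames) regroups the data into one time series per voxel.
    return [sum(series) / n_timepoints for series in zip(*frames)]


def sum_abs_temporal_gradient(frames: Sequence[Frame]) -> List[float]:
    """Sum |x[t+1] - x[t]| over time for each voxel.

    Assumption: the temporal gradient is a forward difference between
    consecutive time points; the paper may define it differently.
    """
    return [
        sum(abs(later - earlier) for earlier, later in zip(series, series[1:]))
        for series in zip(*frames)
    ]


# Toy dynamic map: 3 time points, 2 voxels.
frames = [
    [0.0, 1.0],
    [2.0, 1.0],
    [1.0, 1.0],
]
mean_map = temporal_mean(frames)
gradient_map = sum_abs_temporal_gradient(frames)
print(mean_map, gradient_map)
```

In this toy example both voxels have the same temporal mean, but only the first one fluctuates, so only the gradient summary separates them. That is one reason the two maps can carry complementary information in a group comparison.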




Source journal: Human Brain Mapping (Medicine: Nuclear Medicine)
CiteScore: 8.30
Self-citation rate: 6.20%
Articles published: 401
Review time: 3-6 weeks

Journal description: Human Brain Mapping publishes peer-reviewed basic, clinical, technical, and theoretical research in the interdisciplinary and rapidly expanding field of human brain mapping. The journal features research derived from non-invasive brain imaging modalities used to explore the spatial and temporal organization of the neural systems supporting human behavior. Imaging modalities of interest include positron emission tomography, event-related potentials, electro- and magnetoencephalography, magnetic resonance imaging, and single-photon emission tomography. Brain mapping research in both normal and clinical populations is encouraged. Article formats include Research Articles, Review Articles, Clinical Case Studies, and Technique, as well as Technological Developments, Theoretical Articles, and Synthetic Reviews. Technical advances, such as novel brain imaging methods, analyses for detecting or localizing neural activity, synergistic uses of multiple imaging modalities, and strategies for the design of behavioral paradigms and neural-systems modeling are of particular interest. The journal endorses the propagation of methodological standards and encourages database development in the field of human brain mapping.