Behnam Kazemivash, Pranav Suresh, Dong Hye Ye, Armin Iraji, Jingyu Liu, Sergey Plis, Peter Kochunov, David C. Zhu, Vince D. Calhoun
Human Brain Mapping, volume 46, issue 14. Published 2025-09-27. DOI: 10.1002/hbm.70364
https://onlinelibrary.wiley.com/doi/10.1002/hbm.70364
st-DenseViT: A Weakly Supervised Spatiotemporal Vision Transformer for Dense Prediction of Dynamic Brain Networks
Modeling dynamic neuronal activity within brain networks enables precise tracking of rapid temporal fluctuations across brain regions. However, current approaches in computational neuroscience fall short of capturing and representing the spatiotemporal dynamics within each brain network. We developed a weakly supervised spatiotemporal dense prediction model that generates personalized 4D dynamic brain networks from fMRI data, providing a more granular representation of brain activity over time.
The model uses the vision transformer (ViT) as its backbone and jointly encodes spatial and temporal information from fMRI inputs in two configurations: space–time and sequential encoders. It generates 4D brain network maps that evolve over time, capturing dynamic changes in both spatial and temporal dimensions. In the absence of ground-truth data, we used spatially constrained windowed independent component analysis (ICA) components derived from fMRI data as weak supervision to guide training. The model was evaluated on large-scale resting-state fMRI datasets, and statistical analyses were conducted to assess the generated dynamic maps using various metrics.
The model produced 4D brain maps that captured both inter-subject and temporal variations, offering a dynamic representation of evolving brain networks. Notably, it produced smooth maps from noisy priors, effectively denoising the resulting brain dynamics. Additionally, statistically significant differences between patients with schizophrenia and healthy controls were observed in the temporally averaged brain maps and in the summation of absolute temporal gradient maps. For example, within the default mode network (DMN), significant differences emerged in the temporally averaged maps of the space–time configuration, particularly in the thalamus, where healthy controls exhibited higher activity levels than subjects with schizophrenia. These findings highlight the model's potential for differentiating between clinical populations.
The proposed spatiotemporal dense prediction model offers an effective approach for generating dynamic brain maps by capturing significant spatiotemporal variations in brain activity. Leveraging weak supervision through ICA components enables the model to learn dynamic patterns without direct ground-truth data, making it a robust and efficient tool for brain mapping.
Significance: This work presents a new approach for dynamic brain mapping, potentially opening up new opportunities for studying brain dynamics within specific networks. By framing the problem as a spatiotemporal dense prediction task in computer vision, we leverage the spatiotemporal ViT architecture combined with weakly supervised learning techniques to estimate these maps efficiently and effectively.
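The two summary maps named in the abstract, the temporal average and the summation of absolute temporal gradients, can be illustrated with a minimal sketch. The abstract does not specify how the temporal gradient is computed, so the forward difference between consecutive frames used here is an assumption, as are the function names, the (T, X, Y, Z) array layout, and the random toy data standing in for a model output.

```python
import numpy as np


def temporal_mean(maps: np.ndarray) -> np.ndarray:
    """Average a dynamic map of shape (T, X, Y, Z) over time, giving (X, Y, Z)."""
    return maps.mean(axis=0)


def abs_temporal_gradient_sum(maps: np.ndarray) -> np.ndarray:
    """Per-voxel sum over time of |frame[t+1] - frame[t]|.

    Assumption: the temporal gradient is a simple forward difference;
    the paper may use a different estimator.
    """
    return np.abs(np.diff(maps, axis=0)).sum(axis=0)


# Toy stand-in for one subject's 4D network map: 5 time points, 4x4x3 voxels.
rng = np.random.default_rng(0)
dynamic_map = rng.standard_normal((5, 4, 4, 3))

mean_map = temporal_mean(dynamic_map)
grad_map = abs_temporal_gradient_sum(dynamic_map)
print(mean_map.shape, grad_map.shape)
```

Each function reduces the time axis to a single 3D volume per subject, which is the form a voxelwise group comparison would take as input. The temporal average summarizes sustained activity, while the gradient sum grows with how much a voxel fluctuates from frame to frame.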
About the journal:
Human Brain Mapping publishes peer-reviewed basic, clinical, technical, and theoretical research in the interdisciplinary and rapidly expanding field of human brain mapping. The journal features research derived from non-invasive brain imaging modalities used to explore the spatial and temporal organization of the neural systems supporting human behavior. Imaging modalities of interest include positron emission tomography, event-related potentials, electro- and magnetoencephalography, magnetic resonance imaging, and single-photon emission tomography. Brain mapping research in both normal and clinical populations is encouraged.
Article formats include Research Articles, Review Articles, Clinical Case Studies, and Technique, as well as Technological Developments, Theoretical Articles, and Synthetic Reviews. Technical advances, such as novel brain imaging methods, analyses for detecting or localizing neural activity, synergistic uses of multiple imaging modalities, and strategies for the design of behavioral paradigms and neural-systems modeling, are of particular interest. The journal endorses the propagation of methodological standards and encourages database development in the field of human brain mapping.