NT-ViT: Neural Transcoding Vision Transformers for EEG-to-fMRI Synthesis

arXiv - EE - Image and Video Processing Pub Date : 2024-09-18 DOI:arxiv-2409.11836

Romeo Lanzino, Federico Fontana, Luigi Cinque, Francesco Scarcello, Atsuto Maki


Citations: 0

Abstract




This paper introduces the Neural Transcoding Vision Transformer (NT-ViT), a generative model designed to estimate high-resolution functional Magnetic Resonance Imaging (fMRI) samples from simultaneous Electroencephalography (EEG) data. A key feature of NT-ViT is its Domain Matching (DM) sub-module, which effectively aligns the latent EEG representations with those of fMRI volumes, enhancing the model's accuracy and reliability. Unlike previous methods that tend to struggle with fidelity and reproducibility of images, NT-ViT addresses these challenges by ensuring methodological integrity and higher-quality reconstructions, which we showcase through extensive evaluation on two benchmark datasets; NT-ViT outperforms the current state-of-the-art by a significant margin in both cases, e.g. achieving a 10× reduction in RMSE and a 3.14× increase in SSIM on the Oddball dataset. An ablation study also provides insights into the contribution of each component to the model's overall effectiveness. This development is critical in offering a new approach to lessen the time and financial constraints typically linked with high-resolution brain imaging, thereby aiding in the swift and precise diagnosis of neurological disorders. Although it is not a replacement for actual fMRI but rather a step towards making such imaging more accessible, we believe that it represents a pivotal advancement in clinical practice and neuroscience research. Code is available at https://github.com/rom42pla/ntvit.
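The abstract reports its results as ratios of two metrics, RMSE (lower is better) and SSIM (higher is better). The sketch below illustrates how such figures could be computed for a synthesized volume against a reference volume. It is not the authors' evaluation code: the function names `rmse` and `global_ssim`, the random test volumes, and the single-window (whole-volume) form of SSIM are all assumptions made for illustration, and the paper's evaluation may well use a sliding-window SSIM instead.

```python
import numpy as np


def rmse(pred, target):
    """Root-mean-square error over all voxels (hypothetical helper)."""
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def global_ssim(pred, target, data_range=1.0):
    """SSIM from whole-volume statistics, i.e. one window covering everything.

    Assumption: this simplified form stands in for the usual windowed SSIM.
    The constants follow the common convention K1 = 0.01, K2 = 0.03.
    """
    x = np.asarray(pred, dtype=np.float64)
    y = np.asarray(target, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)


# Synthetic stand-ins for a reference fMRI volume and two reconstructions.
rng = np.random.default_rng(0)
reference = rng.random((8, 8, 8))
coarse = np.clip(reference + rng.normal(0.0, 0.20, reference.shape), 0.0, 1.0)
fine = np.clip(reference + rng.normal(0.0, 0.02, reference.shape), 0.0, 1.0)

# A "k-fold reduction in RMSE" is the ratio of baseline error to model error.
rmse_reduction = rmse(coarse, reference) / rmse(fine, reference)
ssim_increase = global_ssim(fine, reference) / global_ssim(coarse, reference)
print(f"RMSE reduction: {rmse_reduction:.2f}x, SSIM increase: {ssim_increase:.2f}x")
```

Read this way, a 10× reduction in RMSE means the baseline's error is ten times that of the model, and a 3.14× increase in SSIM means the model's similarity score is 3.14 times the baseline's.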

