stMCDI: Masked Conditional Diffusion Model with Graph Neural Network for Spatial Transcriptomics Data Imputation

Xiaoyu Li, Wenwen Min, Shunfang Wang, Changmiao Wang, Taosheng Xu
arXiv - QuanBio - Genomics, published 2024-03-16. DOI: arxiv-2403.10863 (https://doi.org/arxiv-2403.10863)

Abstract

Spatially resolved transcriptomics is a significant advance in single-cell analysis because it provides gene expression data together with the corresponding physical locations. This high spatial resolution has a drawback, however: the resulting cellular-level spatial transcriptomic data contain a high proportion of missing values. Furthermore, most existing imputation methods either overlook the spatial information between spots or distort the overall distribution of the gene expression data. To address these challenges, we focus on using the spatial location information in spatial transcriptomic data to impute missing values while preserving the overall data distribution. We introduce stMCDI, a novel conditional diffusion model for spatial transcriptomics data imputation. It employs a denoising network trained with randomly masked portions of the data as guidance and the unmasked data as conditions. It also uses a GNN encoder to integrate the spatial position information, thereby enhancing model performance. Results on spatial transcriptomics datasets show how the method performs relative to existing approaches.
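The masking scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract specifies neither the noise schedule nor how the spatial graph is built, so the DDPM-style forward step, the k-nearest-neighbour graph, the convention that zeros are missing, and every function and variable name below are assumptions made for illustration only.

```python
import numpy as np

def random_target_mask(observed, mask_ratio, rng):
    """Hide a random subset of the observed entries as imputation targets.

    The remaining observed entries become the conditions.
    """
    draw = rng.random(observed.shape)
    target = observed & (draw < mask_ratio)
    cond = observed & ~target
    return cond, target

def forward_noise(x0, target, alpha_bar_t, rng):
    """Assumed DDPM-style forward step, kept only on the target entries:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    """
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return np.where(target, x_t, 0.0), eps

def knn_adjacency(coords, k):
    """Symmetric k-nearest-neighbour adjacency built from spot coordinates
    (one assumed way to give a GNN encoder the spatial positions)."""
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)            # a spot is not its own neighbour
    nbrs = np.argsort(dist, axis=1)[:, :k]    # indices of the k closest spots
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    return adj | adj.T                        # make the graph undirected

rng = np.random.default_rng(0)
expr = rng.poisson(1.0, size=(6, 4)).astype(float)  # toy spots x genes matrix
observed = expr > 0                                  # assumption: zeros are missing

cond, target = random_target_mask(observed, mask_ratio=0.3, rng=rng)
noisy_target, eps = forward_noise(expr, target, alpha_bar_t=0.5, rng=rng)

# A denoising network would receive the clean conditions plus the noised
# targets and be trained to predict eps on the target entries.
model_input = np.stack([np.where(cond, expr, 0.0), noisy_target])

adj = knn_adjacency(rng.random((6, 2)), k=2)         # toy 2-D spot coordinates
```

In this sketch the loss would be computed only where `target` is true, so the model learns to reconstruct hidden values from the visible ones; at imputation time the genuinely missing entries would take the role of the targets.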