Enhancing Spatial Domain Identification in Spatially Resolved Transcriptomics Using Graph Convolutional Networks With Adaptively Feature-Spatial Balance and Contrastive Learning

IF 3.4 | Zone 3 (Biology) | Q2 BIOCHEMICAL RESEARCH METHODS

IEEE/ACM Transactions on Computational Biology and Bioinformatics Pub Date : 2024-09-27 DOI:10.1109/TCBB.2024.3469164

Xuena Liang;Junliang Shang;Jin-Xing Liu;Chun-Hou Zheng;Juan Wang

Volume 21, Issue 6, pp. 2406-2417 | Journal Article | https://ieeexplore.ieee.org/document/10696983/

Citations: 0

Abstract

Recent advances in spatial transcriptomics (ST) technologies have enabled the comprehensive measurement of gene expression profiles while preserving the spatial information of cells. Combining gene expression profiles with spatial information has been the most commonly used approach to identifying spatial functional domains and genes. However, most existing methods for deciphering spatial domains focus on spatially neighboring structures and fail to balance the self-characteristics of spots against their spatial structure dependency. We therefore propose a novel model called SpaGCAC, which recognizes spatial domains with the help of an adaptive feature-spatial balanced graph convolutional network named AFSBGCN. AFSBGCN dynamically learns the relationship between local spatial topology structures and the self-characteristics of spots by adaptively increasing or decreasing the weight on the self-characteristics during message aggregation. Moreover, to better capture the local structures of spots, SpaGCAC employs a local topology structure contrastive learning strategy. SpaGCAC also uses a probability distribution contrastive learning strategy to increase the similarity of probability distributions for points belonging to the same category. We validate the performance of SpaGCAC for spatial domain identification on four spatial transcriptomic datasets. Compared with seven spatial domain recognition methods, SpaGCAC achieved the highest median NMI (0.683) and the second-highest median ARI (0.559) on the multi-slice DLPFC dataset, and the best results on all three single-slice datasets. These results show that SpaGCAC outperforms most existing methods, providing enhanced insights into tissue heterogeneity.
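
The abstract does not give the AFSBGCN update rule, so the following is only a minimal sketch of the general idea of weighting a spot's own features against its neighborhood during message aggregation. The function name `balanced_aggregate`, the scalar `theta`, and the convex-combination form are all assumptions made for illustration, not the authors' formulation; in a trained model `theta` would be a learned parameter rather than a fixed argument.

```python
import math


def sigmoid(z):
    """Squash a raw scalar into (0, 1) so it can act as a mixing weight."""
    return 1.0 / (1.0 + math.exp(-z))


def balanced_aggregate(features, neighbors, theta):
    """One aggregation step mixing each spot's features with its neighbors' mean.

    features:  list of per-spot feature vectors (lists of floats)
    neighbors: neighbors[i] lists the indices of spots adjacent to spot i
    theta:     hypothetical raw scalar; sigmoid(theta) is the self-weight
    """
    alpha = sigmoid(theta)  # weight on the spot's own features
    out = []
    for i, x in enumerate(features):
        nbrs = neighbors[i]
        if nbrs:
            # Mean of the neighbors' features, computed per dimension.
            mean = [sum(features[j][k] for j in nbrs) / len(nbrs)
                    for k in range(len(x))]
        else:
            mean = list(x)  # isolated spot: fall back to its own features
        out.append([alpha * xi + (1.0 - alpha) * mi
                    for xi, mi in zip(x, mean)])
    return out


# Toy example: three spots with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[1, 2], [0], [0]]
smoothed = balanced_aggregate(features, neighbors, theta=0.0)  # equal mixing
```

Raising `theta` moves the output toward each spot's own expression profile, and lowering it moves the output toward the neighborhood average, which is the trade-off the abstract describes as balancing self-characteristics against spatial structure.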


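Source journal

IEEE/ACM Transactions on Computational Biology and Bioinformatics (Engineering and Technology: Computer Science, Interdisciplinary Applications)

CiteScore

7.50

Self-citation rate

6.70%

Articles published

479

Review time

3 months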

About the journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics emphasizes the algorithmic, mathematical, statistical and computational methods that are central in bioinformatics and computational biology; the development and testing of effective computer programs in bioinformatics; the development of biological databases; important biological results that are obtained from the use of these methods, programs and databases; and the emerging field of Systems Biology, where many forms of data are used to create a computer-based model of a complex biological system.