Unified cross-modality integration and analysis of T cell receptors and T cell transcriptomes by low-resource-aware representation learning

Yicheng Gao, Kejing Dong, Yuli Gao, Xuan Jin, Jingya Yang, Gang Yan, Qi Liu

Cell Genomics, 100553. Published online 2024-04-29 (Epub); issue date 2024-05-08. DOI: 10.1016/j.xgen.2024.100553
Abstract
Single-cell RNA sequencing (scRNA-seq) and T cell receptor sequencing (TCR-seq) are pivotal for investigating T cell heterogeneity. Integrating these modalities is expected to uncover insights in immunology that might otherwise go unnoticed with a single modality, but it faces computational challenges due to the low-resource characteristics of the multimodal data. Herein, we present UniTCR, a novel low-resource-aware multimodal representation learning framework designed for unified cross-modality integration, enabling comprehensive T cell analysis. By designing a dual-modality contrastive learning module and a single-modality preservation module to effectively embed each modality into a common latent space, UniTCR demonstrates versatility in connecting TCR sequences with T cell transcriptomes across various tasks, including single-modality analysis, modality gap analysis, epitope-TCR binding prediction, and TCR profile cross-modality generation, in a low-resource-aware way. Extensive evaluations conducted on multiple scRNA-seq/TCR-seq paired datasets showed the superior performance of UniTCR, demonstrating its ability to explore the complexity of the immune system.
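The dual-modality contrastive learning module described above pulls paired TCR and transcriptome embeddings together in a shared latent space while pushing non-paired cells apart. As a rough illustration of this idea (not UniTCR's actual implementation; the function names, temperature value, and the use of a CLIP-style symmetric InfoNCE objective are assumptions for the sketch), the core loss can be written as:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    """Project embeddings onto the unit hypersphere."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def symmetric_contrastive_loss(tcr_emb, rna_emb, temperature=0.07):
    """CLIP-style symmetric InfoNCE loss over a batch of paired
    embeddings: row i of tcr_emb and row i of rna_emb come from the
    same cell, so matching pairs lie on the diagonal of the
    similarity matrix.  (Hypothetical sketch, not UniTCR's code.)"""
    tcr = l2_normalize(np.asarray(tcr_emb, dtype=float))
    rna = l2_normalize(np.asarray(rna_emb, dtype=float))
    logits = tcr @ rna.T / temperature        # (B, B) cosine similarities
    batch = np.arange(logits.shape[0])

    def cross_entropy(lg):
        lg = lg - lg.max(axis=1, keepdims=True)   # numerical stability
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[batch, batch].mean()          # diagonal = true pairs

    # Average the TCR->RNA and RNA->TCR directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this loss aligns the two modalities in the common latent space; the single-modality preservation module would add a separate reconstruction-style term so each modality's own structure is not lost during alignment.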