{"title":"Integrating single-cell multimodal epigenomic data using 1D convolutional neural networks.","authors":"Chao Gao, Joshua D Welch","doi":"10.1093/bioinformatics/btae705","DOIUrl":null,"url":null,"abstract":"<p><strong>Motivation: </strong>Recent experimental developments enable single-cell multimodal epigenomic profiling, which measures multiple histone modifications and chromatin accessibility within the same cell. Such parallel measurements provide exciting new opportunities to investigate how epigenomic modalities vary together across cell types and states. A pivotal step in using these types of data is integrating the epigenomic modalities to learn a unified representation of each cell, but existing approaches are not designed to model the unique nature of this data type. Our key insight is to model single-cell multimodal epigenome data as a multichannel sequential signal.</p><p><strong>Results: </strong>We developed ConvNet-VAEs, a novel framework that uses one-dimensional (1D) convolutional variational autoencoders (VAEs) for single-cell multimodal epigenomic data integration. We evaluated ConvNet-VAEs on nano-CUT&Tag and single-cell nanobody-tethered transposition followed by sequencing data generated from juvenile mouse brain and human bone marrow. We found that ConvNet-VAEs can perform dimension reduction and batch correction better than previous architectures while using significantly fewer parameters. Furthermore, the performance gap between convolutional and fully connected architectures increases with the number of modalities, and deeper convolutional architectures can increase the performance, while the performance degrades for deeper fully connected architectures. Our results indicate that convolutional autoencoders are a promising method for integrating current and future single-cell multimodal epigenomic datasets.</p><p><strong>Availability and implementation: </strong>The source code of VAE models and a demo in Jupyter notebook are available at https://github.com/welch-lab/ConvNetVAE.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":"41 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751632/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics (Oxford, England)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioinformatics/btae705","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Motivation: Recent experimental developments enable single-cell multimodal epigenomic profiling, which measures multiple histone modifications and chromatin accessibility within the same cell. Such parallel measurements provide exciting new opportunities to investigate how epigenomic modalities vary together across cell types and states. A pivotal step in using these types of data is integrating the epigenomic modalities to learn a unified representation of each cell, but existing approaches are not designed to model the unique nature of this data type. Our key insight is to model single-cell multimodal epigenome data as a multichannel sequential signal.
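To make the "multichannel sequential signal" framing concrete, here is a minimal sketch (not taken from the paper) of how per-modality count matrices measured in the same cells can be stacked into the channel layout that 1D convolutions consume. The modality names, bin count, and simulated Poisson counts are illustrative assumptions only.

```python
import numpy as np
import torch

# Hypothetical per-modality count matrices: rows are cells, columns are
# fixed-width genomic bins. The specific marks and simulated counts below
# are placeholders, not the paper's data.
n_cells, n_bins = 1000, 2048
rng = np.random.default_rng(0)
h3k27ac = rng.poisson(0.5, size=(n_cells, n_bins))   # histone mark
h3k27me3 = rng.poisson(0.5, size=(n_cells, n_bins))  # histone mark
atac = rng.poisson(1.0, size=(n_cells, n_bins))      # chromatin accessibility

# Stack modalities as channels: (cells, channels, bins), the layout that
# torch.nn.Conv1d expects. Each cell is then a multichannel 1D signal
# ordered by genomic position, so convolutional filters are shared across
# loci and see all modalities at each position jointly.
x = torch.tensor(np.stack([h3k27ac, h3k27me3, atac], axis=1),
                 dtype=torch.float32)
print(x.shape)  # torch.Size([1000, 3, 2048])
```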
Results: We developed ConvNet-VAEs, a novel framework that uses one-dimensional (1D) convolutional variational autoencoders (VAEs) for single-cell multimodal epigenomic data integration. We evaluated ConvNet-VAEs on nano-CUT&Tag and single-cell nanobody-tethered transposition followed by sequencing data generated from juvenile mouse brain and human bone marrow. We found that ConvNet-VAEs perform dimension reduction and batch correction better than previous architectures while using significantly fewer parameters. Furthermore, the performance gap between convolutional and fully connected architectures widens as the number of modalities grows, and deeper convolutional architectures can further improve performance, whereas performance degrades with depth for fully connected architectures. Our results indicate that convolutional autoencoders are a promising method for integrating current and future single-cell multimodal epigenomic datasets.
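For a concrete picture of this architecture class, below is a minimal 1D convolutional VAE sketch in PyTorch. It is a sketch under stated assumptions: the channel counts, kernel sizes, strides, latent dimension, and Gaussian reconstruction loss are illustrative choices, not the configuration or likelihood reported in the paper; the authors' actual models are in the ConvNetVAE repository linked below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvVAE(nn.Module):
    """Minimal 1D convolutional VAE over (cells, modalities, bins) tensors.

    Layer widths, kernel sizes, and the latent dimension are illustrative,
    not the configuration used in the paper. Assumes n_bins is divisible
    by 16 (two stride-4 convolutions).
    """

    def __init__(self, n_modalities=3, n_bins=2048, latent_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(n_modalities, 32, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.Flatten(),
        )
        enc_out = 64 * (n_bins // 16)  # flattened size after the two convs
        self.fc_mu = nn.Linear(enc_out, latent_dim)
        self.fc_logvar = nn.Linear(enc_out, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, enc_out)
        self.decoder = nn.Sequential(
            nn.Unflatten(1, (64, n_bins // 16)),
            nn.ConvTranspose1d(64, 32, kernel_size=8, stride=4, padding=2),
            nn.ReLU(),
            nn.ConvTranspose1d(32, n_modalities, kernel_size=8, stride=4,
                               padding=2),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(self.fc_dec(z)), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Gaussian reconstruction term (an assumption here; count data is often
    # modeled with Poisson or negative binomial likelihoods instead) plus
    # the KL divergence to a standard normal prior.
    recon_term = F.mse_loss(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_term + kl
```

In a model of this shape, the encoder mean mu serves as the integrated low-dimensional embedding of each cell; batch correction would additionally require conditioning the encoder and decoder on batch labels, which this sketch omits.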
Availability and implementation: The source code for the VAE models and a Jupyter notebook demo are available at https://github.com/welch-lab/ConvNetVAE.