CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data
Ping-Han Hsieh, Ru-Xiu Hsiao, Katalin Ferenc, Anthony Mathelier, Rebekka Burkholz, Chien-Yu Chen, Geir Kjetil Sandve, Tatiana Belova, Marieke Lydia Kuijjer
arXiv (Quantitative Biology, Genomics), 2024-05-28, arXiv:2405.18655
Abstract
Paired single-cell sequencing technologies enable the simultaneous
measurement of complementary modalities of molecular data at single-cell
resolution. Along with the advances in these technologies, many methods based
on variational autoencoders have been developed to integrate these data.
However, these methods do not explicitly incorporate prior biological
relationships between the data modalities, which could significantly enhance
modeling and interpretation. We propose a novel probabilistic learning
framework that explicitly incorporates conditional independence relationships
between multi-modal data as a directed acyclic graph using a generalized
hierarchical variational autoencoder. We demonstrate the versatility of our
framework across various applications pertinent to single-cell multi-omics data
integration. These include the isolation of common and distinct information
from different modalities, modality-specific differential analysis, and
integrated cell clustering. We anticipate that the proposed framework can
facilitate the construction of highly flexible graphical models that can
capture the complexities of biological hypotheses and unravel the connections
between different biological data types, such as different modalities of paired
single-cell multi-omics data. The implementation of the proposed framework can
be found in the repository https://github.com/kuijjerlab/CAVACHON.
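
To make the idea of encoding conditional independence between modalities as a directed acyclic graph more concrete, here is a minimal sketch. It is not the CAVACHON implementation or its API: the modality names (`atac`, `rna`), the linear-Gaussian prior, and the helper names `ancestral_order` and `sample_latents` are all assumptions chosen for illustration. The sketch only shows how a modality DAG fixes the order in which latent variables are drawn, so that each modality's latent depends on the latents of its parents.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical modality DAG: each key maps to the set of its parent modalities.
# The names "atac" and "rna" are illustrative and not taken from the paper.
PARENTS = {"atac": set(), "rna": {"atac"}}


def ancestral_order(parents):
    """Return modalities ordered so that every parent precedes its children.

    Raises graphlib.CycleError (a ValueError) if the graph is not acyclic.
    """
    return list(TopologicalSorter(parents).static_order())


def sample_latents(parents, dim=2, weight=0.5, seed=0):
    """Draw one latent vector per modality by ancestral sampling.

    Toy prior (an assumption, not the paper's model):
    z_m ~ Normal(weight * sum of parent latents, 1), elementwise.
    """
    rng = random.Random(seed)
    latents = {}
    for modality in ancestral_order(parents):
        # Root modalities have no parents, so their prior mean is zero.
        mean = [
            weight * sum(latents[p][i] for p in parents[modality])
            for i in range(dim)
        ]
        latents[modality] = [rng.gauss(mu, 1.0) for mu in mean]
    return latents


order = ancestral_order(PARENTS)
z = sample_latents(PARENTS)
print(order)
print(z)
```

In this toy setup, the latent for `rna` is centred on a scaled copy of the latent for `atac`, which is one simple way to express "this modality depends on that one". A real hierarchical variational autoencoder would replace the fixed linear-Gaussian prior with learned encoder and decoder networks.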