scMEDAL for the interpretable analysis of single-cell transcriptomics data with batch effect visualization using a deep mixed effects autoencoder

Authors: Aixa X Andrade, Son Nguyen, Albert Montillo
Journal: ArXiv
Publication date: 2025-03-13
Publication type: Journal Article
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601787/pdf/
Citations: 0
Abstract
scRNA-seq data has the potential to provide new insights into cellular heterogeneity and data acquisition; however, a major challenge is unraveling the confounding introduced by technical and biological batch effects. Existing batch correction algorithms suppress and discard these effects rather than quantifying and modeling them. Here, we present scMEDAL, a framework for single-cell Mixed Effects Deep Autoencoder Learning, which separately models batch-invariant and batch-specific effects using two complementary autoencoder networks. One network is trained through adversarial learning to capture a batch-invariant representation, while a Bayesian autoencoder learns a batch-specific representation. Comprehensive evaluations spanning conditions (e.g., autism, leukemia, and cardiovascular conditions), cell types, and technical and biological effects demonstrate that scMEDAL suppresses batch effects while modeling batch-specific variation, enhancing accuracy and interpretability. Unlike prior approaches, the framework's fixed- and random-effects autoencoders enable retrospective analyses, including predicting a cell's expression as if it had been acquired in a different batch. These predictions are visualized via genomap projections at the cellular level, revealing the impact of biological (e.g., diagnosis) and technical (e.g., acquisition) effects. Combining scMEDAL's batch-agnostic and batch-specific latent spaces enables more accurate predictions of disease status, donor group, and cell type, making scMEDAL a valuable framework for gaining deeper insight into data acquisition and cellular heterogeneity.
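The split into a batch-invariant part and a batch-specific part, and the idea of predicting a cell's expression as if it came from another batch, can be illustrated with a deliberately simple linear stand-in. The sketch below is not scMEDAL's method: it replaces the adversarial and Bayesian autoencoders with per-batch mean offsets on synthetic data. All function names (`fit_batch_offsets`, `remove_batch`, `project_to_batch`) and the data themselves are hypothetical and exist only for illustration.

```python
import numpy as np

def fit_batch_offsets(X, batch):
    """Toy 'random effect': each batch's mean shift from the global mean.
    (Assumption: a linear stand-in, not the paper's Bayesian autoencoder.)"""
    global_mean = X.mean(axis=0)
    offsets = {b: X[batch == b].mean(axis=0) - global_mean
               for b in np.unique(batch)}
    return global_mean, offsets

def remove_batch(X, batch, offsets):
    """Toy 'fixed effect': subtract each cell's own batch offset."""
    return X - np.stack([offsets[b] for b in batch])

def project_to_batch(X, batch, offsets, target):
    """Counterfactual: expression as if every cell were acquired in `target`."""
    return remove_batch(X, batch, offsets) + offsets[target]

# Synthetic data: 300 cells, 5 genes, 3 batches with different mean shifts.
rng = np.random.default_rng(0)
n_cells, n_genes, n_batches = 300, 5, 3
batch = np.arange(n_cells) % n_batches            # equal-sized batches
true_shift = rng.normal(0.0, 2.0, size=(n_batches, n_genes))
X = rng.normal(0.0, 1.0, size=(n_cells, n_genes)) + true_shift[batch]

global_mean, offsets = fit_batch_offsets(X, batch)
X_invariant = remove_batch(X, batch, offsets)     # batch means now coincide
X_as_batch1 = project_to_batch(X, batch, offsets, target=1)

# Concatenating both views mirrors, loosely, the idea of combining the
# batch-agnostic and batch-specific representations for downstream prediction.
batch_specific = np.stack([offsets[b] for b in batch])
combined = np.concatenate([X_invariant, batch_specific], axis=1)
```

In this toy setting the batch-specific part is only a constant shift per batch, whereas the abstract describes learned nonlinear representations; the sketch is meant to convey the decomposition and the batch-swap idea, nothing more.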