scMID: a Deep Multi-omics Integration Framework for Comprehensive Single-cell Data Analysis
Qiu Xiao, Yan Zhang, Wanwan Shi, Li Wang, Ying Zuo, Fei Guo, Jiawei Luo
IEEE Transactions on Computational Biology and Bioinformatics, vol. PP, published 2025-10-22
DOI: 10.1109/TCBBIO.2025.3624040 (https://doi.org/10.1109/TCBBIO.2025.3624040)
Abstract
Single-cell research has progressed rapidly in recent years, and downstream analyses play a crucial role in uncovering cellular functions and mechanisms. Traditional single-cell analyses, which rely predominantly on single-omics data such as single-cell RNA sequencing, are inherently limited: they capture only one aspect of cellular information, overlook the interplay between molecular layers, and are therefore prone to biased results. Single-cell multi-omics sequencing technologies have changed this landscape. By enabling the integration of diverse molecular profiles, including transcriptomics, epigenomics, and proteomics, they offer a more holistic view of cellular function. However, existing integration methods often cannot handle the complexity and heterogeneity of multi-omics data, which limits their use in in-depth single-cell studies. In this study, we propose scMID, an analysis method based on single-cell multi-omics data integration and dropout patterns. Specifically, scMID uses omics-independent deep autoencoders to align multi-omics data, employs a graph convolutional network (GCN) to integrate the data, and calculates gene importance by incorporating gene similarity derived from the binarized dropout pattern. scMID also introduces a dual strategy for feature gene screening that aims to identify genes with high biological significance that best match the structural characteristics of the reference data. Experimental results demonstrate that scMID significantly improves the accuracy of single-cell clustering in downstream analyses, overcoming limitations of traditional feature selection methods and providing a superior analytical framework for decoding complex biological information.
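Two of the building blocks named in the abstract, the binarized dropout pattern and graph convolution, can be illustrated with a minimal NumPy sketch. This is not the scMID implementation. The abstract does not say how gene similarity is measured or how the GCN is configured, so the Jaccard index and the standard symmetric-normalized propagation rule below are assumptions chosen for illustration, and the function names (`binarize_dropout`, `gene_dropout_similarity`, `gcn_layer`) are hypothetical.

```python
import numpy as np


def binarize_dropout(counts):
    """Return a 0/1 matrix: 1 where a gene is detected (count > 0), 0 where it dropped out.

    `counts` is a cells x genes expression matrix.
    """
    return (np.asarray(counts) > 0).astype(np.int8)


def gene_dropout_similarity(binary):
    """Jaccard similarity between the detection patterns of every pair of genes.

    Assumption: Jaccard is used here only as an example measure;
    the abstract does not name the one scMID uses.
    """
    b = np.asarray(binary, dtype=np.float64)
    both = b.T @ b                                  # cells in which both genes are detected
    detected = b.sum(axis=0)                        # cells in which each gene is detected
    either = detected[:, None] + detected[None, :] - both
    # Gene pairs never detected in any cell get similarity 0 instead of 0/0.
    return np.divide(both, either, out=np.zeros_like(both), where=either > 0)


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumption: this is the widely used normalized propagation rule,
    not necessarily the exact layer used in scMID.
    """
    a_hat = np.asarray(adj, dtype=np.float64) + np.eye(len(adj))  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm @ features @ weights, 0.0)


# Toy data: 4 cells x 3 genes, and a small cell-cell graph (both made up).
counts = np.array([[5, 0, 2],
                   [0, 0, 1],
                   [3, 1, 0],
                   [0, 2, 0]])
binary = binarize_dropout(counts)
similarity = gene_dropout_similarity(binary)

adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]])
rng = np.random.default_rng(0)
embedding = gcn_layer(adj, counts.astype(float), rng.normal(size=(3, 2)))
```

In this toy example, `similarity` is a genes x genes matrix whose entries are high for genes that tend to be detected in the same cells, and `embedding` holds one smoothed two-dimensional vector per cell.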