scMID: a Deep Multi-omics Integration Framework for Comprehensive Single-cell Data Analysis.

Qiu Xiao, Yan Zhang, Wanwan Shi, Li Wang, Ying Zuo, Fei Guo, Jiawei Luo
IEEE Transactions on Computational Biology and Bioinformatics. Published 2025-10-22. DOI: 10.1109/TCBBIO.2025.3624040 (https://doi.org/10.1109/TCBBIO.2025.3624040)

Abstract

Biological research on single cells has progressed remarkably in recent years, with downstream analyses playing a crucial role in uncovering cellular functions and mechanisms. Traditional single-cell analyses, which rely predominantly on single-omics data such as single-cell RNA sequencing, are inherently limited: they capture only one aspect of cellular information, overlook the complex interplay between molecular layers, and are therefore prone to introducing biases into results. The advent of single-cell multi-omics sequencing technologies has changed this landscape. By enabling the integration of diverse molecular profiles, including transcriptomics, epigenomics, and proteomics, these technologies offer a more holistic view of cellular functions. However, existing integration methods often cannot handle the complexity and heterogeneity of multi-omics data, which limits their application in in-depth single-cell studies. In this study, we propose scMID, an analysis method based on single-cell multi-omics data integration and dropout patterns. Specifically, scMID uses omics-independent deep autoencoders to align multi-omics data, employs a graph convolutional network (GCN) for data integration, and calculates gene importance by incorporating the gene similarity obtained from the binarized dropout pattern. In addition, scMID proposes a dual strategy for feature gene screening that aims to identify genes with high biological significance that best match the structural characteristics of the reference data. Experimental results demonstrate that scMID significantly improves the accuracy of single-cell clustering in downstream analyses, overcoming the limitations of traditional feature selection methods and providing a superior analytical framework for decoding complex biological information.
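
The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names: a binarized dropout pattern with a gene similarity derived from it, and a single GCN propagation step. Everything concrete here is an assumption made for illustration and not the authors' method: the function names (binarize_dropout, jaccard_gene_similarity, gcn_layer), the choice of Jaccard similarity, the rule "count > 0 means detected", and the use of the standard symmetric-normalized GCN layer.

```python
import numpy as np


def binarize_dropout(counts):
    """Return a cells x genes 0/1 matrix: 1 where a gene is detected.

    Assumption: a count of zero is treated as a dropout event.
    """
    return (np.asarray(counts) > 0).astype(np.int64)


def jaccard_gene_similarity(binary):
    """Gene-by-gene Jaccard similarity of detection patterns.

    Jaccard is an illustrative choice; the abstract does not name the metric.
    """
    inter = binary.T @ binary              # cells in which both genes are detected
    detected = binary.sum(axis=0)          # cells in which each gene is detected
    union = detected[:, None] + detected[None, :] - inter
    sim = np.zeros(inter.shape, dtype=float)
    # Genes never detected in any cell keep similarity 0 (avoids division by zero).
    np.divide(inter, union, out=sim, where=union > 0)
    return sim


def gcn_layer(adj, features, weights):
    """One standard GCN step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])     # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weights, 0.0)


# Toy data (hypothetical): 4 cells x 3 genes of raw counts.
counts = np.array([[0, 3, 1],
                   [2, 0, 4],
                   [0, 5, 0],
                   [1, 1, 0]])
pattern = binarize_dropout(counts)
similarity = jaccard_gene_similarity(pattern)

# Toy graph (hypothetical): two connected nodes with identity features and weights.
adjacency = np.array([[0.0, 1.0],
                      [1.0, 0.0]])
hidden = gcn_layer(adjacency, np.eye(2), np.eye(2))
```

In this sketch, similarity is a symmetric genes x genes matrix that could feed a gene-importance score, and hidden is the node embedding after one propagation step; how scMID actually combines these quantities is not stated in the abstract.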
