scAMZI: attention-based deep autoencoder with zero-inflated layer for clustering scRNA-seq data.

IF 3.5 | Zone 2 (Biology) | Q2 Biotechnology & Applied Microbiology
Lin Yuan, Zhijie Xu, Boyuan Meng, Lan Ye
BMC Genomics, vol. 26(1), 350. Published 2025-04-07. DOI: https://doi.org/10.1186/s12864-025-11511-2

Abstract

Background: Clustering is a key step in scRNA-seq data analysis and underpins downstream analyses. Many computational methods have been proposed and have achieved remarkable results, but they have several limitations. First, they do not fully exploit cellular features. Second, they are built on gene expression information and lack flexibility in integrating intercellular relationships. Finally, their performance is affected by dropout events.

Results: We propose scAMZI, a novel deep learning (DL) model for clustering scRNA-seq data, based on an attention autoencoder and a zero-inflated (ZI) layer. scAMZI is mainly composed of SimAM (a Simple, parameter-free Attention Module), an autoencoder, a ZINB (Zero-Inflated Negative Binomial) model, and a ZI layer. Building on the ZINB model, we introduce the autoencoder and SimAM to reduce the dimensionality of the data and to learn feature representations of cells and the relationships between cells. The ZI layer handles zero values in the data. We compare scAMZI with nine methods (three shallow learning algorithms and six state-of-the-art DL-based methods) on fourteen benchmark scRNA-seq datasets with known cell types, ranging in size from hundreds to tens of thousands of cells. Experimental results demonstrate that scAMZI outperforms the competing methods.
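
The abstract names SimAM and the ZINB model without giving formulas. The following is a minimal sketch of both building blocks in their standard textbook form, not the scAMZI implementation. It assumes SimAM weighting is applied to a single cell's 1-D feature vector (the original module operates on image feature maps) and the usual mean/dispersion parameterization of the negative binomial. All function and parameter names (simam_weights, zinb_log_pmf, lam, mu, theta, pi) are hypothetical.

```python
import math


def simam_weights(features, lam=1e-4):
    """SimAM-style attention weights for one feature vector (needs >= 2 values).

    Assumption: the vector plays the role of one channel. Each feature gets
    sigmoid(d / (4 * (var + lam)) + 0.5), where d is its squared deviation
    from the mean, so features that stand out receive larger weights.
    No learnable parameters are involved.
    """
    n = len(features) - 1
    mean = sum(features) / len(features)
    sq_dev = [(f - mean) ** 2 for f in features]
    variance = sum(sq_dev) / n
    weights = []
    for d in sq_dev:
        inv_energy = d / (4.0 * (variance + lam)) + 0.5
        weights.append(1.0 / (1.0 + math.exp(-inv_energy)))
    return weights


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under NB with mean mu and dispersion theta."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under ZINB; pi is the extra-zero probability."""
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a structural zero (dropout) or a genuine NB zero.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


def zinb_nll(counts, mu, theta, pi):
    """Mean negative log-likelihood, the usual ZINB reconstruction loss."""
    return -sum(zinb_log_pmf(x, mu, theta, pi) for x in counts) / len(counts)


# Illustrative values only: a sparse expression vector with one high count.
cell = [0.0, 0.0, 1.0, 0.0, 12.0]
attended = [f * w for f, w in zip(cell, simam_weights(cell))]
loss = zinb_nll([0, 0, 1, 0, 12], mu=2.0, theta=1.5, pi=0.3)
```

In a ZINB autoencoder, mu, theta, and pi would typically be produced per gene and per cell by the decoder rather than fixed as scalars; they are scalars here only to keep the sketch short.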

Conclusions: scAMZI outperforms competing methods and can facilitate downstream analyses such as cell annotation, marker gene discovery, and cell trajectory inference. The scAMZI package is freely available at https://doi.org/10.5281/zenodo.13131559.

Source journal: BMC Genomics (Biology: Biotechnology & Applied Microbiology)
CiteScore: 7.40
Self-citation rate: 4.50%
Articles published: 769
Review time: 6.4 months
About the journal: BMC Genomics is an open access, peer-reviewed journal that considers articles on all aspects of genome-scale analysis, functional genomics, and proteomics. BMC Genomics is part of the BMC series, which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.