scCAM: An explainable method to annotate scRNA-seq data and identify annotation-related genes based on residual network and layer class activation maps

Impact factor: 0.9 (JCR Q4, Genetics & Heredity)
Ya Zhang, Yongzhao Du, Yuqing Fu
Gene Reports, volume 41, Article 102297. Published 2025-07-16. DOI: 10.1016/j.genrep.2025.102297
Full text: https://www.sciencedirect.com/science/article/pii/S2452014425001700
Citations: 0

Abstract
Single-cell RNA sequencing (scRNA-seq) is widely used to explore gene expression and cellular heterogeneity, and cell type annotation is a crucial step in scRNA-seq data analysis. Several deep learning methods have recently been developed for cell annotation, but most lack biological explainability and fail to discover the key genes behind an annotation. We therefore propose scCAM, an explainable automatic cell annotation method. scCAM combines residual networks with layer class activation maps: it constructs grayscale images to represent gene expression, then uses backward class-specific gradients and spatial location importance to explore the annotation decision-making process and discover annotation-related genes. In experiments on benchmark datasets in multiple situations, scCAM outperforms other state-of-the-art methods; on the large-scale dataset in particular, it exceeds the other methods in accuracy by 6.4%, 18%, 39.2%, 16.5%, and 9.8%, respectively. Explainability analysis on the Segerstolpe pancreas dataset identifies annotation-related genes, including marker genes and differentially expressed genes, providing reference and support for the discovery of new marker genes. The source code of scCAM is available at https://github.com/zhangya10956/scCAM-cell-anno.
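The abstract outlines three steps: turn an expression profile into a grayscale image, weight feature maps by class-specific gradients at each spatial location, and read gene importance back from the resulting map. The NumPy sketch below illustrates that idea only; it is not the authors' implementation (see their repository for that). The function names, the row-major gene layout, the min-max scaling, the nearest-neighbour upsampling, and the toy arrays are all assumptions. In practice the activations and gradients would come from a convolutional layer of a trained residual network.

```python
import numpy as np


def expression_to_image(expr):
    """Min-max scale one cell's expression vector and zero-pad it into a
    square grayscale image (row-major gene layout is an assumption)."""
    expr = np.asarray(expr, dtype=np.float64)
    side = int(np.ceil(np.sqrt(expr.size)))
    span = expr.max() - expr.min()
    scaled = (expr - expr.min()) / span if span > 0 else np.zeros_like(expr)
    padded = np.zeros(side * side)
    padded[: expr.size] = scaled
    return padded.reshape(side, side)


def layer_cam(activations, gradients):
    """LayerCAM-style map for one class.

    activations, gradients: arrays of shape (channels, H, W), where
    gradients holds d(class score)/d(activations). Each activation is
    weighted by its own positive gradient, channels are summed, and a
    final ReLU keeps only positive evidence."""
    weights = np.maximum(gradients, 0.0)
    return np.maximum((weights * activations).sum(axis=0), 0.0)


def rank_genes(cam, image_side, gene_names, top_k=4):
    """Upsample a square map to the input image size (nearest neighbour)
    and return the top_k genes by importance at their pixel positions."""
    if image_side % cam.shape[0] != 0:
        raise ValueError("image_side must be a multiple of the map size")
    rep = image_side // cam.shape[0]
    upsampled = np.kron(cam, np.ones((rep, rep)))
    scores = upsampled.reshape(-1)[: len(gene_names)]
    order = np.argsort(scores)[::-1][:top_k]
    return [gene_names[i] for i in order]


# Toy example: 16 hypothetical genes -> 4x4 image, 2 channels of 2x2 maps.
gene_names = [f"g{i}" for i in range(16)]
image = expression_to_image(np.arange(16))
activations = np.array([[[1.0, 0.0], [0.0, 2.0]],
                        [[0.0, 1.0], [3.0, 0.0]]])
gradients = np.array([[[1.0, -1.0], [0.5, 1.0]],
                      [[-2.0, 1.0], [1.0, 0.0]]])
cam = layer_cam(activations, gradients)
top_genes = rank_genes(cam, image.shape[0], gene_names, top_k=4)
```

Weighting each location by its own gradient, rather than by one averaged weight per channel as in Grad-CAM, is what lets the map keep per-pixel (and so per-gene) resolution in this sketch.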
Source journal
Gene Reports (Biochemistry, Genetics and Molecular Biology: Genetics)
CiteScore: 3.30
Self-citation rate: 7.70%
Articles published: 246
Review time: 49 days
About the journal: Gene Reports publishes papers that focus on the regulation, expression, function and evolution of genes in all biological contexts, including all prokaryotic and eukaryotic organisms, as well as viruses. Gene Reports strives to be a very diverse journal, and topics in all fields will be considered for publication. Although not limited to the following, some general topics include:
DNA Organization, Replication & Evolution: focus on genomic DNA (chromosomal organization, comparative genomics, DNA replication, DNA repair, mobile DNA, mitochondrial DNA, chloroplast DNA).
Expression & Function: focus on functional RNAs (microRNAs, tRNAs, rRNAs, mRNA splicing, alternative polyadenylation).
Regulation: focus on processes that mediate gene read-out (epigenetics, chromatin, histone code, transcription, translation, protein degradation).
Cell Signaling: focus on mechanisms that control information flow into the nucleus to control gene expression (kinase and phosphatase pathways controlled by extracellular ligands, Wnt, Notch, TGFbeta/BMPs, FGFs, IGFs, etc.).
Profiling of gene expression and genetic variation: focus on high-throughput approaches (e.g., DeepSeq, ChIP-Seq, Affymetrix microarrays, proteomics) that define gene regulatory circuitry, molecular pathways and protein/protein networks.
Genetics: focus on development in model organisms (e.g., mouse, frog, fruit fly, worm), human genetic variation, population genetics, as well as agricultural and veterinary genetics.
Molecular Pathology & Regenerative Medicine: focus on the deregulation of molecular processes in human diseases and mechanisms supporting regeneration of tissues through pluripotent or multipotent stem cells.