scAGG: Sample-level embedding and classification of Alzheimer's disease from single-nucleus data.

IF 4.1 | Zone 2 (Biology) | JCR Q2 (Biochemistry & Molecular Biology)
Computational and Structural Biotechnology Journal | Pub Date: 2025-08-13 | eCollection Date: 2025-01-01 | DOI: 10.1016/j.csbj.2025.08.009
T Verlaan, G A Bouland, A Mahfouz, M J T Reinders
Volume 27, pages 3753-3761. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12448040/pdf/
Citations: 0

Abstract

Identifying key cell types and genes in Alzheimer's Disease (AD) is crucial for understanding its pathogenesis and discovering therapeutic targets. Single-cell RNA sequencing technology (scRNAseq) has provided unprecedented opportunities to study the molecular mechanisms that underlie AD at the cellular level. In this study, we address the problem of sample-level classification of AD using scRNAseq data, where we predict the disease status of entire samples from the gene expression profiles of their cells, which are not necessarily all affected by the disease. We introduce scAGG (single-cell AGGregation), a sample-level classification model that uses a sample-level pooling mechanism to aggregate single-cell embeddings, and show that it can accurately classify AD individuals and healthy controls. We then investigate the latent space learnt by the model and find that the model learns an ordering of the cells corresponding to disease severity. Genes associated with this ordering are enriched in AD-linked pathways, including cytokine signalling, apoptosis, and metal ion response. We also evaluate two attention-based models that perform on par with scAGG, but entropy analysis of their attention scores reveals limited interpretability value. As scRNAseq is increasingly applied to large cohorts and cell-level disease association annotations do not exist, our approach provides a way to classify phenotypes from single-cell measurements. The yielded cell- and sample-level severity scores may enable identification of AD-associated cell subtypes, paving the way for targeted drug development and personalized treatment strategies in AD. Code is available at: https://github.com/timoverlaan/scAGG.
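
The two ideas in the abstract, pooling cell embeddings into one sample-level embedding and checking attention scores with an entropy measure, can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes mean pooling as the aggregation step, a hypothetical logistic classifier head with made-up weights, and normalized Shannon entropy as one possible form of entropy analysis; the abstract does not specify any of these details.

```python
import math


def mean_pool(cell_embeddings):
    """Average per-cell embedding vectors into one sample-level vector.

    Assumption: mean pooling. The abstract only says "sample-level pooling".
    """
    n_cells = len(cell_embeddings)
    dim = len(cell_embeddings[0])
    return [sum(cell[d] for cell in cell_embeddings) / n_cells for d in range(dim)]


def predict_proba(sample_embedding, weights, bias):
    """Hypothetical logistic head: probability that the sample is AD."""
    logit = sum(w * x for w, x in zip(weights, sample_embedding)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def softmax(scores):
    """Turn raw per-cell attention scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(attention):
    """Shannon entropy of attention weights divided by log(n_cells).

    Values near 1 mean attention is spread almost uniformly over cells
    (it singles out few cells); values near 0 mean it concentrates on
    very few cells.
    """
    n = len(attention)
    if n < 2:
        return 0.0
    entropy = sum(p * math.log(1.0 / p) for p in attention if p > 0)
    return entropy / math.log(n)


# Toy sample with three cells and two embedding dimensions (made-up numbers).
cells = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
pooled = mean_pool(cells)
prob = predict_proba(pooled, weights=[0.5, -0.5], bias=0.5)

# Attention entropy for a uniform and a fully peaked attention pattern.
uniform = normalized_entropy(softmax([0.0, 0.0, 0.0]))
peaked = normalized_entropy([1.0, 0.0, 0.0])
```

In this toy setting, an attention model whose normalized entropy stays close to 1 across samples is weighting cells almost equally, so it behaves much like mean pooling. That is one way to read the abstract's remark that the attention scores have limited interpretability value.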

Source journal
Computational and Structural Biotechnology Journal (Biochemistry, Genetics and Molecular Biology: Biophysics)
CiteScore: 9.30
Self-citation rate: 3.30%
Articles published: 540
Review time: 6 weeks
About the journal: Computational and Structural Biotechnology Journal (CSBJ) is an online gold open access journal publishing research articles and reviews after full peer review. All articles are published, without barriers to access, immediately upon acceptance. The journal places a strong emphasis on functional and mechanistic understanding of how molecular components in a biological process work together through the application of computational methods. Structural data may provide such insights, but they are not a prerequisite for publication in the journal. Specific areas of interest include, but are not limited to: structure and function of proteins, nucleic acids and other macromolecules; structure and function of multi-component complexes; protein folding, processing and degradation; enzymology; computational and structural studies of plant systems; microbial informatics; genomics; proteomics; metabolomics; algorithms and hypothesis in bioinformatics; mathematical and theoretical biology; computational chemistry and drug discovery; microscopy and molecular imaging; nanotechnology; systems and synthetic biology.