Compressed Representation of Extreme Learning Machine with Self-Diffusion Graph Denoising Applied for Dissecting Molecular Heterogeneity.

IF 1.4, Zone 4 (Biology), JCR Q4 (Biochemical Research Methods)
Xin Duan, Xinnan Ding, Yuelin Lu
Journal of Computational Biology. Journal Article. Published 2025-03-19. DOI: 10.1089/cmb.2024.0729 (https://doi.org/10.1089/cmb.2024.0729)
Citations: 0

Abstract

Molecular heterogeneity exists in many biological systems, such as major malignancies or diverse cell populations. Clustering of gene expression profiles has been widely used to dissect molecular heterogeneity. A drawback common to most clustering methods is that they suffer from the high dimensionality, noise, and feature redundancy of the data. To address these challenges, we propose Extreme learning machine self-diffusion (ELMSD), an autoencoder extreme learning machine feature representation method that incorporates a self-diffusion graph denoising framework to effectively dissect molecular heterogeneity. ELMSD first learns a compressed representation of gene expression profiles from the hidden layer of the autoencoder extreme learning machine, then applies an iterative graph diffusion process to enhance the sample-to-sample similarity. The enhanced graph can greatly facilitate the downstream clustering analysis, making it more efficient to analyze molecular properties. To demonstrate the utility of ELMSD, we applied it to one simulated dataset, five single-cell datasets, and 20 cancer datasets. Experimental results show that ELMSD outperforms several state-of-the-art clustering methods, and that the cancer subtypes and cell types identified by ELMSD show strong clinical relevance and biological interpretability. The ELMSD code is available at: https://github.com/DXCODEE/ELMSD.
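
The two stages described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation (see their repository for that); it only assumes a standard ridge-regularized autoencoder extreme learning machine with random, untrained input weights, a Gaussian similarity graph, and a generic damped diffusion update. All function names, the bandwidth heuristic, and the parameters `n_hidden`, `reg`, `n_iter`, and `alpha` are hypothetical choices made for illustration.

```python
import numpy as np


def elm_autoencoder_embed(X, n_hidden=32, reg=1e-2, seed=0):
    """Hidden-layer representation of an autoencoder ELM (illustrative).

    Input weights are random and never trained; only the output weights
    `beta` are fitted, by ridge regression, to reconstruct X from H.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W = rng.standard_normal((n_features, n_hidden)) / np.sqrt(n_features)
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden activations
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return H, beta


def gaussian_similarity(Z):
    """Dense sample-to-sample similarity with a median-distance bandwidth
    (the bandwidth rule is an assumption, not taken from the paper)."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / sigma2)


def self_diffuse(S, n_iter=5, alpha=0.8):
    """Generic iterative graph diffusion (assumed update rule):
    A <- alpha * P A P^T + (1 - alpha) * I, with P the row-normalized S."""
    P = S / S.sum(axis=1, keepdims=True)
    eye = np.eye(S.shape[0])
    A = S.copy()
    for _ in range(n_iter):
        A = alpha * (P @ A @ P.T) + (1.0 - alpha) * eye
    return (A + A.T) / 2.0  # enforce exact symmetry
```

On an expression matrix `X` with samples in rows, `H, _ = elm_autoencoder_embed(X)` followed by `self_diffuse(gaussian_similarity(H))` yields a smoothed similarity matrix that could then be passed to a graph-based clustering method such as spectral clustering.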

Source journal: Journal of Computational Biology
Category: Biology; Computer Science, Interdisciplinary Applications
CiteScore: 3.60
Self-citation rate: 5.90%
Articles published: 113
Review time: 6-12 weeks
Journal description: Journal of Computational Biology is the leading peer-reviewed journal in computational biology and bioinformatics, publishing in-depth statistical, mathematical, and computational analysis of methods, as well as their practical impact. Available only online, this is an essential journal for scientists and students who want to keep abreast of developments in bioinformatics. Journal of Computational Biology coverage includes:
-Genomics
-Mathematical modeling and simulation
-Distributed and parallel biological computing
-Designing biological databases
-Pattern matching and pattern detection
-Linking disparate databases and data
-New tools for computational biology
-Relational and object-oriented database technology for bioinformatics
-Biological expert system design and use
-Reasoning by analogy, hypothesis formation, and testing by machine
-Management of biological databases