Tensor decomposition based on the potential low-rank and p-shrinkage generalized threshold algorithm for analyzing cancer multiomics data.

IF 0.9 · Zone 4 (Biology) · JCR Q4 · Mathematical & Computational Biology
Hang-Jin Yang, Yu-Xia Lei, Juan Wang, Xiang-Zhen Kong, Jin-Xing Liu, Ying-Lian Gao
Journal of Bioinformatics and Computational Biology. Published 2022-04-01 (Epub 2022-02-21). DOI: 10.1142/S0219720022500020 (https://doi.org/10.1142/S0219720022500020)
Citations: 0

Abstract

Tensor Robust Principal Component Analysis (TRPCA) has achieved promising results in the analysis of genomics data. However, the TRPCA model under the existing tensor singular value decomposition (t-SVD) framework does not fully extract the potential low-rank structure of the data, which yields suboptimal recovered components. In addition, the tensor nuclear norm (TNN) defined on the t-SVD treats all singular values by the same standard. Because it ignores the differences among singular values, the TNN fails to preserve the main information well. To preserve the heterogeneous structure in the low-rank information, we propose a novel TNN and extend it to the TRPCA model. The potential low-rank space may contain important information, so we learn the low-rank structural information from the core tensor. The singular value space contains the association information between genes and cancers. The p-shrinkage generalized threshold function is used to preserve the low-rank properties of the larger singular values. The optimization problem is solved by the alternating direction method of multipliers (ADMM). Clustering and feature selection experiments are performed on the TCGA data set. The experimental results show that the proposed model is more promising than other state-of-the-art tensor decomposition methods.
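
The abstract gives no formulas, so the following is only a rough sketch of the two ingredients it names, not the authors' algorithm. It assumes the commonly used form of p-shrinkage, sign(x) · max(|x| − λ^(2−p) |x|^(p−1), 0), which reduces to soft thresholding at p = 1 and shrinks large values less when p < 1. It also assumes the standard FFT-based t-SVD along the third mode. The function names `p_shrink` and `tsvd_p_threshold` are hypothetical, and the paper's novel TNN and core-tensor learning step are not reproduced here.

```python
import numpy as np

def p_shrink(x, lam, p):
    """Generalized p-shrinkage (assumed form); p = 1 is soft thresholding."""
    x = np.asarray(x, dtype=float)
    mag = np.abs(x)
    # Replace zeros by 1 so that mag ** (p - 1) is defined when p < 1.
    safe = np.where(mag > 0, mag, 1.0)
    shrunk = np.maximum(mag - lam ** (2.0 - p) * safe ** (p - 1.0), 0.0)
    # Entries that were exactly zero stay zero.
    return np.sign(x) * np.where(mag > 0, shrunk, 0.0)

def tsvd_p_threshold(T, lam, p):
    """Shrink the singular values of every frontal slice in the Fourier domain."""
    Tf = np.fft.fft(T, axis=2)          # t-SVD works slice-wise after an FFT
    out = np.empty_like(Tf)
    for k in range(T.shape[2]):
        U, s, Vh = np.linalg.svd(Tf[:, :, k], full_matrices=False)
        # Scale the columns of U by the shrunken singular values.
        out[:, :, k] = (U * p_shrink(s, lam, p)) @ Vh
    # For real input the result is real up to round-off.
    return np.real(np.fft.ifft(out, axis=2))
```

In an ADMM loop for a TRPCA-type model, an operator like `tsvd_p_threshold` would typically play the role of the low-rank update, alternating with a sparse-component update and a multiplier update. How the paper weights or scales each step is not stated in the abstract.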

Source journal
Journal of Bioinformatics and Computational Biology (Mathematical & Computational Biology)
CiteScore: 2.10
Self-citation rate: 0.00%
Articles published: 57
Journal description: The Journal of Bioinformatics and Computational Biology aims to publish high quality, original research articles, expository tutorial papers and review papers as well as short, critical comments on technical issues associated with the analysis of cellular information. The research papers will be technical presentations of new assertions, discoveries and tools, intended for a narrower specialist community. The tutorials, reviews and critical commentary will be targeted at a broader readership of biologists who are interested in using computers but are not knowledgeable about scientific computing, and equally, computer scientists who have an interest in biology but are not familiar with current thrusts nor the language of biology. Such carefully chosen tutorials and articles should greatly accelerate the rate of entry of these new creative scientists into the field.