SeqBMC: Single-cell data processing using iterative block matrix completion algorithm based on matrix factorisation

IF 1.9; Region 4 (Biology); JCR Q4 (Cell Biology)
Gong Lejun, Yu Like, Wei Xinyi, Zhou Shehai, Xu Shuhua
DOI: 10.1049/syb2.70003
Journal: IET Systems Biology, vol. 19, no. 1
Published: 2025-02-12
Publication type: Journal Article
URL: https://onlinelibrary.wiley.com/doi/10.1049/syb2.70003
Citations: 0

Abstract

With the development of high-throughput sequencing technology, the analysis of single-cell RNA sequencing data has become a focus of current research, and downstream analysis of the preprocessed gene expression matrix is an active topic. This paper proposes SeqBMC, an iterative block matrix completion algorithm based on matrix factorisation. The algorithm completes the missing values in the gene expression matrix that are caused by limitations of sequencing technology. The matrix is first partitioned into blocks according to gene frequency; each smaller block is then completed by matrix factorisation, and the biological zeros that may exist in each block are retained. Experimental results show that completion significantly improves classification performance on the gene expression matrix, reaching an F1 score of 86.81%, which aids the recognition of cell types in sequencing data. The method relies only on machine learning and requires little biology-related prior knowledge. Compared with ALRA, SeqBMC improves accuracy by 5.47% and F1 score by 5.03%, indicating that SeqBMC has significant advantages for matrix completion of single-cell RNA sequencing data.
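
The three steps named in the abstract (block by gene frequency, complete each block by matrix factorisation, retain biological zeros) can be illustrated with a minimal sketch. The abstract does not give the details, so everything below is an assumption rather than the authors' implementation: "gene frequency" is read as the fraction of cells in which a gene is non-zero, blocks are equal-sized groups of genes sorted by that frequency, the factorisation is a plain SGD fit on the non-zero entries, and a prediction below a hypothetical `zero_cutoff` is kept as a biological zero. The function names and all parameters are made up for illustration, and the input is assumed to hold small, non-negative (for example log-normalised) values.

```python
import random

def gene_frequency(X):
    """Fraction of cells (rows) in which each gene (column) is non-zero."""
    n_cells = len(X)
    return [sum(1 for row in X if row[j] > 0) / n_cells for j in range(len(X[0]))]

def split_genes_by_frequency(X, n_blocks):
    """Sort gene indices by detection frequency, then cut into at most n_blocks groups."""
    freq = gene_frequency(X)
    order = sorted(range(len(freq)), key=lambda j: freq[j])
    size = -(-len(order) // n_blocks)  # ceiling division
    return [order[i:i + size] for i in range(0, len(order), size)]

def factorise_block(B, rank, n_epochs=200, lr=0.01, reg=0.01, seed=0):
    """Fit B ~ P Q^T by SGD on the non-zero entries only; return the full product."""
    rng = random.Random(seed)
    n, m = len(B), len(B[0])
    P = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(m)]
    observed = [(i, j) for i in range(n) for j in range(m) if B[i][j] > 0]
    for _ in range(n_epochs):
        rng.shuffle(observed)
        for i, j in observed:
            err = B[i][j] - sum(p * q for p, q in zip(P[i], Q[j]))
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return [[sum(p * q for p, q in zip(P[i], Q[j])) for j in range(m)] for i in range(n)]

def seqbmc_sketch(X, n_blocks=4, rank=2, zero_cutoff=0.5):
    """Block-wise completion of zero entries; observed values are never changed."""
    out = [list(map(float, row)) for row in X]
    for genes in split_genes_by_frequency(X, n_blocks):
        B = [[row[j] for j in genes] for row in X]
        approx = factorise_block(B, rank)
        for i, row in enumerate(B):
            for c, j in enumerate(genes):
                if row[c] == 0:
                    # Assumed rule: predictions below the cutoff stay biological zeros.
                    out[i][j] = approx[i][c] if approx[i][c] >= zero_cutoff else 0.0
    return out
```

Calling `seqbmc_sketch(X)` on a cells-by-genes list of lists returns a matrix of the same shape in which only the zero entries may have changed. Working block by block keeps each factorisation small, which is presumably part of the motivation for blocking; the cutoff rule for biological zeros is only one simple possibility.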

Source journal: IET Systems Biology
Category: Biology (Mathematical and Computational Biology)
CiteScore: 4.20
Self-citation rate: 4.30%
Articles published: 17
Review time: >12 weeks
Journal description: IET Systems Biology covers intra- and inter-cellular dynamics, using systems- and signal-oriented approaches. Papers that analyse genomic data in order to identify variables and basic relationships between them are considered if the results provide a basis for mathematical modelling and simulation of cellular dynamics. Manuscripts on molecular and cell biological studies are encouraged if the aim is a systems approach to dynamic interactions within and between cells. The scope includes the following topics: genomics, transcriptomics, proteomics, metabolomics, cells, tissue and the physiome; molecular and cellular interaction, gene, cell and protein function; networks and pathways; metabolism and cell signalling; dynamics, regulation and control; systems, signals, and information; experimental data analysis; mathematical modelling, simulation and theoretical analysis; biological modelling, simulation, prediction and control; methodologies, databases, tools and algorithms for modelling and simulation; modelling, analysis and control of biological networks; synthetic biology and bioengineering based on systems biology.