Automated Cell Recognition using Single-cell RNA Sequencing with Machine Learning

Chengqi Xu, Yuetian Chen, Yiyang Cao
Published in: Proceedings of the 2021 5th International Conference on Computational Biology and Bioinformatics
Publication date: 2021-12-26
DOI: 10.1145/3512452.3512455 (https://doi.org/10.1145/3512452.3512455)
Citation count: 0

Abstract

This paper investigates the strengths and limitations of different dimensionality reduction schemes and classification methods on specific single-cell RNA sequencing (scRNA-seq) data sets. Through systematic analysis and controlled-variable experiments, a pipeline was constructed from RPKM data to final cell type recognition. Multiple dimensionality reduction methods (PCA, AutoEncoder, ISOMAP, and the combination of PCA+t-SNE) and multiple classifiers (Random Forest, Support Vector Machine, etc.) were applied to measure the accuracy differences among the resulting solutions. By comparing the effect of different models and parameters on the final classification accuracy, this paper summarizes the information loss and classification performance of the different processing schemes on the data set, discusses their outlook, and seeks the best combination among them. Using the combination of PCA+SVM, this work obtained a global maximum accuracy of 53.13% and, based on this result, further explores the possibility of improving accuracy and of model transfer learning in a wider range of applications.
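The kind of comparison the abstract describes (one dimensionality reduction step followed by interchangeable classifiers) can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: it assumes scikit-learn, uses a synthetic matrix from `make_classification` as a stand-in for an RPKM cells-by-genes table, and the function name `compare_pipelines`, the number of components, and all hyperparameters are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_pipelines(X, y, n_components=20, seed=0):
    """Return held-out accuracy for PCA followed by each classifier.

    Hypothetical helper: the abstract does not specify the split,
    the scaling, or any hyperparameters used in the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    classifiers = {
        "PCA+SVM": SVC(kernel="rbf", C=1.0),
        "PCA+RandomForest": RandomForestClassifier(
            n_estimators=100, random_state=seed
        ),
    }
    scores = {}
    for name, clf in classifiers.items():
        # Same preprocessing and reduction for every classifier, so only
        # the classifier varies between runs (controlled-variable setup).
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components, random_state=seed),
            clf,
        )
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)
    return scores


# Synthetic stand-in for a cells-by-genes matrix (assumption, not real data):
# 300 "cells", 200 "genes", 4 "cell types".
X, y = make_classification(
    n_samples=300, n_features=200, n_informative=30, n_classes=4,
    random_state=0,
)
scores = compare_pipelines(X, y)
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Swapping `PCA` for another reducer (for example `sklearn.manifold.Isomap`) inside `make_pipeline` would extend the comparison along the other axis mentioned in the abstract; the accuracies printed here come from synthetic data and say nothing about the 53.13% reported in the paper.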