Comparison of different machine learning algorithms on Cell Classification with scRNA-seq after Principal Component Analysis
Jingkai Guo, Jing Gao
2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), 2022-04-15
DOI: 10.1109/ICSP54964.2022.9778439
Citation count: 0
Abstract
This project processes single-cell RNA sequencing (scRNA-seq) data to predict cell type. Several commonly used machine learning algorithms were trained and compared on a large dataset. First, principal component analysis (PCA) was applied to reduce the dimensionality of the data. Four classifiers were then built on the reduced data in each iteration: logistic regression (LR), k-nearest neighbors (kNN), a support vector machine (SVM), and a decision tree with boosting. Of the models tested, the best approach was PCA for dimensionality reduction followed by logistic regression as the classifier, which reached an accuracy of 54.4% on the test data.
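
The comparison described in the abstract can be sketched roughly as follows, assuming scikit-learn. This is an illustration only, not the authors' code: the abstract does not specify the dataset, the number of principal components, the hyperparameters, or the boosting method, so the synthetic data, `n_components=30`, the standardization step, and the use of AdaBoost over decision stumps are all assumptions made for the example. The scores it prints have no relation to the 54.4% reported above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a cells x genes expression matrix (NOT the paper's data).
X, y = make_classification(
    n_samples=600,
    n_features=200,
    n_informative=20,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The four classifier families named in the abstract; all settings are assumed.
classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
    # AdaBoost's default base learner is a depth-1 decision tree.
    "Boosted decision tree": AdaBoostClassifier(n_estimators=100, random_state=0),
}


def evaluate(n_components=30):
    """Fit scale -> PCA -> classifier for each model; return test accuracies."""
    scores = {}
    for name, clf in classifiers.items():
        model = make_pipeline(
            StandardScaler(),
            PCA(n_components=n_components, random_state=0),
            clf,
        )
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)
    return scores


scores = evaluate()
for name, acc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Wrapping the scaler and PCA in the same pipeline as the classifier means both are fitted on the training split only, so no information from the test cells leaks into the projection.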