Lilu Guo, S. Bu, Yongjin Gan, Jianbo Lu, Xiaoshu Zhu
Proceedings of the 2020 Artificial Intelligence and Complex Systems Conference, published 2020-08-20. DOI: 10.1145/3407703.3407731 (https://doi.org/10.1145/3407703.3407731)
SNN-Cliq++: Improved Cell Clustering Method Based on Graph
Cell typing using single-cell RNA-seq data underpins precision medicine, the study of development and evolution, and drug research and development. However, such data are characterized by ultrahigh dimensionality, small sample sizes, a lack of labels, and high noise, which pose challenges for traditional clustering methods: poor cell typing performance, high computational cost, and difficult parameter tuning. SNN-Cliq, a clustering algorithm for cell typing proposed in 2015, stands out for its simple and efficient computation, good scalability, and insensitivity to parameters. Building on the framework of that earlier work [5], we propose three improvements in our new method, SNN-Cliq++. First, we replace Euclidean distance with the Spearman correlation coefficient to measure the similarity between each pair of cells. Second, we optimize the parameter k by minimizing |clusterNum - trueNum|, the gap between the number of clusters found and the true number; this step adds little running time. Third, we add a negative indicator matrix that forbids connections between cells whose Spearman correlation coefficients are among the most negative. Across an extensive set of datasets, the new algorithm improves markedly on the original: NMI rises by 20.5% and ARI by 28.6% on average.
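The three modifications can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give enough detail to reproduce one. All function names (`spearman_matrix`, `snn_graph`, `count_components`, `choose_k`) and the parameter `n_neg` are hypothetical. Several simplifications are assumed: edge weights are plain counts of shared neighbors, ties in ranks are broken by order, and the number of connected components stands in for the cluster count, whereas SNN-Cliq itself finds and merges quasi-cliques.

```python
import numpy as np

def spearman_matrix(expr):
    """Cell-by-cell Spearman correlation; expr has shape (cells, genes).
    Simplification: ties get ordinal ranks rather than average ranks."""
    ranks = np.argsort(np.argsort(expr, axis=1), axis=1).astype(float)
    return np.corrcoef(ranks)

def snn_graph(expr, k=3, n_neg=1):
    """Shared-nearest-neighbor adjacency plus a negative indicator mask."""
    rho = spearman_matrix(expr)
    n = rho.shape[0]
    sim = rho.copy()
    np.fill_diagonal(sim, -np.inf)          # a cell is not its own neighbor
    knn = np.argsort(-sim, axis=1)[:, :k]   # k most correlated cells per cell

    # Negative indicator: True where j is among the n_neg most negatively
    # correlated cells of i (assumed reading of "top negative").
    neg = np.zeros((n, n), dtype=bool)
    lowest = np.argsort(rho, axis=1)[:, :n_neg]
    for i in range(n):
        for j in lowest[i]:
            if rho[i, j] < 0:
                neg[i, j] = neg[j, i] = True

    # Connect two cells when their k-NN lists overlap, unless forbidden.
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(set(knn[i]) & set(knn[j]))
            if shared > 0 and not neg[i, j]:
                adj[i, j] = adj[j, i] = shared
    return adj, neg

def count_components(adj):
    """Number of connected components (stand-in for the cluster count)."""
    n = adj.shape[0]
    seen = np.zeros(n, dtype=bool)
    count = 0
    for start in range(n):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        stack = [start]
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(adj[u]):
                if not seen[v]:
                    seen[v] = True
                    stack.append(v)
    return count

def choose_k(expr, true_num, k_values):
    """Pick the k that minimizes |clusterNum - trueNum|."""
    def gap(k):
        adj, _ = snn_graph(expr, k=k)
        return abs(count_components(adj) - true_num)
    return min(k_values, key=gap)

# Toy data: 3 cells with a rising profile, 3 with a falling one (20 genes).
rng = np.random.default_rng(0)
trend = np.arange(20, dtype=float)
expr = np.vstack([trend + rng.normal(scale=0.1, size=20) for _ in range(3)] +
                 [-trend + rng.normal(scale=0.1, size=20) for _ in range(3)])
adj, neg = snn_graph(expr, k=2)
best_k = choose_k(expr, true_num=2, k_values=[2, 3])
```

Because Spearman correlation depends only on gene ranks, the two toy groups are strongly anticorrelated with each other, so the mask in `neg` marks cross-group pairs as forbidden. Note that the search over k needs the true number of cell types, so in this form it applies only when that number is known or estimated in advance.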