Authors: Mingyi Wang, Ping Wu, Shu-Quan Xia
Published in: Proceedings of the 2003 International Conference on Neural Networks and Signal Processing
DOI: https://doi.org/10.1109/ICNNSP.2003.1279209
Improving performance of gene selection by unsupervised learning
Selecting significant genes from expression profiles is an important problem in microarray experiments for disease classification and prediction. Genes of interest are typically ranked by a statistical significance test, and the top-ranked genes are used. A problem with this approach is that many of these genes are highly correlated, whereas classification requires genes that are distinct yet still highly informative. In this paper, we propose an unsupervised feature selection algorithm to resolve this problem. The method retrieves groups of similar genes by measuring the similarity between them, and the redundancy within each group is then removed. The method requires no search and is therefore fast. Experiments on real biological data show that this approach significantly improves the performance of existing classifiers.
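The idea of ranking genes and then discarding near-duplicates can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a Welch t-statistic as the significance test, absolute Pearson correlation as the similarity measure, and a hypothetical cutoff `max_corr`; the function names and the synthetic data are likewise illustrative.

```python
import numpy as np

def t_scores(X, y):
    """Absolute Welch t-statistic per gene. X is samples x genes, y holds 0/1 labels."""
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(diff / se)

def select_nonredundant(X, y, k, max_corr=0.9):
    """Walk genes from most to least significant, keeping a gene only if its
    absolute correlation with every gene already kept is below max_corr
    (an assumed threshold). A single pass over the ranking, no search."""
    order = np.argsort(-t_scores(X, y))
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for g in order:
        if all(corr[g, h] < max_corr for h in kept):
            kept.append(int(g))
        if len(kept) == k:
            break
    return kept

# Synthetic data (illustrative only): gene 1 is a near copy of gene 0,
# gene 2 is informative on its own, genes 3 to 5 are pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
g0 = rng.normal(size=200) + 3 * y
g1 = g0 + 0.01 * rng.normal(size=200)
g2 = rng.normal(size=200) + 2 * y
noise = rng.normal(size=(200, 3))
X = np.column_stack([g0, g1, g2, noise])

selected = select_nonredundant(X, y, k=2)
print(selected)
```

In this toy setting, taking the two top-ranked genes directly would tend to return the duplicated pair, while the filtered selection keeps one of the pair together with the independently informative gene 2.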