Missing value imputation methods for gene-sample-time microarray data analysis
Yifeng Li, A. Ngom, L. Rueda
2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, 2010-05-02
DOI: 10.1109/CIBCB.2010.5510349 (https://doi.org/10.1109/CIBCB.2010.5510349)
With recent advances in microarray technology, the expression levels of genes across samples can be monitored simultaneously over a series of time points. Such three-dimensional microarray data, termed gene-sample-time (GST) microarray data, may contain missing values. Current microarray analysis methods require complete data sets, so either every row, column, or tube containing missing values must be removed from the original GST data, or the missing values must be estimated before analysis. Imputation of missing values is, however, recommended over removal of data in order to increase the effectiveness of analysis algorithms. In this paper, we extend automated imputation methods devised for two-dimensional microarray data to GST data. We implemented imputation methods for GST data based on Singular Value Decomposition (3SVDimpute), K-Nearest Neighbor (3KNNimpute), and gene and sample averages (3Aimpute), and show that the KNN-based methods yield the best results, with the lowest normalized root mean squared error.
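The general idea of average-based and neighbor-based imputation on a three-dimensional array can be illustrated with a minimal Python sketch. It is not the authors' 3Aimpute or 3KNNimpute: the function names, the genes x samples x time array layout, the use of NaN to mark missing entries, the Euclidean distance over jointly observed entries, and the NRMSE definition (root mean squared error over the hidden entries divided by their standard deviation) are all assumptions made for illustration.

```python
import numpy as np

def average_impute(gst):
    """Fill each missing entry with the mean of its gene's observed values.
    Hypothetical helper; gst is a genes x samples x time array with NaN gaps."""
    filled = gst.copy()
    global_mean = np.nanmean(gst)
    for g in range(gst.shape[0]):
        block = filled[g]                      # view: samples x time for gene g
        missing = np.isnan(block)
        if missing.any():
            observed = block[~missing]
            # Fall back to the global mean if the gene has no observed values.
            block[missing] = observed.mean() if observed.size else global_mean
    return filled

def knn_impute(gst, k=3):
    """Fill each missing entry with the mean of the k most similar genes
    that observed that entry. Hypothetical helper, not the paper's method."""
    n_genes = gst.shape[0]
    flat = gst.reshape(n_genes, -1)            # one row per gene
    filled = flat.copy()
    global_mean = np.nanmean(flat)
    for g in range(n_genes):
        miss = np.isnan(flat[g])
        if not miss.any():
            continue
        # Root mean squared difference over entries observed in both genes.
        dists = np.full(n_genes, np.inf)
        for h in range(n_genes):
            if h == g:
                continue
            shared = ~miss & ~np.isnan(flat[h])
            if shared.any():
                diff = flat[g, shared] - flat[h, shared]
                dists[h] = np.sqrt(np.mean(diff ** 2))
        order = np.argsort(dists)
        for j in np.flatnonzero(miss):
            donors = [h for h in order
                      if np.isfinite(dists[h]) and not np.isnan(flat[h, j])][:k]
            # Fall back to the global mean if no neighbor observed entry j.
            filled[g, j] = flat[donors, j].mean() if donors else global_mean
    return filled.reshape(gst.shape)

def nrmse(imputed, truth, mask):
    """Assumed definition: RMSE over the masked entries / their std in truth."""
    err = imputed[mask] - truth[mask]
    return np.sqrt(np.mean(err ** 2)) / np.std(truth[mask])

# Synthetic demonstration data (not from the paper): 30 genes, 4 samples, 5 times.
rng = np.random.default_rng(0)
truth = rng.normal(size=(30, 1, 1)) + 0.1 * rng.normal(size=(30, 4, 5))
mask = rng.random(truth.shape) < 0.05          # hide about 5% of the entries
data = truth.copy()
data[mask] = np.nan

print("average NRMSE:", nrmse(average_impute(data), truth, mask))
print("KNN NRMSE:    ", nrmse(knn_impute(data), truth, mask))
```

Flattening each gene's samples x time slice into one row is only one way to carry a two-dimensional method over to GST data; the scores printed on this synthetic array say nothing about the results reported in the paper.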