Supervised Methods with Genomic Data: a Review and Cautionary View
R. Díaz-Uriarte
Data Analysis and Visualization in Genomics and Proteomics, published 2005-06-15
DOI: 10.1002/0470094419.CH12 (https://doi.org/10.1002/0470094419.CH12)
Citations: 30
Abstract
We review well-accepted methods for addressing questions about differential gene expression and class prediction from gene expression data. We highlight some new topics that deserve more attention: testing for differential expression of specific groups of genes, intra-group heterogeneity and class prediction, gene interaction in predictors, visualisation, difficulties in the biological interpretation of predictor genes and molecular signatures, and the use of statistics based on the ROC (receiver operating characteristic) curve for evaluating predictors and differential expression. We end with a review of some serious problems that can limit the potential of these methods. We focus especially on two: inadequate assessment of the performance of new methods (due to inadequate estimation of error rates and to the use of few and “easy” data sets), and failure to recognise observational studies and to include the needed covariates. We close with a comment on the need for freely available source code.