Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data
Carmen D. Tekwe, Alan R. Dabney, Raymond J. Carroll
2011 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS)
Publication date: 2011-12-01
DOI: 10.1109/GENSiPS.2011.6169453 (https://doi.org/10.1109/GENSiPS.2011.6169453)
Citations: 19
Abstract
Protein abundance in quantitative proteomics is often based on observed spectral features derived from LC-MS experiments. Peak intensities are largely non-Normal in distribution. Furthermore, LC-MS data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests and the parametric accelerated failure time (AFT) survival model with the Weibull distribution, were used to detect differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated data sets.
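To make the left-censoring idea concrete, the following is a minimal sketch of the log-likelihood for a two-group Weibull AFT model in which a missing peak intensity contributes the probability of falling below a detection limit, rather than a density value. The abstract does not give the authors' parameterization or implementation, so everything here is an assumption for illustration: the function name `weibull_aft_loglik`, the parameter vector `(mu, beta, log_sigma)`, the use of `None` to mark a censored intensity, and the single shared `detection_limit`.

```python
import math


def weibull_aft_loglik(params, intensities, groups, detection_limit):
    """Log-likelihood of a two-group Weibull AFT model with left-censoring.

    Assumed (hypothetical) parameterization:
      params = (mu, beta, log_sigma)
      scale  = exp(mu + beta * x), where x is the 0/1 group indicator
      shape  = 1 / exp(log_sigma)
    An intensity of None is treated as left-censored at detection_limit.
    """
    mu, beta, log_sigma = params
    shape = 1.0 / math.exp(log_sigma)
    total = 0.0
    for y, x in zip(intensities, groups):
        scale = math.exp(mu + beta * x)
        if y is None:
            # Censored peak: contributes log F(c) = log(1 - exp(-(c/scale)^shape)).
            z = (detection_limit / scale) ** shape
            total += math.log(-math.expm1(-z))
        else:
            # Observed peak: contributes the Weibull log-density log f(y).
            z = (y / scale) ** shape
            total += (math.log(shape / scale)
                      + (shape - 1.0) * math.log(y / scale)
                      - z)
    return total


# Toy call with made-up numbers: two observed peaks and one censored peak.
example = weibull_aft_loglik((0.0, 0.5, 0.0), [1.2, None, 2.5], [0, 0, 1], 0.8)
```

Under these assumptions, a test for differential expression would compare the maximized log-likelihood with `beta` free against the one with `beta` fixed at 0, using a numerical optimizer that is not shown here.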