Methodological approach from the Best Overall Team in the sbv IMPROVER Diagnostic Signature Challenge

A. Tarca, N. Than, R. Romero
Systems biomedicine (Austin, Tex.), vol. 1, pp. 217-227. Published 2013-09-12. DOI: 10.4161/sysb.25980 (https://doi.org/10.4161/sysb.25980)
Citations: 24

Abstract

The sbv IMPROVER Diagnostic Signature Challenge used crowdsourcing to identify the best methods to classify clinical samples using transcriptomics data. Participating teams used public microarray data sets to develop prediction models in four disease areas, and then made predictions on blinded test data generated by the organizers. Here we describe the approach of the team from the Perinatology Research Branch (Team PRB; AL Tarca, R Romero), which was awarded the best-performing entrant prize among 54 entrants. The key elements of our approach included: (1) selection of training data sets by trial and error; (2) removal of batch effects by pre-processing the test and training data together; (3) the use of statistical significance and magnitude of change to select biomarkers; and (4) optimization of the number of biomarkers via the cross-validated performance of a simple linear discriminant analysis (LDA) model. Not only were our resulting models ranked consistently high, but they also produced parsimonious signatures of as few as two genes, whereas most of the other top-ranked teams used hundreds of genes for prediction.
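
Elements (3) and (4) above can be illustrated with a minimal sketch. This is not the team's code. It assumes a two-class problem with log2-scale expression values, uses the magnitude of a Welch t-statistic as a stand-in for statistical significance (the abstract does not name the test), and implements a basic two-class LDA with pooled covariance. All function names, the fold-change cutoff, the small ridge term, the candidate signature sizes, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def rank_genes(X, y, min_abs_lfc=0.5):
    """Rank genes by |Welch t| among those passing a log fold-change cutoff."""
    a, b = X[y == 1], X[y == 0]
    lfc = a.mean(axis=0) - b.mean(axis=0)  # magnitude of change (log2 scale assumed)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    t = lfc / (se + 1e-12)                 # significance proxy (assumption)
    score = np.where(np.abs(lfc) >= min_abs_lfc, np.abs(t), 0.0)
    return np.argsort(-score)              # best gene first

def lda_fit(X, y, ridge=1e-3):
    """Two-class LDA with pooled covariance and equal priors."""
    m1, m0 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    centered = np.vstack([X[y == 1] - m1, X[y == 0] - m0])
    # The ridge term is an illustrative safeguard against a singular covariance.
    cov = centered.T @ centered / (len(X) - 2) + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(cov, m1 - m0)
    return w, w @ (m1 + m0) / 2.0          # weights, decision threshold

def lda_predict(model, X):
    w, c = model
    return (X @ w > c).astype(int)

def cv_accuracy(X, y, n_genes, n_folds=5, seed=0):
    """Cross-validated accuracy; genes are re-selected inside every fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    correct = 0
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        genes = rank_genes(X[train], y[train])[:n_genes]
        model = lda_fit(X[train][:, genes], y[train])
        correct += np.sum(lda_predict(model, X[test_idx][:, genes]) == y[test_idx])
    return correct / len(y)

def choose_signature(X, y, candidates=(1, 2, 3, 5, 10, 20)):
    """Pick the gene count with the best CV accuracy (ties go to fewer genes)."""
    scores = {k: cv_accuracy(X, y, k) for k in candidates}
    best_k = max(candidates, key=lambda k: (scores[k], -k))
    return rank_genes(X, y)[:best_k], scores

# Synthetic demo: 60 samples x 200 genes; only genes 0 and 1 differ between classes.
rng = np.random.default_rng(42)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 200))
X[y == 1, :2] += 2.0
signature, scores = choose_signature(X, y)
print("selected genes:", signature.tolist())
print("CV accuracy by signature size:", {k: round(float(v), 3) for k, v in scores.items()})
```

Re-ranking the genes inside each fold is a design choice of this sketch: it keeps the held-out samples from influencing gene selection, so the cross-validated accuracy used to choose the signature size is not inflated. Breaking ties in favor of fewer genes reflects the emphasis on parsimonious signatures.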