mCOPA: analysis of heterogeneous features in cancer expression data.

Chenwei Wang, Alperen Taciroglu, Stefan R Maetschke, Colleen C Nelson, Mark A Ragan, Melissa J Davis
Journal of clinical bioinformatics. Published 2012-12-10. DOI: 10.1186/2043-9113-2-22 (https://doi.org/10.1186/2043-9113-2-22)
Citations: 17

Abstract

Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of the TMPRSS2 and ETS family gene fusion events in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset.
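As a rough illustration of the idea described above, the following is a minimal sketch of a COPA-style transformation extended to flag both tails. It assumes the usual COPA formulation, in which each gene's values are centred on the median and scaled by the median absolute deviation (MAD). The function names, the example values, and the fixed `cutoff` are hypothetical choices made for illustration; this is not the authors' implementation, and the specific refinements in mCOPA are not described in this abstract.

```python
from statistics import median


def copa_transform(values):
    """Median-centre one gene's expression values and scale by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Constant gene: there is no meaningful scale, so report zeros.
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]


def find_outliers(values, cutoff=3.0):
    """Return sample indices flagged as over- or under-expressed outliers.

    `cutoff` is a hypothetical threshold on the transformed scale.
    """
    scores = copa_transform(values)
    over = [i for i, s in enumerate(scores) if s > cutoff]
    under = [i for i, s in enumerate(scores) if s < -cutoff]
    return over, under


# Made-up expression values for one gene across ten samples.
expression = [5.0, 5.2, 4.9, 5.1, 5.0, 9.8, 5.3, 1.1, 4.8, 5.1]
over, under = find_outliers(expression)
print("over-expressed samples:", over)
print("under-expressed samples:", under)
```

Checking the lower tail as well as the upper tail is what lets a sketch like this surface under-expressed samples, which an approach restricted to over-expressed outliers would miss.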

Results: We compare our method to other feature-selection approaches, and demonstrate that mCOPA frequently selects more-informative features than do differential expression or variance-based feature selection approaches, and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data, and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours.

Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
