Detailed evaluation of cancer sequencing pipelines in different microenvironments and heterogeneity levels.

Turkish journal of biology = Turk biyoloji dergisi Pub Date : 2021-04-20 eCollection Date: 2021-01-01 DOI:10.3906/biy-2008-8
Batuhan Kısakol, Şahin Sarıhan, Mehmet Arif Ergün, Mehmet Baysan
Volume 45, issue 2, pages 114-126. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8068765/pdf/

Abstract

Next-generation sequencing (NGS) is becoming increasingly important in cancer research as the technology becomes more accessible to researchers. Sequence data produced by NGS technologies must be processed by a series of bioinformatics algorithms within a pipeline to convert raw data into meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for each step. Detailed benchmarking of these algorithms under different scenarios is therefore crucial for the efficient use of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping algorithms combined with four variant discovery algorithms), run with recommended settings, in capturing single nucleotide variants. Across different heterogeneity levels in real and simulated samples, we observed significant discrepancies in variant calls among the tested pipelines, with overall high specificity and low sensitivity. In addition to evaluating pipelines individually, we constructed pipeline combinations and tested their performance. In these analyses, certain pipelines complemented each other much better than others, and such combinations outperformed individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and that sample heterogeneity should be considered in algorithm optimization.
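
The evaluation described above can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not name the mapping or variant calling tools, so `mapper_A`, `caller_1`, and the other names are hypothetical placeholders, and the variant positions are toy data rather than results from the study. The sketch assumes that a "pipeline combination" means the union of two pipelines' call sets, which is one plausible reading; the paper may combine calls differently.

```python
from itertools import product

# Hypothetical placeholder names; the abstract does not name the tools.
MAPPERS = ["mapper_A", "mapper_B", "mapper_C"]
CALLERS = ["caller_1", "caller_2", "caller_3", "caller_4"]


def enumerate_pipelines(mappers, callers):
    """Pair every mapper with every caller (3 x 4 = 12 pipelines)."""
    return [f"{m}+{c}" for m, c in product(mappers, callers)]


def sensitivity_specificity(called, truth, candidates):
    """Score a call set against a truth set over a fixed set of candidate sites.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)
    """
    tp = len(called & truth)
    fn = len(truth - called)
    fp = len(called - truth)
    tn = len(candidates - truth - called)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def combine_union(*call_sets):
    """Assumed combination rule: a site is called if any pipeline calls it."""
    return set().union(*call_sets)


# Toy data (illustrative only): 1000 candidate sites, 10 true variants.
candidates = set(range(1, 1001))
truth = set(range(1, 11))
calls_x = {1, 2, 3, 4, 500}  # 4 true positives, 1 false positive
calls_y = {5, 6, 7, 501}     # 3 different true positives, 1 false positive

pipelines = enumerate_pipelines(MAPPERS, CALLERS)
sens_x, spec_x = sensitivity_specificity(calls_x, truth, candidates)
sens_u, spec_u = sensitivity_specificity(
    combine_union(calls_x, calls_y), truth, candidates
)

print(len(pipelines), "pipelines")
print(f"pipeline x alone: sensitivity={sens_x:.3f}, specificity={spec_x:.4f}")
print(f"union of x and y: sensitivity={sens_u:.3f}, specificity={spec_u:.4f}")
```

In this toy case the two call sets find different true variants, so their union raises sensitivity at a small cost in specificity. That is the kind of complementarity the abstract reports, although the numbers here are invented for illustration.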
