Whole Genome Variant Dataset for Enriching Studies across 18 Different Cancers.

Onco Pub Date : 2022-06-01 Epub Date: 2022-06-17 DOI:10.3390/onco2020009
John Torcivia, Kawther Abdilleh, Fabian Seidl, Owais Shahzada, Rebecca Rodriguez, David Pot, Raja Mazumder
{"title":"Whole Genome Variant Dataset for Enriching Studies across 18 Different Cancers.","authors":"John Torcivia, Kawther Abdilleh, Fabian Seidl, Owais Shahzada, Rebecca Rodriguez, David Pot, Raja Mazumder","doi":"10.3390/onco2020009","DOIUrl":null,"url":null,"abstract":"<p><p>Whole genome sequencing (WGS) has helped to revolutionize biology, but the computational challenge remains for extracting valuable inferences from this information. Here, we present the cancer-associated variants from the Cancer Genome Atlas (TCGA) WGS dataset. This set of data will allow cancer researchers to further expand their analysis beyond the exomic regions of the genome to the entire genome. A total of 1342 WGS alignments available from the consortium were processed with VarScan2 and deposited to the NCI Cancer Cloud. The sample set covers 18 different cancers and reveals 157,313,519 pooled (non-unique) cancer-associated single-nucleotide variations (SNVs) across all samples. There was an average of 117,223 SNVs per sample, with a range from 1111 to 775,470 and a standard deviation of 163,273. The dataset was incorporated into BigQuery, which allows for fast access and cross-mapping, which will allow researchers to enrich their current studies with a plethora of newly available genomic data.</p>","PeriodicalId":74339,"journal":{"name":"Onco","volume":"2 2","pages":"129-144"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10571071/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Onco","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/onco2020009","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/6/17 0:00:00","PubModel":"Epub","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Whole genome sequencing (WGS) has helped to revolutionize biology, but the computational challenge remains for extracting valuable inferences from this information. Here, we present the cancer-associated variants from the Cancer Genome Atlas (TCGA) WGS dataset. This set of data will allow cancer researchers to further expand their analysis beyond the exomic regions of the genome to the entire genome. A total of 1342 WGS alignments available from the consortium were processed with VarScan2 and deposited to the NCI Cancer Cloud. The sample set covers 18 different cancers and reveals 157,313,519 pooled (non-unique) cancer-associated single-nucleotide variations (SNVs) across all samples. There was an average of 117,223 SNVs per sample, with a range from 1111 to 775,470 and a standard deviation of 163,273. The dataset was incorporated into BigQuery, which allows for fast access and cross-mapping, which will allow researchers to enrich their current studies with a plethora of newly available genomic data.

Abstract Image

Abstract Image

Abstract Image

全基因组变异数据集,用于18种不同癌症的丰富研究。
全基因组测序(WGS)有助于彻底改变生物学,但从这些信息中提取有价值的推论仍然是计算上的挑战。在此,我们展示了癌症基因组图谱(TCGA)WGS数据集中的癌症相关变体。这组数据将使癌症研究人员能够进一步将他们的分析范围从基因组的外显区扩展到整个基因组。使用VarScan2处理来自联合体的总共1342个WGS比对,并将其存入NCI癌症云。样本集涵盖了18种不同的癌症,并揭示了所有样本中157313519种合并的(非唯一的)癌症相关单核苷酸变异(SNV)。每个样本平均有117223个SNV,范围从1111到775470,标准偏差为163273。该数据集被整合到BigQuery中,它允许快速访问和交叉映射,这将使研究人员能够用大量新可用的基因组数据丰富他们当前的研究。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信