Comprehensive analysis of common coding sequence variants in Taiwanese Han population

Ya-Chi Lin, Joseph T. Tseng, Shuen-Lin Jeng, H. Sunny Sun
Biomarkers and Genomic Medicine, Volume 6, Issue 4, Pages 133-143. Published 2014-12-01. Journal Article.
DOI: 10.1016/j.bgm.2014.05.001
https://www.sciencedirect.com/science/article/pii/S2214024714000355
Citations: 1

Abstract

Genomic variation differs among ethnic populations, and information on population-specific genomic variants provides important insight into the links between genotypes and phenotypes. To facilitate genomic medicine research, this study aims to detect and characterize sequence variations enriched in the coding regions of the genome in the Chinese population residing in Taiwan. DNA from 11 unrelated Taiwanese individuals was enriched for coding regions (i.e., the exome) and then deeply sequenced. Approximately 30 Gb of high-quality data was obtained from massively parallel sequencing. On average, ∼60% of the total reads mapped uniquely to the human reference genome, and overall 97% of the target regions were covered by sequence reads, giving an average enrichment of ∼50-fold relative to target size. Comprehensive variant detection and analysis were performed with in-house bioinformatics pipelines, and information was collected on several types of variation, including single nucleotide variants, short insertions and deletions, and copy number variations. The sequence variations were cross-referenced against variants in public databases to identify ethnic-specific variants. To study the impact of sequence variations that are enriched in the Taiwanese Han population, variants present in at least two exomes (i.e., minor allele frequency >9%) were further annotated. Overall, we detected 308 loss-of-function variants in 291 genes in the Taiwanese Han Exome Sequencing dataset. Functional annotation revealed a significant pathological influence of these loss-of-function-associated genes on the risk of various human diseases, including lung cancer. This is the first next-generation sequencing (NGS) dataset to comprehensively report coding sequence variants in the Taiwanese Han population. The Taiwanese Han population, i.e., the Han Chinese residing in Taiwan, is normally underrepresented in population-genetics studies. We believe this study contributes valuable information that will have an impact on medical as well as population genetics.
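
The ">9%" figure in the abstract appears to follow from the sample size: 11 diploid exomes carry 22 allele copies at an autosomal site, so a variant seen in two exomes has a frequency of at least 2/22. The sketch below works through that arithmetic and applies the "at least two exomes" filter. It is an illustration only, not the authors' pipeline; the function names, the genotype encoding (per-sample count of the variant allele), and the diploid autosomal assumption are all hypothetical.

```python
from fractions import Fraction

N_EXOMES = 11      # from the abstract
MIN_CARRIERS = 2   # from the abstract: "present in at least two exomes"
PLOIDY = 2         # assumption: autosomal, diploid sites


def min_allele_frequency(carriers: int, n_samples: int = N_EXOMES) -> Fraction:
    """Lowest possible variant-allele frequency when `carriers` samples
    each carry a single copy (heterozygous) of the variant allele."""
    return Fraction(carriers, PLOIDY * n_samples)


def is_common(genotypes: list[int], min_carriers: int = MIN_CARRIERS) -> bool:
    """Return True if the variant is present in at least `min_carriers` samples.

    `genotypes` holds, per sample, the number of variant alleles (0, 1, or 2).
    This encoding is a hypothetical choice made for the example.
    """
    carriers = sum(1 for g in genotypes if g > 0)
    return carriers >= min_carriers


threshold = min_allele_frequency(MIN_CARRIERS)
print(threshold)                    # 1/11
print(f"{float(threshold):.2%}")    # 9.09%
```

Note that the filter counts carriers, not allele copies: under this reading, a variant found as a homozygote in a single exome also has a frequency of 2/22 but would not pass, because it is present in only one exome.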
