CpG island definition and methylation mapping of the T2T-YAO genome

Impact factor 11.5 · CAS Zone 2 (Biology) · JCR Q1, Genetics & Heredity
Ming Xiao, Rui Wei, Jun Yu, Chujie Gao, Fengyi Yang, Le Zhang
Journal: Genomics, Proteomics & Bioinformatics
Publication date: 2024-02-02
Publication type: Journal Article
DOI: 10.1093/gpbjnl/qzae009 (https://doi.org/10.1093/gpbjnl/qzae009)
Abstract Precisely defining and mapping all cytosine positions and their clusters, known as CpG islands (CGIs), together with their methylation status, is pivotal for genome-wide epigenetic studies, especially when population-centric reference genomes are ready for timely application. Here we first align two high-quality reference genomes from different ethnic backgrounds, T2T-YAO and T2T-CHM13, base by base and compute their genome-wide density-defined and position-defined CGIs. Second, mapping representative genome-wide methylation data from selected organs onto the two genomes, we find about 4.7–5.8% sequence divergence of various categories, depending on quality cutoffs. Genes within the divergent sequences are mostly associated with neurological functions. Moreover, CGIs associated with the divergent sequences differ significantly between the two genomes in CpG density and in the observed CpG/expected CpG (O/E) ratio. Finally, when whole-genome bisulfite sequencing (WGBS) data from European and American populations are mapped to each reference, the T2T-YAO genome not only has greater CpG site coverage than the T2T-CHM13 genome but also shows more hyper-methylated CpG sites. Our study suggests that future genome-wide epigenetic studies of Chinese populations rely on both the acquisition of high-quality methylation data and subsequent precision CGI mapping based on the Chinese T2T reference.
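The abstract compares CGIs by CpG density and O/E ratio without giving the thresholds used. As a rough illustration only, the sketch below computes those two quantities for a single sequence, using the commonly cited formula O/E = (CpG count × length) / (C count × G count). The function names `cpg_stats` and `looks_like_cgi`, and the default cutoffs (length ≥ 200 bp, GC fraction ≥ 0.5, O/E ≥ 0.6, the classic Gardiner-Garden and Frommer criteria), are assumptions for illustration and are not taken from the paper.

```python
def cpg_stats(seq: str) -> dict:
    """Return basic CpG statistics for one DNA sequence (illustrative only)."""
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return {
        "length": n,
        "gc_fraction": (c + g) / n if n else 0.0,
        # CpG density here means CpG dinucleotides per base
        "cpg_density": cpg / n if n else 0.0,
        # Observed/expected CpG ratio; defined as 0.0 when C or G is absent
        "oe_ratio": (cpg * n) / (c * g) if c and g else 0.0,
    }


def looks_like_cgi(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply assumed density-style cutoffs (not the paper's) to a sequence."""
    st = cpg_stats(seq)
    return (st["length"] >= min_len
            and st["gc_fraction"] >= min_gc
            and st["oe_ratio"] >= min_oe)


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(looks_like_cgi("CG" * 100))
```

In practice such statistics would be computed over sliding windows along each chromosome; the single-sequence form is kept here only to make the two quantities named in the abstract concrete.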
Source journal
Genomics, Proteomics & Bioinformatics
Subject area: Biochemistry, Genetics and Molecular Biology (Biochemistry)
CiteScore: 14.30
Self-citation rate: 4.20%
Articles published: 844
Review time: 61 days
About the journal: Genomics, Proteomics and Bioinformatics (GPB) is the official journal of the Beijing Institute of Genomics, Chinese Academy of Sciences / China National Center for Bioinformation and the Genetics Society of China. It aims to disseminate new developments in omics and bioinformatics, publish high-quality discoveries quickly, and promote open access and online publication. GPB welcomes submissions in all areas of life science, biology, and biomedicine, with a focus on large-scale data acquisition, analysis, and curation. Manuscripts covering omics and related bioinformatics topics are particularly encouraged. GPB is indexed or abstracted by PubMed/MEDLINE, PubMed Central, Scopus, BIOSIS Previews, Chemical Abstracts, and CSCD, among others.