Mango Pangenome Reveals Dramatic Impacts of Reference Bias on Genomic Analyses

Impact factor: 8.7 · Zone 1 (Agricultural and Forestry Sciences) · JCR Q1 (Agricultural and Biological Sciences)
Bilal Ahmad, Ying Su, Yani Hao, Tayyaba Razzaq, Rida Arshad, Yi Zhang, Yingchun Zhang, Xingyi Wang, Guizhou Huang, Xiangnian Su, Ting Hou, Chaochao Li, Xuanwen Yang, Chuanning Li, Zhenzhou Chu, Qiuyan Wang, Yu Zhang, Zhongxin Jin, Qi Xu, Xiaodong Xu, Yanling Peng, Guiqi Bi, Chengjie Chen, Yeyuan Chen, Hua Xiao, Jianfeng Huang, Yongfeng Zhou, Xinmin Tian
Journal: Horticulture Research
Published: 2025-07-01
DOI: 10.1093/hr/uhaf166 (https://doi.org/10.1093/hr/uhaf166)
Citations: 0

Abstract

Most genomics studies start by mapping sequencing data to a reference genome. The quality of the reference genome assembly, its genetic relatedness to the studied population, and the mapping method employed directly affect variant calling accuracy and subsequent genomic analyses, and can introduce reference bias that leads to erroneous conclusions. However, the impacts of reference bias and methods to reduce it have received limited attention. This study compared genomic analyses using four different reference genomes of mango (Mangifera indica): the two haploid assemblies of a haplotype-resolved telomere-to-telomere (T2T) genome assembly, a pangenome, and an older version of the reference genome available on NCBI. The choice of reference genome dramatically affected mapping efficiency and resulted in notable differences in the genetic variants called, particularly structural variations (SVs). Phylogenetic analysis was more sensitive to the choice of reference genome than genetic differentiation analysis was. Results of population genomic analyses of artificial selection during domestication and of SV hotspot regions varied across reference genomes. Notably, the gene enrichment analyses showed significant differences in the top enriched biological processes depending on the reference genome used. Overall, the mango pangenome outperformed the other reference genomes across various metrics, followed by the T2T reference genomes, as they captured greater diversity and effectively reduced reference bias. Our findings highlight the role of the mango pangenome in reducing reference bias and underscore the critical role of reference genome selection, suggesting that it is one of the most important factors in genomic studies.
Source journal
Horticulture Research (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 11.20
Self-citation rate: 6.90%
Articles published: 367
Review time: 20 weeks
About the journal: Horticulture Research is an open access journal affiliated with Nanjing Agricultural University. It ranked first in the Horticulture category of the Journal Citation Reports (Clarivate, 2022). The journal publishes original research articles, reviews, perspectives, comments, correspondence articles, and letters to the editor. Its scope covers all vital aspects of horticultural plants and disciplines, such as biotechnology, breeding, cellular and molecular biology, evolution, genetics, inter-species interactions, physiology, and the origin and domestication of crops.