Jiaxin Liu, Dongxin Mo, Lingyun Luo, Yilong Shi, Songsong Xu
Genomics, Volume 117, Issue 3, Article 111047. Published 2025-04-19.
DOI: 10.1016/j.ygeno.2025.111047
https://www.sciencedirect.com/science/article/pii/S0888754325000631
Sheep pan-genome retrieves the lost sequences and genes during domestication and selection
The reference genome plays a crucial role in uncovering genomic variation, which improves our understanding of the molecular mechanisms influencing biological traits. However, most sheep reference genomes derive from a single individual and therefore cannot adequately represent the genetic diversity of sheep. A map-to-pan strategy was used to construct the sheep pan-genome from short-read whole-genome sequencing data of 801 samples, comprising 724 domestic individuals from 151 sheep populations/breeds and 77 wild individuals from seven species of the genus Ovis. In total, 195 Mb of nonreference sequences absent from the ARS-UI_Ramb_v2.0 reference were assembled. The MAKER2 pipeline, integrating ab initio gene prediction, RNA-Seq, and protein homology, was used to annotate the nonreference sequences, and an additional 2,678 genes were predicted in them. We also identified 13,317 novel single nucleotide polymorphisms (SNPs) by mapping the sequences that could not be aligned to ARS-UI_Ramb_v2.0 to the nonreference sequences. Population genetic analyses based on the novel SNPs, including principal component analysis (PCA), phylogenetic tree construction, and ADMIXTURE, revealed clear phylogenetic relationships among the world's domestic sheep and their close wild relatives. Additionally, pan-genome-wide presence/absence variation (PAV) analysis showed a decreasing trend in gene number from wild to domestic populations. PAV selection analysis identified several genes, including GZMH, NFE2L3, GPR146, and CALHM6, whose presence frequencies changed significantly during the evolutionary history of sheep. Functional annotation revealed that these genes are primarily associated with immune responses. Our results highlight the value of the sheep pan-genome for identifying previously unknown genetic variation. These findings broaden our knowledge of the genetic diversity in sheep genomes and provide insight into the domestication and breeding history of sheep.
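The PAV selection idea described above, comparing how often a gene is present in wild versus domestic samples, can be illustrated with a minimal sketch. The abstract does not state which statistical test was used, so the two-sided Fisher's exact test below is an assumption chosen for illustration, and the presence/absence calls are made-up toy values, not data from the study. The function names are hypothetical.

```python
from math import comb

def presence_frequency(calls):
    """Fraction of samples in which a gene is called present (1) rather than absent (0)."""
    return sum(calls) / len(calls)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. wild, domestic); columns are present/absent counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x 'present' calls in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1)) if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Toy presence/absence calls for one hypothetical gene (assumed values).
wild = [1, 1, 1, 1, 1, 1, 1, 0]
domestic = [1, 0, 0, 0, 0, 0, 1, 0]

table = (sum(wild), len(wild) - sum(wild),
         sum(domestic), len(domestic) - sum(domestic))
print("wild frequency:", presence_frequency(wild))
print("domestic frequency:", presence_frequency(domestic))
print("p-value:", fisher_exact_two_sided(*table))
```

In a real analysis this comparison would be repeated for every gene in the pan-genome, with a multiple-testing correction applied to the resulting p-values.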
About the journal:
Genomics is a forum for describing the development of genome-scale technologies and their application to all areas of biological investigation.
As a journal that has evolved with the field that carries its name, Genomics focuses on the development and application of cutting-edge methods, addressing fundamental questions with potential interest to a wide audience. Our aim is to publish the highest quality research and to provide authors with rapid, fair and accurate review and publication of manuscripts falling within our scope.