The paradigm of genomic selection: Does it need an update?

Johannes A. Lenstra
Animal Research and One Health, volume 2, issue 4, pages 360-362. Published 2024-10-04. DOI: 10.1002/aro2.88. https://onlinelibrary.wiley.com/doi/10.1002/aro2.88

Abstract

The genetics and genomics of livestock is, as for other species, a dynamic and successful field of research. It is divided into two clearly different, although closely interacting, disciplines: molecular genetics and quantitative genetics. Remarkably, this contrast closely parallels the opposing views in the short and fierce war (1904–1906) between Mendelians and biometricians. Although the accepted views soon became more balanced [1, 2], the 20th century saw the emergence of two distinct genetic disciplines.

The development of molecular genetics is an amazing and unending series of pioneering success stories featuring a legion of Nobel Prize winners [3]: from chromosomes to DNA and to the central dogma; from recombinant DNA to PCR, microsatellites and SNPs; routine whole-genome sequencing (WGS), with telomere-to-telomere genomes and pangenomes as the newest toys; and now also CRISPR/Cas9 gene editing, although this is not yet of primary relevance for livestock [4, 5]. This was all typical laboratory science, which has now become a lot cleaner through automation and a growing emphasis on bioinformatics.

The hectic pace of progress is illustrated by the way promises made after one breakthrough were fulfilled after the next. Southern blotting of restriction fragment length polymorphism (RFLP) markers in the 1980s and, a little later, PCR–RFLP did not deliver the intended dense genetic map of a genome, so the discovery of microsatellites at the end of the decade was most timely. This allowed the genetic mapping of monogenic traits, but until 20 years ago most causative mutations in livestock species were found via the candidate gene approach [1, 6]. In the new millennium, microsatellites were replaced by high-density genome-wide SNP arrays, which deliver accurate genetic localizations. At the same time, WGS became affordable and monogenic causative variants became sitting ducks. However, we did not unravel the molecular mechanisms of complex traits [6, 7], so we now accept a less than satisfactory infinitesimal model of countless small contributions [4].

Starting in the decade of WWII, quantitative geneticists, who never touch a pipette, began to provide scientific support to the breeding industry and developed the concept of breeding values [8]. For a long time, this was based solely on phenotypes, but they did not hesitate to exploit the advances in the molecular field. During the last two decades of the millennium, the concept or dream of marker-assisted selection was an important source of inspiration [9, 10]. This led to the genetic localization of enough quantitative trait loci (QTL) to fill the Animal QTLdb, but these explain only a small part of the phenotypic variation [4].

Again, we needed another breakthrough to fulfill the promises already made. In a visionary paper, Meuwissen et al. proposed genomic selection (GS) based on the predicted contributions of variants across the whole genome to the breeding value [11]. GS became a resounding success [7], a triumph for quantitative genetics, which now ensures continuous genetic progress for the highly productive breeds all around the world. Breeders are happy, so why should we still care about the underlying molecular mechanisms?
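The core idea of GS, summing the predicted contributions of variants across the whole genome, can be sketched in a few lines. This is a minimal illustration, assuming that the per-marker effects have already been estimated in a reference population; the function name, the marker effects, the allele frequencies and the animals are all hypothetical toy values, not data from the cited work.

```python
from typing import Sequence


def genomic_breeding_value(dosages: Sequence[int],
                           effects: Sequence[float],
                           allele_freqs: Sequence[float]) -> float:
    """Sum marker effects weighted by centred allele dosage.

    dosages      -- copies of the counted allele per marker (0, 1 or 2)
    effects      -- assumed, previously estimated effect of one allele copy
    allele_freqs -- frequency of the counted allele, used for centring
    """
    if not (len(dosages) == len(effects) == len(allele_freqs)):
        raise ValueError("all inputs must have one entry per marker")
    return sum((x - 2.0 * p) * b
               for x, b, p in zip(dosages, effects, allele_freqs))


# Hypothetical toy data: three markers, two selection candidates.
effects = [0.5, -0.25, 0.125]
freqs = [0.5, 0.25, 0.5]
candidates = {"animal_A": [2, 0, 1], "animal_B": [0, 2, 1]}

gebv = {name: genomic_breeding_value(g, effects, freqs)
        for name, g in candidates.items()}
# Rank candidates from highest to lowest predicted breeding value.
ranking = sorted(gebv, key=gebv.get, reverse=True)
print(gebv, ranking)
```

In practice the panel contains tens of thousands of markers, and estimating their effects jointly is the hard part; the sketch only shows the prediction step.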

Of course, we care [7]. The molecular geneticists did not sit on their hands. WGS data reveal a multitude of missense and nonsense mutations, and we can predict their functional consequences. If a deleterious mutation results in loss of function of an indispensable protein, no homozygotes for this mutation are found in the population. This depletion of autozygotes through embryonic lethality is also observed at the haplotype level [4]. Less drastic effects of autozygous genotypes (or of compound heterozygotes, if the paternal and maternal gene copies carry different recessive deleterious mutations) are sterility, a genetic disorder, reduced fitness and/or low productivity. Deleterious mutations may also be dominant in haploinsufficient genes. All this also holds for the regulatory mutations controlling gene expression. These are more difficult to identify in WGS datasets, but may be detected via their deleterious effects.
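The reasoning behind homozygote depletion can be made concrete with a small calculation. The sketch below assumes random mating (Hardy-Weinberg proportions) and a Poisson approximation for the number of homozygotes; the function name and the counts are hypothetical. Published scans typically use more refined expectations based on recorded matings, so this is an illustration of the principle only.

```python
import math
from typing import Tuple


def homozygote_depletion(n_animals: int,
                         n_carriers: int,
                         n_homozygotes: int = 0) -> Tuple[float, float, float]:
    """Return (haplotype frequency, expected homozygotes, tail probability).

    n_carriers counts heterozygous carriers. The tail probability is the
    Poisson probability of observing n_homozygotes or fewer homozygous
    animals if the haplotype had no effect on survival.
    """
    if n_animals <= 0:
        raise ValueError("n_animals must be positive")
    freq = (n_carriers + 2 * n_homozygotes) / (2 * n_animals)
    expected = n_animals * freq ** 2
    p_value = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                  for k in range(n_homozygotes + 1))
    return freq, expected, p_value


# Hypothetical counts: 10,000 genotyped animals, 1,000 carriers, no homozygotes.
freq, expected, p_value = homozygote_depletion(10_000, 1_000, 0)
print(f"frequency={freq:.3f} expected={expected:.1f} p={p_value:.2e}")
```

With these toy counts about 25 homozygotes would be expected, so observing none is very unlikely by chance and points to a recessive lethal.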

Fitness and performance are polygenic traits, but their causative variants may be linked to “intermediate phenotypes” or “endophenotypes”, for instance gene expression levels, enzyme activities or metabolite concentrations [4, 12].

A more recent and important development is the discovery by novel long-read sequencing of large structural variations (SVs): deletions, copy number variations or divergent (and therefore non-recombining) alleles involving up to millions of base pairs. These have so far been largely overlooked by short-read WGS, but they change the gene repertoire, disrupt topologically associating domains and are associated with genetic diseases and several other traits [13-16]. Because of these observations, SVs are now considered a major source of phenotypic variation.

There are several examples of balancing selection, often through a favorable effect of heterozygous genotypes on production and a deleterious effect of homozygous mutant genotypes on fitness [4]. However, it is plausible that, for most gene variants, effects on fitness and on agricultural performance correlate. Given the multitude of variants that potentially affect fitness, linkage disequilibrium with these variants is likely to cover a large part of the genome. Plausibly, this explains the perception of the infinitesimal model that underlies GS.

It may make sense to improve GS by accommodating the effects of causative variants [4] insofar as their contributions to breeding values can be verified. This of course also applies to the large SVs. We do not yet know how many causative variants, many of which have low minor allele frequencies, would have the same predictive power as current GS.

Another open question: would a panel of 10,000 or 20,000 of the most consequential causative variants already be useful? Why these numbers? These are presently the numbers of variants that can be genotyped on low-density bead arrays for $50 or less, depending on the number of samples. For the highly productive cattle breeds, this is not too much to invest in a single cow. Thus, we would see another paradigm shift, already mentioned by Georges et al. [4]: the routine testing of both sires and dams, allowing a two-dimensional GS (2D-GS) of breeding mates. Instead of “one sire fits all”, a DNA-mediated female choice would accomplish an individual heterosis, which is expected to improve the performance, health and well-being of the offspring as well as the genetic diversity of the population.
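One way to picture the mate choice behind 2D-GS is sketched below. The article does not specify an algorithm, so the criterion used here is an assumption chosen for illustration: for each dam, pick the sire that minimizes the expected number of loci at which the offspring would be homozygous for a deleterious recessive variant. All names and genotypes are hypothetical, and the sketch ignores linkage between loci and the breeding values that a real scheme would weigh against this risk.

```python
from typing import Dict, Sequence


def expected_homozygous_loci(sire: Sequence[int], dam: Sequence[int]) -> float:
    """Expected number of loci where the offspring inherits two deleterious copies.

    Genotypes are dosages (0, 1 or 2) of the deleterious allele; each parent
    transmits the allele with probability dosage / 2, loci treated as independent.
    """
    if len(sire) != len(dam):
        raise ValueError("sire and dam must be genotyped for the same loci")
    return sum((s / 2) * (d / 2) for s, d in zip(sire, dam))


def choose_sire(dam: Sequence[int], sires: Dict[str, Sequence[int]]) -> str:
    """Return the name of the sire with the lowest expected risk for this dam."""
    return min(sires, key=lambda name: expected_homozygous_loci(sires[name], dam))


# Hypothetical toy genotypes at three deleterious recessive variants.
sires = {"sire_1": [1, 0, 0], "sire_2": [0, 1, 0]}
dams = {"dam_1": [1, 0, 1], "dam_2": [0, 1, 0]}

matings = {dam_name: choose_sire(genotype, sires)
           for dam_name, genotype in dams.items()}
print(matings)
```

Here each dam is paired with the sire that does not share her deleterious variants, which is the individual, female-specific choice that "one sire fits all" cannot provide.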

The feasibility of 2D-GS obviously depends on the number of variants that can be tested, the additional income per tested female and practical considerations. It will be more difficult for other species and minor breeds, so new and cheaper technologies are welcome. Only variants with a broad breed distribution would be useful across breeds. However, the many different breeds with separate histories and selection regimes ensure, together with de novo variants, an endless supply of causative variants that are worth investigating. For instance, they may implicate genes whose function we do not yet have a clue about.

Will this work? Or is it just another dream? At least, we would collect a lot of data on the functional effects of DNA variants. This is of fundamental interest and remains the core business of molecular genetics.

Johannes A. Lenstra: Conceptualization; investigation; methodology; writing—original draft; writing—review & editing.

The author declares no conflicts of interest.
