Four classic “de novo” genes all have plausible homologs and likely evolved from retro-duplicated or pseudogenic sequences

IF 2.3 | Zone 3 (Biology) | JCR Q3 (Biochemistry & Molecular Biology)
Joseph Hannon Bozorgmehr
Journal: Molecular Genetics and Genomics
Published: 2024-02-05
Publication type: Journal Article
DOI: 10.1007/s00438-023-02090-6 (https://doi.org/10.1007/s00438-023-02090-6)
Citation count: 0

Abstract

Despite being previously regarded as extremely unlikely, the idea that entirely novel protein-coding genes can emerge from non-coding sequences has gradually become accepted over the past two decades. Examples of “de novo origination”, resulting in lineage-specific “orphan” genes, lacking coding orthologs, are now produced every year. However, many are likely cases of duplicates that are difficult to recognize. Here, I re-examine the claims and show that four very well-known examples of genes alleged to have emerged completely “from scratch” (FLJ33706 in humans, Goddard in fruit flies, BSC4 in baker’s yeast and AFGP2 in codfish) may have plausible evolutionary ancestors in pre-existing genes. The first two are likely highly diverged retrogenes coding for regulatory proteins that have been misidentified as orphans. The antifreeze glycoprotein, moreover, may not have evolved from repetitive non-genic sequences but, as in several other related cases, from an apolipoprotein that could have become pseudogenized before later being reactivated. These findings detract from various claims made about de novo gene birth and show there has been a tendency not to invest the necessary effort in searching for homologs outside of a very limited syntenic or phylostratigraphic methodology. A robust approach is used for improving detection that draws upon similarities, not just in terms of statistical sequence analysis, but also relating to biochemistry and function, to obviate notable failures to identify homologs.


Source journal
Molecular Genetics and Genomics (Biology: Biochemistry & Molecular Biology)
CiteScore: 5.10
Self-citation rate: 3.20%
Articles published: 134
Review time: 1 month
About the journal: Molecular Genetics and Genomics (MGG) publishes peer-reviewed articles covering all areas of genetics and genomics. Any approach to the study of genes and genomes is considered, be it experimental, theoretical or synthetic. MGG publishes research on all organisms that is of broad interest to those working in the fields of genetics, genomics, biology, medicine and biotechnology. The journal investigates a broad range of topics, including these from recent issues: mechanisms for extending longevity in a variety of organisms; screening of yeast metal homeostasis genes involved in mitochondrial functions; molecular mapping of cultivar-specific avirulence genes in the rice blast fungus; and more.