Genomic AT Bias Coupled with Amino Acid Metabolism Modulates Codon Usage
Authors: Lucio Aliperti Car, Ignacio E Sánchez
Journal: Journal of Molecular Evolution
Publication date: 2025-05-20
DOI: https://doi.org/10.1007/s00239-025-10251-x
Citation count: 0
Abstract
Encoding of protein-coding sequences in a genome through evolution leads to characteristic proportions of codons and amino acids. Here, we present a simplified maximum entropy model that groups together codons that have the same GC (guanine + cytosine) content and code for the same amino acid. With seven interpretable parameters, the model accounts for the stoichiometry of genetic elements in over 50,000 genomes. Our model includes both the cost of a codon given a genomic GC content and the metabolic cost of the corresponding amino acid. Both costs are essential for accurate prediction of codon and amino acid abundances. The best implementation of the model includes a universal equilibrium value for the genomic GC content below 50%, as suggested by the literature. It also splits the twenty amino acids into two groups, forming strong (bases C and G) or weak (bases A and U) Watson-Crick base pairs with the anticodon, which differ in the strength of GC-dependent selection. The entropy-cost trade-off suggests that each organism has sorted out the genome encoding problem given a value for its genomic GC content. The empirical boundaries to this trade-off suggest minimal values for the amino acid and codon entropies, which may limit the GC content of natural genomes.
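The grouping and the entropy-cost idea described in the abstract can be illustrated with a small sketch. It is not the authors' seven-parameter model, whose functional form the abstract does not give. It assumes the standard genetic code (NCBI translation table 1), written in the DNA alphabet with T in place of U, and a hypothetical linear cost per group: `gc_cost` times the number of G/C bases in the codon, plus a per-amino-acid term `aa_cost`. Under those assumptions, maximizing entropy at fixed mean cost gives a Boltzmann-form distribution in which each group is weighted by its number of codons. The names `codon_groups`, `maxent_distribution`, `mean_gc`, `gc_cost`, and `aa_cost` are illustrative and do not come from the paper.

```python
import math
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1) in TCAG order; '*' marks stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def codon_groups():
    """Count sense codons in each (amino acid, number of G/C bases) group."""
    groups = Counter()
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
        if aa == "*":
            continue  # stop codons do not encode an amino acid
        gc = sum(base in "GC" for base in codon)
        groups[(aa, gc)] += 1
    return groups


def maxent_distribution(groups, gc_cost, aa_cost):
    """Maximum entropy distribution over groups for a linear cost.

    Hypothetical cost (an assumption, not the paper's): gc_cost * gc + aa_cost[aa].
    Each group's weight is its codon count times exp(-cost).
    """
    weights = {
        (aa, gc): n * math.exp(-(gc_cost * gc + aa_cost.get(aa, 0.0)))
        for (aa, gc), n in groups.items()
    }
    total = sum(weights.values())
    return {group: w / total for group, w in weights.items()}


def mean_gc(dist):
    """Expected GC fraction of a codon drawn from the group distribution."""
    return sum(p * gc / 3 for (_, gc), p in dist.items())


if __name__ == "__main__":
    groups = codon_groups()
    # Zero cost recovers uniform usage of the 61 sense codons.
    neutral = maxent_distribution(groups, gc_cost=0.0, aa_cost={})
    # A positive gc_cost penalizes G/C-rich codons (AT bias); the values are made up.
    at_biased = maxent_distribution(groups, gc_cost=0.5, aa_cost={"W": 1.0})
    print(f"mean codon GC, neutral:   {mean_gc(neutral):.3f}")
    print(f"mean codon GC, AT-biased: {mean_gc(at_biased):.3f}")
```

In this toy version, raising `gc_cost` shifts probability toward AT-rich groups and raising an entry of `aa_cost` lowers the abundance of that amino acid, which is the qualitative coupling between codon cost and amino acid cost that the abstract describes.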
About the journal:
Journal of Molecular Evolution covers experimental, computational, and theoretical work aimed at deciphering features of molecular evolution and the processes bearing on these features, from the initial formation of macromolecular systems through their evolution at the molecular level, the co-evolution of their functions in cellular and organismal systems, and their influence on organismal adaptation, speciation, and ecology. Topics addressed include the evolution of informational macromolecules and their relation to more complex levels of biological organization, including populations and taxa, as well as the molecular basis for the evolution of ecological interactions of species and the use of molecular data to infer fundamental processes in evolutionary ecology. This coverage accommodates such subfields as new genome sequences, comparative structural and functional genomics, population genetics, the molecular evolution of development, the evolution of gene regulation and gene interaction networks, in vitro evolution of DNA and RNA, molecular evolutionary ecology, and the development of methods and theory that enable molecular evolutionary inference, including, but not limited to, phylogenetic methods.