Francesco Cicconardi, Callum F McLellan, Alice Seguret, W Owen McMillan, Stephen H Montgomery
{"title":"Convergent Molecular Evolution Associated With Repeated Transitions to Gregarious Larval Behavior in Heliconiini.","authors":"Francesco Cicconardi, Callum F McLellan, Alice Seguret, W Owen McMillan, Stephen H Montgomery","doi":"10.1093/molbev/msaf179","DOIUrl":"10.1093/molbev/msaf179","url":null,"abstract":"<p><p>Collective behavior forms the basis for many antipredator strategies. Within Lepidoptera, larval gregariousness has evolved convergently across many phylogenetically disparate lineages. While the selection pressures shaping variation in larval social behaviors are well investigated, much less is known about the mechanisms that control social attraction and behavioral coordination. Similarly, little is known about how secondary selection pressures associated with social living shape genome evolution. Here, using genomic data for over 60 species from an adaptive radiation of Neotropical butterflies, the Heliconiini, in which gregarious behavior has evolved repeatedly, we explore the molecular basis of repeated convergent shifts toward gregarious larvae. We focus on three main areas of genomic evolution: differential selection on homologous genes, accelerated rates of evolution on noncoding regions of key genes, and differential gene expression in the brains of solitary and gregarious larvae. We identify strong signatures of convergent molecular evolution, on both coding and noncoding loci, in Heliconiini lineages, which evolved gregarious behavior. Molecular convergence is also detected at the transcriptomic level in larval brains, suggesting convergent shifts in gene regulation in neural tissue. Among loci showing strong signals of convergent evolution in gregarious lineages, we identify several strong candidates linked to neural activity, feeding behavior, and immune pathways. 
Our results suggest social living profoundly changes the selection pressures acting on multiple physiological, immunological, and behavioral traits.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342998/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144743141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Anaísa B Moreno, Kiran Paranjape, Martina Cederblom, Elisabeth Kay, Christian Dobre-Lereanu, Dan I Andersson, Lionel Guy
{"title":"Host-Specific Adaptation of Legionella pneumophila to Single and Multiple Hosts.","authors":"Anaísa B Moreno, Kiran Paranjape, Martina Cederblom, Elisabeth Kay, Christian Dobre-Lereanu, Dan I Andersson, Lionel Guy","doi":"10.1093/molbev/msaf161","DOIUrl":"10.1093/molbev/msaf161","url":null,"abstract":"<p><p>Legionella pneumophila is an endosymbiotic bacterial species able to infect and reproduce in various protist and human hosts. Upon entry into human lungs, they may infect lung macrophages, causing Legionnaires' disease (LD), an atypical pneumonia, using similar mechanisms as in their protozoan hosts, despite the 2 hosts being separated by a billion years of evolution. In this study, we used experimental evolution to identify genes conferring host specificity to L. pneumophila. To this end, we passaged L. pneumophila in 2 different hosts-Acanthamoeba castellanii and the human macrophage-like cells U937-separately and by switching between the hosts twice a week for a year. In total, we identified 1,518 mutations present in at least 5% of the population at the time of sampling. Forty-nine mutations were fixed in the 18 populations at the end of the experiment. Two interesting groups of mutations included (i) mutations in 4 different strain-specific genes involved in lipopolysaccharide (LPS) synthesis, found only in the lineages passaged with A. castellanii and (ii) mutations in the gene coding for LerC, a key regulator of protein effector expression, which was independently mutated in 6 lineages grown in presence of the macrophage cells. We propose that the mutations degrading the function of the regulator LerC improve the fitness of L. pneumophila in human-derived cells and that modifications in the LPS are beneficial for growth in A. castellanii. This study is a first step in further investigating determinants of host specificity in L. 
pneumophila.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12308824/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144560525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karl A Widney, Lauren C Phillips, Leo M Rusch, Shelley D Copley
{"title":"Mutations Elevate an Underground Pathway to a Physiologically Relevant Protopathway.","authors":"Karl A Widney, Lauren C Phillips, Leo M Rusch, Shelley D Copley","doi":"10.1093/molbev/msaf193","DOIUrl":"10.1093/molbev/msaf193","url":null,"abstract":"<p><p>Underground metabolic pathways-leaks in the metabolic network caused by promiscuous enzyme activities and nonenzymatic transformations-can provide the starting point for emergence of novel protopathways if a mutation or environmental change increases flux to a physiologically significant level. This early stage in pathway evolution, in which promiscuous enzymes are still serving their native functions and proper regulation has not yet emerged, is typically hidden from our view. We previously used laboratory evolution to evolve a novel four-step protopathway in ΔpdxB E. coli, which lacks an enzyme required for synthesis of pyridoxal 5'-phosphate (PLP). By sequencing population genomic DNA from samples archived during the evolution experiment, we have identified mutations that rose and fell in abundance in the population leading to JK1, the dominant clone after 150 population doublings. We have identified the order in which the four mutations arose in JK1 and the physiological effect of each mutation. The first mutation increases the rate of PLP synthesis. The second mutation did not impact PLP synthesis but rather created a cheater that thrived in the population by scavenging nutrients released from the fragile parental cells. Notably, the dominant lineages at the end of the experiment all derived from this cheater strain. The third mutation in JK1 destroyed a PLP phosphatase, which preserves precious PLP. Finally, the fourth mutation improved growth in glucose after the PLP synthesis problem had been solved. 
Together, these mutations resulted in restoration of PLP synthesis and a 32-fold increase in growth rate.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12393044/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rowan Green, Huw Richards, Deniz Ozbilek, Francesca Tyrrell, Victoria Barton, Ziang Zhang, Simon C Lovell, Danna R Gifford, Mato Lagator, Andrew J McBain, Rok Krašovec, Christopher G Knight
{"title":"Antimutator and Mutational Spectrum Effects Can Combine to Reduce Evolutionary Potential in Escherichia coli ΔnudJ.","authors":"Rowan Green, Huw Richards, Deniz Ozbilek, Francesca Tyrrell, Victoria Barton, Ziang Zhang, Simon C Lovell, Danna R Gifford, Mato Lagator, Andrew J McBain, Rok Krašovec, Christopher G Knight","doi":"10.1093/molbev/msaf182","DOIUrl":"10.1093/molbev/msaf182","url":null,"abstract":"<p><p>The rate of spontaneous mutation is a key factor in determining the capacity of a population to adapt to a novel environment, for example, a bacterial population exposed to antibiotics. Genetic and environmental factors controlling the mutation rate commonly also cause shifts in the relative rates of different mutational classes, i.e. the mutational spectrum. When the mutational spectrum is altered, the relatively enriched and depleted mutations may differ in their fitness effects. Here, we explore how a reduced mutation rate and altered mutational spectrum can contribute to adaptation in Escherichia coli. We measure mutation rates across a set of Nudix hydrolase deletants, finding multiple strains with an antimutator phenotype. We focus on the antimutator ΔnudJ, which can cause a 6-fold mutation rate reduction relative to the wildtype, with an altered mutational spectrum biased towards A > C transversions. Its reduced mutation rate, most pronounced at low population densities, appears to occur via NudJ's role in nucleotide and/or prenyl metabolism, with a reduced internal ATP pool. Its effects may be reversed by mutations to genes, including waaZ, affecting the outer membrane. Not only does nudJ deletion reduce the probability of antibiotic resistance arising at all but through enhancing an existing hotspot for low fitness A > C rifampicin resistance mutations reduces the expected fitness of strains when resistance does arise. 
Thus, our findings with ΔnudJ suggest future anti-evolution drug strategies could suppress spontaneous resistance evolution not only through minimizing resistance mutations but also by specifically limiting access to the fittest mutations.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12359138/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144743140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
James S Horton, Joshua L Cherry, Gretel Waugh, Tiffany B Taylor
{"title":"GnT Motifs Can Increase T:A→G:C Mutation Rates Over 1000-fold in Bacteria.","authors":"James S Horton, Joshua L Cherry, Gretel Waugh, Tiffany B Taylor","doi":"10.1093/molbev/msaf183","DOIUrl":"10.1093/molbev/msaf183","url":null,"abstract":"<p><p>Nucleotides across a genome do not mutate at equal frequencies. Instead, specific nucleotide positions can exhibit much higher mutation rates than the genomic average due to their immediate nucleotide neighbors. These \"mutational hotspots\" can play a prominent role in adaptive evolution, yet we lack knowledge of which short nucleotide sequences drive hotspots. In this work, we employ a combination of experimental evolution with Pseudomonas fluorescens and bioinformatic analysis of various Salmonella species to characterize a short nucleotide motif (≥8 bp) that can drive T:A→G:C mutation rates >1000-fold higher than the baseline T→G rate in bacteria. First, we experimentally confirm previous analysis showing that homopolymeric tracts (≥3) of G with a 3' T frequently mutate so that the T is replaced with a G, resulting in an extension of the guanine tract, i.e. GGGT → GGGG. We then demonstrate that the potency of this T:A→G:C hotspot is dependent on the nucleotides immediately flanking the GnT sequence. We find that the dinucleotide immediately 5' to a G4 tract and the dinucleotide immediately 3' to the T strongly affect the T:A→G:C mutation rate, which ranges from ∼5-fold higher than the typical rate to over 1000-fold higher depending on the flanking elements. GnT motifs are therefore comprised of several modular nucleotide components which each exert a significant, quantifiable effect on the mutation rate. 
This work advances our ability to accurately identify the position and quantify the mutagenicity of hotspot motifs predicated on short nucleotide sequences.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12344412/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144784834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haoran Xue, Yunchen Gong, Stephen I Wright, Spencer C H Barrett
{"title":"The Genomic Basis of the Tristylous Floral Polymorphism: Evidence for a Role of Gene Duplications in a Region of Restricted Recombination.","authors":"Haoran Xue, Yunchen Gong, Stephen I Wright, Spencer C H Barrett","doi":"10.1093/molbev/msaf170","DOIUrl":"10.1093/molbev/msaf170","url":null,"abstract":"<p><p>Tristyly is an angiosperm sexual polymorphism characterized by three flower morphs maintained in populations by negative frequency-dependent selection resulting from disassortative mating among morphs. The floral morphs possess reciprocal stigma and anther heights controlled by two epistatically interacting diallelic loci (S and M). Although considerable progress has been made on determining the genetic architecture and genes governing the related heterostylous polymorphism distyly, our understanding of these aspects of the genetic basis of tristyly has not been examined. Here, we address this knowledge gap by investigating the genomic basis of tristyly in Eichhornia paniculata (Pontederiaceae), an annual bee-pollinated herb native to the Neotropics, primarily N.E. Brazil. With chromosome-level genome assemblies of E. paniculata, we identified the S- and M-loci on either side of a large region of low recombination on the same chromosome. The S-locus consisted of two divergent haplotypes: the S-haplotype (2.51 Mb) with three S-haplotype-specific genes and the s-haplotype (596 kb) with five s-haplotype-specific genes. Two of the S-haplotype-specific genes, LAZY1-S and HRGP-S, were specifically expressed in styles and stamens, respectively, making them candidate tristyly genes and providing evidence for this locus functioning as a hemizygous supergene. The M-locus contained one gene (LAZY1-M), homologous to LAZY1-S, present in the M-haplotype but absent from the m-haplotype. Estimates of gene ages and phylogenetic reconstruction were consistent with the theoretical prediction that the S-locus evolved before the M-locus. 
Evidence for reuse of the same gene highlights the potential role of gene duplication in the evolution of epistatic multilocus polymorphisms.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12391759/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144667988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alexey Markin, Catherine A Macken, Amy L Baker, Tavis K Anderson
{"title":"Revealing Reassortment in Influenza A Viruses with TreeSort.","authors":"Alexey Markin, Catherine A Macken, Amy L Baker, Tavis K Anderson","doi":"10.1093/molbev/msaf133","DOIUrl":"10.1093/molbev/msaf133","url":null,"abstract":"<p><p>Reassortment among influenza A viruses (IAV) facilitates evolution and has been associated with interspecies transmission and pandemics. We introduce a novel tool called TreeSort that accurately identifies recent and ancestral reassortment events on datasets with thousands of IAV whole genomes. TreeSort uses the phylogeny of a selected IAV segment as a reference and finds the branches on the phylogeny where reassortment has occurred with high probability. The tool reports the particular gene segments that were involved in reassortment and how different they are from prior gene pairings. Using TreeSort, we studied reassortment patterns of different IAV subtypes isolated in avian, swine, and human hosts. Avian IAV demonstrated more reassortment than human and swine IAV, with the avian H7 subtype displaying the most frequent reassortment. Reassortment in the swine and human H3 subtypes was more frequent than in the swine and human H1 subtypes, respectively. The highly pathogenic avian influenza H5N1 clade 2.3.4.4b had elevated reassortment rates in the 2020 to 2023 period; however, the surface protein-encoding genes (HA, NA, and MP) co-evolved together with almost no reassortment among these genes. We observed similar co-evolutionary patterns with very low rates of reassortment among the surface proteins for the human H1 and H3 lineages, suggesting that strong co-evolution and preferential pairings among surface proteins are a consequence of high viral fitness. Our algorithm enables real-time tracking of IAV reassortment within and across different hosts and can identify novel viruses for pandemic risk assessment. 
TreeSort is available at https://github.com/flu-crew/TreeSort.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":"42 8","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342482/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shuanghui Chen, Yan Lu, Hao Chen, Yuwen Pan, Jiaojiao Liu, Shilin Li, Li Jin, Dolikun Mamatyusupu, Shuhua Xu
{"title":"Tracing the Genetic Heritage of the Kirgiz People: Dual-Wave Admixture and Ancestry-Biased Adaptation.","authors":"Shuanghui Chen, Yan Lu, Hao Chen, Yuwen Pan, Jiaojiao Liu, Shilin Li, Li Jin, Dolikun Mamatyusupu, Shuhua Xu","doi":"10.1093/molbev/msaf196","DOIUrl":"10.1093/molbev/msaf196","url":null,"abstract":"<p><p>The Kirgiz, a Turkic-speaking ethnic group with a rich nomadic heritage, represent a pivotal population for understanding human migration and adaptation in Central Asia. However, their genetic origins and admixture history remain largely unexplored. Here, we present the first comprehensive genomic study of Kirgiz populations from Xinjiang, China (XJ.KGZ, n = 36) and their counterparts in Kyrgyzstan (KRG), integrating genome-wide data of 2,406 global individuals. Our analyses reveal four primary ancestry components in XJ.KGZ: East Asian (41.7%), Siberian (25.6%), West Eurasian (25.2%), and South Asian (7.6%). Despite close genetic affinity (FST = 0.13%), XJ.KGZ and KRG diverged ∼447 years ago, with limited gene flow post-split. A two-wave admixture model elucidates their demographic history: an initial East-West Eurasian mixture ∼2,225 years ago, likely reflecting west-east contacts during the period of the Warring States and the Qin Dynasty, followed by secondary admixture events (∼875 to 425 years ago) linked to historical migrations under Mongol and post-Mongol rule. Local adaptation signatures implicate genes critical for cellular tight junction (e.g. PATJ), pathogen invasion (e.g. OR14I1), and cardiac functions (e.g. RYR2) with allele frequency deviations suggesting ancestry-specific selection. While no classical high-altitude adaptation genes (e.g. EPAS1) showed selection signals, RYR2 and C10orf67-implicated in hypoxia response in Tibetan fauna-displayed Western ancestry bias, hinting at convergent adaptation mechanisms. 
This study advances our understanding of the genetic makeup and admixture history of the Kirgiz people and provides novel insights into human dispersal in Central Asia.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12391873/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Danielle H Drabeck, Myana Anderson, Emma Y Roback, Elizabeth R Lusczek, Andrew N Tri, Jens Flensted Lassen, Amanda E Kowalczyk, Suzanne McGaugh, Tinen L Iles
{"title":"Metabolomics-Guided Genomic Comparisons Reveal Convergent Evolution of Hibernation Genes in Mammals.","authors":"Danielle H Drabeck, Myana Anderson, Emma Y Roback, Elizabeth R Lusczek, Andrew N Tri, Jens Flensted Lassen, Amanda E Kowalczyk, Suzanne McGaugh, Tinen L Iles","doi":"10.1093/molbev/msaf188","DOIUrl":"10.1093/molbev/msaf188","url":null,"abstract":"<p><p>Hibernation exists in several unrelated mammalian lineages, allowing animals to survive extreme environmental conditions through profound physiological shifts, including reduced metabolic rate, heart rate, respiration, and body temperature. These physiological shifts allow hibernators to rely solely on fat reserves, simultaneously avoiding the adverse effects of prolonged immobility seen in nonhibernating species. Although research on individual species has highlighted key aspects of these adaptations, the genetic basis of hibernation across mammals remains poorly understood. Synthesizing both single species and comparative approaches, we use metabolomic data from waking and hibernating black bears (Ursus americanus) to guide bioinformatic analyses of genes using tests of selection and evolutionary rate convergence across independent lineages of hibernating mammals. Our analyses reveal significant changes in carnitine levels between states. Using public databases, we generate candidate genes which may contribute to regulation of carnitine, and use these to test for signatures of selection across several independent lineages of hibernating mammals. We also utilize a dataset of 19k proteins across 120 mammalian genomes to identify genes evolving at convergent rates across hibernating mammals. Using both approaches, we find several novel genes likely to impact carnitine metabolism and related functions vital to hibernation such as metabolic shifts, oxidative stress, and tissue preservation. 
These findings provide new insights into the genetic basis of hibernation and offer promising targets for translational research, including the development of clinical therapies that mimic hibernation-like states for applications in medicine and space exploration.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12379893/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144855748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Frederick A Matsen, Kevin Sung, Mackenzie M Johnson, Will Dumm, David Rich, Tyler N Starr, Yun S Song, Philip Bradley, Julia Fukuyama, Hugh K Haddox
{"title":"A Sitewise Model of Natural Selection on Individual Antibodies via a Transformer-Encoder.","authors":"Frederick A Matsen, Kevin Sung, Mackenzie M Johnson, Will Dumm, David Rich, Tyler N Starr, Yun S Song, Philip Bradley, Julia Fukuyama, Hugh K Haddox","doi":"10.1093/molbev/msaf186","DOIUrl":"10.1093/molbev/msaf186","url":null,"abstract":"<p><p>During affinity maturation, antibodies are selected for their ability to fold and to bind a target antigen between rounds of somatic hypermutation. Previous studies have identified patterns of selection in antibodies using B cell repertoire sequencing data. However, these studies are constrained by needing to group many sequences or sites to make aggregate predictions. In this paper, we develop a transformer-encoder selection model of maximum resolution: given a single antibody sequence, it predicts the strength of selection on each amino acid site. Specifically, the model predicts for each site whether evolution will be slower than expected relative to a model of the neutral mutation process (purifying selection) or faster than expected (diversifying selection). We show that the model does an excellent job of modeling the process of natural selection on held out data, and does not need to be enormous or trained on vast amounts of data to perform well. The patterns of purifying vs diversifying natural selection do not neatly partition into the complementarity-determining vs framework regions: for example, there are many sites in framework that experience strong diversifying selection. There is a weak correlation between selection factors and solvent accessibility. When considering evolutionary shifts down a tree of antibody evolution, affinity maturation generally shifts sites towards purifying natural selection, however this effect depends on the region, with the biggest shifts toward purifying selection happening in the third complementarity-determining region. 
We observe distinct evolution between gene families but a limited relationship between germline diversity and selection strength.</p>","PeriodicalId":18730,"journal":{"name":"Molecular biology and evolution","volume":" ","pages":""},"PeriodicalIF":5.3,"publicationDate":"2025-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12375951/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144835716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}