Frederick A Matsen, Kevin Sung, Mackenzie M Johnson, Will Dumm, David Rich, Tyler N Starr, Yun S Song, Philip Bradley, Julia Fukuyama, Hugh K Haddox
A Sitewise Model of Natural Selection on Individual Antibodies via a Transformer-Encoder
Molecular Biology and Evolution, published 2025-07-30 (journal article)
DOI: 10.1093/molbev/msaf186 (https://doi.org/10.1093/molbev/msaf186)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12375951/pdf/
Abstract
During affinity maturation, antibodies are selected for their ability to fold and to bind a target antigen between rounds of somatic hypermutation. Previous studies have identified patterns of selection in antibodies using B cell repertoire sequencing data. However, these studies are constrained by needing to group many sequences or sites to make aggregate predictions. In this paper, we develop a transformer-encoder selection model of maximum resolution: given a single antibody sequence, it predicts the strength of selection on each amino acid site. Specifically, the model predicts for each site whether evolution will be slower than expected relative to a model of the neutral mutation process (purifying selection) or faster than expected (diversifying selection). We show that the model does an excellent job of modeling the process of natural selection on held-out data, and does not need to be enormous or trained on vast amounts of data to perform well. The patterns of purifying vs diversifying natural selection do not neatly partition into the complementarity-determining vs framework regions: for example, there are many sites in the framework regions that experience strong diversifying selection. There is a weak correlation between selection factors and solvent accessibility. When considering evolutionary shifts down a tree of antibody evolution, affinity maturation generally shifts sites towards purifying natural selection; however, this effect depends on the region, with the biggest shifts toward purifying selection happening in the third complementarity-determining region. We observe distinct evolution between gene families but a limited relationship between germline diversity and selection strength.
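As a rough illustration of the purifying vs diversifying distinction described in the abstract, the sketch below treats a per-site selection factor as the ratio of an observed substitution probability to the probability expected under a neutral mutation model. This is not the authors' transformer-encoder model. The function names, the ratio definition, and the `tolerance` band around 1.0 are all assumptions made only for this example.

```python
from typing import List


def selection_factors(observed: List[float], neutral: List[float]) -> List[float]:
    """Per-site ratio of observed to neutrally expected substitution probability.

    Assumption for illustration: a factor below 1 means the site changes more
    slowly than the neutral model predicts, above 1 means it changes faster.
    """
    if len(observed) != len(neutral):
        raise ValueError("observed and neutral must have the same length")
    if any(n <= 0 for n in neutral):
        raise ValueError("neutral probabilities must be positive")
    return [o / n for o, n in zip(observed, neutral)]


def classify(factor: float, tolerance: float = 0.1) -> str:
    """Label one site; `tolerance` is a hypothetical band around neutrality."""
    if factor < 1.0 - tolerance:
        return "purifying"
    if factor > 1.0 + tolerance:
        return "diversifying"
    return "near-neutral"


def classify_sites(observed: List[float], neutral: List[float]) -> List[str]:
    """Classify every site of one sequence from toy per-site probabilities."""
    return [classify(f) for f in selection_factors(observed, neutral)]
```

Calling `classify_sites` on one sequence's toy probabilities yields one label per amino acid site, mirroring the single-sequence, sitewise resolution the abstract describes. In the paper the per-site values come from a trained model rather than from a direct ratio of counts.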
About the journal:
Molecular Biology and Evolution
Journal Overview:
Publishes research at the interface of molecular (including genomics) and evolutionary biology
Considers manuscripts containing patterns, processes, and predictions at all levels of organization: population, taxonomic, functional, and phenotypic
Interested in fundamental discoveries, new and improved methods, resources, technologies, and theories advancing evolutionary research
Publishes balanced reviews of recent developments in genome evolution and forward-looking perspectives suggesting future directions in molecular evolution applications.