Learning the language of antibody hypervariability

IF 9.1 1区综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES

Proceedings of the National Academy of Sciences of the United States of America Pub Date : 2024-12-30 DOI:10.1073/pnas.2418918121

Rohit Singh, Chiho Im, Yu Qiu, Brian Mackness, Abhinav Gupta, Taylor Joren, Samuel Sledzieski, Lena Erlach, Maria Wendt, Yves Fomekong Nanfack, Bryan Bryson, Bonnie Berger

{"title":"Learning the language of antibody hypervariability","authors":"Rohit Singh, Chiho Im, Yu Qiu, Brian Mackness, Abhinav Gupta, Taylor Joren, Samuel Sledzieski, Lena Erlach, Maria Wendt, Yves Fomekong Nanfack, Bryan Bryson, Bonnie Berger","doi":"10.1073/pnas.2418918121","DOIUrl":null,"url":null,"abstract":"Protein language models (PLMs) have demonstrated impressive success in modeling proteins. However, general-purpose “foundational” PLMs have limited performance in modeling antibodies due to the latter’s hypervariable regions, which do not conform to the evolutionary conservation principles that such models rely on. In this study, we propose a transfer learning framework called Antibody Mutagenesis-Augmented Processing (AbMAP), which fine-tunes foundational models for antibody-sequence inputs by supervising on antibody structure and binding specificity examples. Our learned feature representations accurately predict mutational effects on antigen binding, paratope identification, and other key antibody properties. We experimentally validate AbMAP for antibody optimization by applying it to refine a set of antibodies that bind to a SARS-CoV-2 peptide, and obtain an 82% hit-rate and up to 22-fold increase in binding affinity. AbMAP also unlocks large-scale analyses of immune repertoires, revealing that B-cell receptor repertoires of individuals, while remarkably different in sequence, converge toward similar structural and functional coverage. Importantly, AbMAP’s transfer learning approach can be readily adapted to advances in foundational PLMs. We anticipate AbMAP will accelerate the efficient design and modeling of antibodies, expedite the discovery of antibody-based therapeutics, and deepen our understanding of humoral immunity.","PeriodicalId":20548,"journal":{"name":"Proceedings of the National Academy of Sciences of the United States of America","volume":"169 1","pages":""},"PeriodicalIF":9.1000,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the National Academy of Sciences of the United States of America","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1073/pnas.2418918121","RegionNum":1,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}

引用次数: 0

Abstract

Protein language models (PLMs) have demonstrated impressive success in modeling proteins. However, general-purpose “foundational” PLMs have limited performance in modeling antibodies due to the latter’s hypervariable regions, which do not conform to the evolutionary conservation principles that such models rely on. In this study, we propose a transfer learning framework called Antibody Mutagenesis-Augmented Processing (AbMAP), which fine-tunes foundational models for antibody-sequence inputs by supervising on antibody structure and binding specificity examples. Our learned feature representations accurately predict mutational effects on antigen binding, paratope identification, and other key antibody properties. We experimentally validate AbMAP for antibody optimization by applying it to refine a set of antibodies that bind to a SARS-CoV-2 peptide, and obtain an 82% hit-rate and up to 22-fold increase in binding affinity. AbMAP also unlocks large-scale analyses of immune repertoires, revealing that B-cell receptor repertoires of individuals, while remarkably different in sequence, converge toward similar structural and functional coverage. Importantly, AbMAP’s transfer learning approach can be readily adapted to advances in foundational PLMs. We anticipate AbMAP will accelerate the efficient design and modeling of antibodies, expedite the discovery of antibody-based therapeutics, and deepen our understanding of humoral immunity.

查看原文本刊更多论文

学习抗体高变异性的语言

蛋白质语言模型（PLMs）在蛋白质建模方面取得了令人印象深刻的成功。然而，通用的“基础”plm在建模抗体方面表现有限，因为后者的高变区不符合这些模型所依赖的进化守恒原则。在这项研究中，我们提出了一个称为抗体诱变-增强处理（AbMAP）的迁移学习框架，该框架通过监督抗体结构和结合特异性示例来微调抗体序列输入的基础模型。我们学习到的特征表征准确地预测了抗原结合、假面识别和其他关键抗体特性的突变效应。我们通过实验验证AbMAP用于抗体优化，将其应用于改进一组与SARS-CoV-2肽结合的抗体，并获得82%的命中率和高达22倍的结合亲和力。AbMAP还开启了对免疫库的大规模分析，揭示了个体的b细胞受体库，尽管序列显著不同，但趋同于相似的结构和功能覆盖。重要的是，AbMAP的迁移学习方法可以很容易地适应基础PLMs的进步。我们预计AbMAP将加速抗体的有效设计和建模，加速基于抗体的治疗方法的发现，并加深我们对体液免疫的理解。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Proceedings of the National Academy of Sciences of the United States of America 综合性期刊-综合性期刊

CiteScore

19.00

自引率

0.90%

发文量

3575

审稿时长

2.5 months

期刊介绍： The Proceedings of the National Academy of Sciences (PNAS), a peer-reviewed journal of the National Academy of Sciences (NAS), serves as an authoritative source for high-impact, original research across the biological, physical, and social sciences. With a global scope, the journal welcomes submissions from researchers worldwide, making it an inclusive platform for advancing scientific knowledge.