AbGPT: De Novo Antibody Design via Generative Language Modeling

Desmond Kuan, Amir Barati Farimani
arXiv - QuanBio - Biomolecules, published 2024-09-09. DOI: arxiv-2409.06090 (https://doi.org/arxiv-2409.06090)

Abstract

The adaptive immune response, largely mediated by B-cell receptors (BCRs), plays a crucial role in effective pathogen neutralization due to its diversity and antigen specificity. Designing BCRs de novo, or from scratch, has been challenging because of their complex structure and diverse binding requirements. Protein language models (PLMs) have shown remarkable performance in contextualizing and performing various downstream tasks without relying on structural information. However, these models often lack a comprehensive understanding of the entire protein space, which limits their application in antibody design. In this study, we introduce Antibody Generative Pretrained Transformer (AbGPT), a model fine-tuned from a foundational PLM to enable a more informed design of BCR sequences. Using a custom generation and filtering pipeline, AbGPT successfully generated a high-quality library of 15,000 BCR sequences, demonstrating a strong understanding of the intrinsic variability and conserved regions within the antibody repertoire.