AbGPT: De Novo Antibody Design via Generative Language Modeling
Desmond Kuan, Amir Barati Farimani
arXiv - QuanBio - Biomolecules, published 2024-09-09
DOI: arxiv-2409.06090 (https://doi.org/arxiv-2409.06090)
Citations: 0
Abstract
The adaptive immune response, largely mediated by B-cell receptors (BCRs),
plays a crucial role in effective pathogen neutralization due to its diversity
and antigen specificity. Designing BCRs de novo, or from scratch, has been
challenging because of their complex structure and diverse binding
requirements. Protein language models (PLMs) have shown remarkable performance
in contextualizing and performing various downstream tasks without relying on
structural information. However, these models often lack a comprehensive
understanding of the entire protein space, which limits their application in
antibody design. In this study, we introduce Antibody Generative Pretrained
Transformer (AbGPT), a model fine-tuned from a foundational PLM to enable a
more informed design of BCR sequences. Using a custom generation and filtering
pipeline, AbGPT successfully generated a high-quality library of 15,000 BCR
sequences, demonstrating a strong understanding of the intrinsic variability
and conserved regions within the antibody repertoire.