AbGPT: De Novo Antibody Design via Generative Language Modeling
Desmond Kuan, Amir Barati Farimani
arXiv - QuanBio - Biomolecules, 2024-09-09
arXiv:2409.06090
Citations: 0
Abstract
The adaptive immune response, largely mediated by B-cell receptors (BCRs), plays a crucial role in effective pathogen neutralization due to its diversity and antigen specificity. Designing BCRs de novo, or from scratch, has been challenging because of their complex structure and diverse binding requirements. Protein language models (PLMs) have shown remarkable performance in contextualizing and performing various downstream tasks without relying on structural information. However, these models often lack a comprehensive understanding of the entire protein space, which limits their application in antibody design. In this study, we introduce the Antibody Generative Pretrained Transformer (AbGPT), a model fine-tuned from a foundational PLM to enable a more informed design of BCR sequences. Using a custom generation and filtering pipeline, AbGPT successfully generated a high-quality library of 15,000 BCR sequences, demonstrating a strong understanding of the intrinsic variability and conserved regions within the antibody repertoire.
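The abstract describes a generate-then-filter pipeline: sample candidate BCR sequences from a fine-tuned language model, then keep only those passing quality checks until a library of the target size is built. The sketch below illustrates that loop's shape only; the sampler is a random stand-in for the actual AbGPT model, and the length bounds and low-complexity filter are hypothetical placeholders, not the paper's actual criteria.

```python
import random

# The 20 standard amino acids (one-letter codes)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def sample_bcr_sequence(rng, min_len=100, max_len=130):
    """Stand-in for autoregressive sampling from a fine-tuned PLM.

    AbGPT would generate residues token by token; here we draw them
    uniformly at random purely to exercise the pipeline.
    """
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))

def passes_filters(seq, min_len=100, max_len=130):
    """Toy quality filters: valid alphabet, plausible length, no long
    homopolymer runs. The paper's real filters are not specified in the
    abstract; these are illustrative assumptions."""
    if not (min_len <= len(seq) <= max_len):
        return False
    if any(ch not in AMINO_ACIDS for ch in seq):
        return False
    # Reject runs of 6+ identical residues as low-complexity.
    run = 1
    for a, b in zip(seq, seq[1:]):
        run = run + 1 if a == b else 1
        if run >= 6:
            return False
    return True

def build_library(target_size, seed=0):
    """Sample until `target_size` unique, filter-passing sequences exist."""
    rng = random.Random(seed)
    library = set()
    while len(library) < target_size:
        seq = sample_bcr_sequence(rng)
        if passes_filters(seq):
            library.add(seq)
    return sorted(library)

library = build_library(50)  # the paper builds 15,000; 50 keeps the demo fast
print(len(library))
```

Deduplicating via a set mirrors the goal of a diverse library: every retained sequence is unique as well as filter-passing.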