De Novo Design of Peptide Binders to Conformationally Diverse Targets with Contrastive Language Modeling.

Suhaas Bhat, Kalyan Palepu, Lauren Hong, Joey Mao, Tianzheng Ye, Rema Iyer, Lin Zhao, Tianlai Chen, Sophia Vincoff, Rio Watson, Tian Wang, Divya Srijay, Venkata Srikar Kavirayuni, Kseniia Kholina, Shrey Goel, Pranay Vure, Aniruddha J Desphande, Scott H Soderling, Matthew P DeLisa, Pranam Chatterjee
bioRxiv : the preprint server for biology. Publication date: 2024-07-22. DOI: 10.1101/2023.06.26.546591 (https://doi.org/10.1101/2023.06.26.546591). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11291000/pdf/

Abstract

Designing binders to undruggable proteins is a formidable challenge in drug discovery, requiring innovative approaches to overcome the lack of putative binding sites. Recently, generative models have been trained to design binding proteins from the three-dimensional structures of target proteins; as a result, these models struggle to design binders to disordered or conformationally unstable targets. In this work, we provide a generalizable algorithmic framework to design short, target-binding linear peptides, requiring only the amino acid sequence of the target protein. To do this, we propose a process that generates naturalistic peptide candidates through Gaussian perturbation of the peptidic latent space of the ESM-2 protein language model, and subsequently screens these novel linear sequences for target-selective interaction activity via a CLIP-based contrastive learning architecture. By integrating these generative and discriminative steps, we create a Peptide Prioritization via CLIP (PepPrCLIP) pipeline and experimentally validate highly ranked, target-specific peptides, both as inhibitory peptides and as fusions to E3 ubiquitin ligase domains, demonstrating functionally potent binding and degradation of conformationally diverse protein targets in vitro. Overall, our design strategy provides a modular toolkit for designing short, linear binding peptides to any target protein without relying on stable and ordered tertiary structure, enabling the generation of programmable modulators of undruggable and disordered proteins such as transcription factors and fusion oncoproteins.
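
The two steps named in the abstract, Gaussian perturbation of a latent embedding and CLIP-style contrastive scoring, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`perturb`, `clip_loss`, `rank_candidates`), the noise scale `sigma`, the temperature value, and the toy four-dimensional vectors are all assumptions made for illustration. In the pipeline described above, the embeddings would come from ESM-2 and from trained peptide and target encoders, and perturbed latents would be decoded back to peptide sequences; here plain Python lists stand in for those embeddings.

```python
import math
import random


def perturb(embedding, sigma, n_samples, rng):
    """Sample noisy copies of a latent vector: z' = z + sigma * eps, eps ~ N(0, I).

    `sigma` is a hypothetical noise scale, not a value taken from the paper.
    """
    return [
        [z + sigma * rng.gauss(0.0, 1.0) for z in embedding]
        for _ in range(n_samples)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clip_loss(peptide_embs, target_embs, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss; pair i is assumed to be the true pair.

    The temperature default is an assumption borrowed from common CLIP setups.
    """
    n = len(peptide_embs)
    logits = [
        [cosine(p, t) / temperature for t in target_embs] for p in peptide_embs
    ]

    def mean_cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_sum_exp - row[i]
        return total / n

    columns = [list(col) for col in zip(*logits)]
    # Average the peptide-to-target and target-to-peptide directions.
    return 0.5 * (mean_cross_entropy(logits) + mean_cross_entropy(columns))


def rank_candidates(candidate_embs, target_emb):
    """Return candidate indices sorted by similarity to the target, best first."""
    return sorted(
        range(len(candidate_embs)),
        key=lambda i: cosine(candidate_embs[i], target_emb),
        reverse=True,
    )


# Toy usage: perturb one seed embedding, then rank the samples against a target.
rng = random.Random(0)
seed_peptide = [0.2, -0.5, 0.9, 0.1]   # placeholder for a peptide latent
target = [0.1, -0.4, 1.0, 0.0]         # placeholder for a target embedding
candidates = perturb(seed_peptide, sigma=0.3, n_samples=5, rng=rng)
ranking = rank_candidates(candidates, target)
```

The split mirrors the abstract's separation of a generative step (sampling around known peptide embeddings) from a discriminative step (scoring each candidate against the target), so either part could be swapped out independently.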
