Analysis and Prediction of Chymotrypsin Substrate Preferences through Large Data Acquisition with Target-Free mRNA Display.

IF 2.6 · CAS Zone 4 (Biology) · JCR Q3 BIOCHEMISTRY & MOLECULAR BIOLOGY

ChemBioChem Pub Date : 2024-11-15 DOI:10.1002/cbic.202400760

Sabrina E Iskandar, Lindsey Guan, Rumit Maini, Christopher J Hipolito, Congliang Sun, Lisa A Vasicek, Dan Sindhikara, Adam Weinglass, S Adrian Saldanha


Citations: 0

Abstract

Oral delivery of peptide therapeutics is limited by degradation by gut proteases such as chymotrypsin. Existing peptidase databases are limited in size and do not enable systematic analysis of protease substrate preferences, especially for non-natural amino acids. As a result, stability optimization of hit compounds is time- and resource-intensive. To accelerate the stability optimization of peptide ligands, we generated large datasets of chymotrypsin-resistant peptides via mRNA display and used them to build a predictive model for chymotrypsin-resistant sequences. Through analysis of enriched motifs, we recapitulate known chymotrypsin cleavage sites, reveal position-dependent effects of monomers on peptide cleavage, and report previously unidentified protective and destabilizing residues. We then developed a machine-learning-based model that predicts peptide resistance to chymotrypsin cleavage, and validated both model performance and the NGS experimental data by measuring chymotrypsin half-lives for a subset of peptides. Finally, we simulated stability predictions on non-natural amino acids through a leucine hold-out model and observed robust performance. Overall, we demonstrate the utility of mRNA display as a tool for big data generation and show that pairing mRNA display with machine learning yields valuable predictions for chymotrypsin cleavage. Expansion of this workflow to additional proteases could provide complementary predictive models that focus future peptide drug discovery efforts.
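
As a rough illustration of two ideas mentioned in the abstract, the sketch below scans a peptide for canonical chymotrypsin cleavage sites and builds a leucine hold-out split. It is not the authors' model or pipeline. The residue sets encode the commonly cited textbook specificity (cleavage C-terminal to F, Y, W, and less efficiently L and M, suppressed when the next residue is proline), and the function names and the reading of "leucine hold-out" as "train without Leu-containing peptides, test on them" are assumptions made here for illustration.

```python
# Illustrative only: textbook chymotrypsin specificity, not the paper's learned model.
PRIMARY_P1 = set("FYW")    # assumed high-specificity P1 residues
SECONDARY_P1 = set("LM")   # assumed lower-specificity P1 residues


def canonical_cleavage_sites(seq, include_secondary=True):
    """Return 1-based positions of residues after which cleavage is expected.

    A position is reported when its residue is in the P1 set, it is not the
    C-terminal residue, and the following residue (P1') is not proline.
    """
    p1 = PRIMARY_P1 | SECONDARY_P1 if include_secondary else PRIMARY_P1
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in p1 and seq[i + 1] != "P":
            sites.append(i + 1)
    return sites


def leucine_holdout_split(peptides):
    """Split peptides into (train, test).

    Hypothetical reading of a leucine hold-out: the training set contains no
    Leu at all, so Leu behaves like an unseen monomer at test time.
    """
    train = [p for p in peptides if "L" not in p]
    test = [p for p in peptides if "L" in p]
    return train, test


if __name__ == "__main__":
    print(canonical_cleavage_sites("AFGYPLKW"))
    print(leucine_holdout_split(["AFGK", "ALGK", "GGSW"]))
```

A rule-based scan like this only reflects the known P1 preferences; the position-dependent and newly reported protective or destabilizing effects described in the abstract are exactly what such a fixed rule cannot capture and what a model trained on the display data is meant to learn.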


Source journal

ChemBioChem (Biology: Biochemistry & Molecular Biology)

CiteScore: 6.10
Self-citation rate: 3.10%
Articles published: 407
Review time: 1 month

About the journal: ChemBioChem (Impact Factor 2018: 2.641) publishes important breakthroughs across all areas at the interface of chemistry and biology, including the fields of chemical biology, bioorganic chemistry, bioinorganic chemistry, synthetic biology, biocatalysis, bionanotechnology, and biomaterials. It is published on behalf of Chemistry Europe, an association of 16 European chemical societies, and supported by the Asian Chemical Editorial Society (ACES).