{"title":"利用蛋白质语言模型进行序列设计的强化学习","authors":"Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Derek Nowrouzezahrai, Samira Ebrahimi Kahou, Riashat Islam","doi":"arxiv-2407.03154","DOIUrl":null,"url":null,"abstract":"Protein sequence design, determined by amino acid sequences, are essential to\nprotein engineering problems in drug discovery. Prior approaches have resorted\nto evolutionary strategies or Monte-Carlo methods for protein design, but often\nfail to exploit the structure of the combinatorial search space, to generalize\nto unseen sequences. In the context of discrete black box optimization over\nlarge search spaces, learning a mutation policy to generate novel sequences\nwith reinforcement learning is appealing. Recent advances in protein language\nmodels (PLMs) trained on large corpora of protein sequences offer a potential\nsolution to this problem by scoring proteins according to their biological\nplausibility (such as the TM-score). In this work, we propose to use PLMs as a\nreward function to generate new sequences. Yet the PLM can be computationally\nexpensive to query due to its large size. To this end, we propose an\nalternative paradigm where optimization can be performed on scores from a\nsmaller proxy model that is periodically finetuned, jointly while learning the\nmutation policy. We perform extensive experiments on various sequence lengths\nto benchmark RL-based approaches, and provide comprehensive evaluations along\nbiological plausibility and diversity of the protein. Our experimental results\ninclude favorable evaluations of the proposed sequences, along with high\ndiversity scores, demonstrating that RL is a strong candidate for biological\nsequence design. Finally, we provide a modular open source implementation can\nbe easily integrated in most RL training loops, with support for replacing the\nreward model with other PLMs, to spur further research in this domain. The code\nfor all experiments is provided in the supplementary material.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement Learning for Sequence Design Leveraging Protein Language Models\",\"authors\":\"Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Derek Nowrouzezahrai, Samira Ebrahimi Kahou, Riashat Islam\",\"doi\":\"arxiv-2407.03154\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Protein sequence design, determined by amino acid sequences, are essential to\\nprotein engineering problems in drug discovery. Prior approaches have resorted\\nto evolutionary strategies or Monte-Carlo methods for protein design, but often\\nfail to exploit the structure of the combinatorial search space, to generalize\\nto unseen sequences. In the context of discrete black box optimization over\\nlarge search spaces, learning a mutation policy to generate novel sequences\\nwith reinforcement learning is appealing. Recent advances in protein language\\nmodels (PLMs) trained on large corpora of protein sequences offer a potential\\nsolution to this problem by scoring proteins according to their biological\\nplausibility (such as the TM-score). In this work, we propose to use PLMs as a\\nreward function to generate new sequences. Yet the PLM can be computationally\\nexpensive to query due to its large size. 
To this end, we propose an\\nalternative paradigm where optimization can be performed on scores from a\\nsmaller proxy model that is periodically finetuned, jointly while learning the\\nmutation policy. We perform extensive experiments on various sequence lengths\\nto benchmark RL-based approaches, and provide comprehensive evaluations along\\nbiological plausibility and diversity of the protein. Our experimental results\\ninclude favorable evaluations of the proposed sequences, along with high\\ndiversity scores, demonstrating that RL is a strong candidate for biological\\nsequence design. Finally, we provide a modular open source implementation can\\nbe easily integrated in most RL training loops, with support for replacing the\\nreward model with other PLMs, to spur further research in this domain. The code\\nfor all experiments is provided in the supplementary material.\",\"PeriodicalId\":501022,\"journal\":{\"name\":\"arXiv - QuanBio - Biomolecules\",\"volume\":\"11 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - QuanBio - Biomolecules\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2407.03154\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - QuanBio - Biomolecules","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2407.03154","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Reinforcement Learning for Sequence Design Leveraging Protein Language Models
Protein sequence design, determined by amino acid sequences, is essential to protein engineering problems in drug discovery. Prior approaches have resorted to evolutionary strategies or Monte-Carlo methods for protein design, but they often fail to exploit the structure of the combinatorial search space or to generalize to unseen sequences. In the context of discrete black-box optimization over large search spaces, learning a mutation policy to generate novel sequences with reinforcement learning (RL) is appealing.
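For concreteness, the sketch below lays out the discrete search space such a mutation policy acts over: the state is the current amino-acid sequence and an action is a (position, residue) point mutation. This framing and the helper names are assumptions for illustration; the paper's exact formulation may differ.

```python
# Hypothetical sketch of the combinatorial action space for a mutation policy:
# each action rewrites one position of the sequence to one of the 20 residues.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 canonical amino acids

def action_space(sequence: str):
    """All single point mutations reachable from `sequence`."""
    return [(pos, aa)
            for pos in range(len(sequence))
            for aa in AMINO_ACIDS
            if aa != sequence[pos]]

def apply_action(sequence: str, action) -> str:
    """Apply one (position, residue) mutation to the sequence."""
    pos, aa = action
    return sequence[:pos] + aa + sequence[pos + 1:]
```

A sequence of length L has 19·L single-step mutations, so the reachable space grows combinatorially with the number of steps, which is why a learned policy is attractive compared with exhaustive search.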
Recent advances in protein language models (PLMs) trained on large corpora of protein sequences offer a potential solution to this problem by scoring proteins according to their biological plausibility (such as the TM-score). In this work, we propose to use PLMs as a reward function to generate new sequences. Yet the PLM can be computationally expensive to query due to its large size.
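As an illustration of how a PLM could serve as a sequence-level reward, the sketch below scores a sequence with the pseudo-log-likelihood of a masked protein language model. The ESM-2 checkpoint, the function name, and the choice of pseudo-log-likelihood (rather than, e.g., a TM-score predictor) are assumptions made for this example and are not taken from the paper.

```python
# Hypothetical sketch: scoring a protein's plausibility with a masked PLM
# (ESM-2 via Hugging Face transformers is assumed here for illustration).
import torch
from transformers import AutoTokenizer, EsmForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = EsmForMaskedLM.from_pretrained("facebook/esm2_t6_8M_UR50D").eval()

def pll_reward(sequence: str) -> float:
    """Pseudo-log-likelihood: mask each residue in turn and sum the model's
    log-probability of the true amino acid. Higher means more plausible."""
    ids = tokenizer(sequence, return_tensors="pt")["input_ids"]
    total = 0.0
    with torch.no_grad():
        for i in range(1, ids.shape[1] - 1):        # skip BOS/EOS tokens
            masked = ids.clone()
            masked[0, i] = tokenizer.mask_token_id
            logits = model(input_ids=masked).logits
            logp = torch.log_softmax(logits[0, i], dim=-1)
            total += logp[ids[0, i]].item()
    return total
```

Note that even this small checkpoint requires one forward pass per residue, which illustrates why querying a large PLM at every policy step is costly and motivates the proxy-model scheme described next.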
To this end, we propose an alternative paradigm in which optimization is performed on scores from a smaller proxy model that is finetuned periodically, jointly with learning the mutation policy.
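A minimal sketch of this alternating scheme is given below, assuming generic `policy`, `proxy`, and `plm_score` objects; the names, update rules, and finetuning schedule are illustrative placeholders rather than the paper's actual implementation.

```python
# Illustrative sketch: optimize a mutation policy against a cheap proxy reward
# while periodically refitting the proxy to the expensive PLM's scores.
# `policy`, `proxy`, and `plm_score` are hypothetical stand-ins.

def mutate(seq: str, pos: int, residue: str) -> str:
    """Apply a single point mutation."""
    return seq[:pos] + residue + seq[pos + 1:]

def design(seq, policy, proxy, plm_score, steps=10_000, finetune_every=500):
    replay = []                                     # (sequence, PLM score) pairs
    for step in range(1, steps + 1):
        pos, residue = policy.sample_action(seq)    # propose a point mutation
        seq = mutate(seq, pos, residue)
        reward = proxy.score(seq)                   # cheap proxy query
        policy.update(seq, (pos, residue), reward)  # e.g. a policy-gradient step
        if step % finetune_every == 0:              # occasional expensive queries
            replay.append((seq, plm_score(seq)))
            proxy.finetune(replay)                  # keep the proxy close to the PLM
    return seq
```

The design intent is that the expensive PLM is only queried every few hundred steps, while the policy receives dense feedback from the proxy in between.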
We perform extensive experiments on various sequence lengths to benchmark RL-based approaches, and provide comprehensive evaluations along the axes of biological plausibility and diversity of the designed proteins. Our experimental results
include favorable evaluations of the proposed sequences, along with high
diversity scores, demonstrating that RL is a strong candidate for biological
sequence design. Finally, we provide a modular open-source implementation that can be easily integrated into most RL training loops, with support for replacing the reward model with other PLMs, to spur further research in this domain. The code
for all experiments is provided in the supplementary material.