GFlowNet Pretraining with Inexpensive Rewards

Mohit Pandey, Gopeshh Subbaraj, Emmanuel Bengio
DOI: arxiv-2409.09702
URL: https://doi.org/arxiv-2409.09702
Source: arXiv - QuanBio - Biomolecules
Publication date: 2024-09-15
Citations: 0

Abstract

Generative Flow Networks (GFlowNets), a class of generative models, have recently emerged as a suitable framework for generating diverse, high-quality molecular structures by learning from unnormalized reward distributions. Previous work in this direction often restricts exploration by using predefined molecular fragments as building blocks, which limits the chemical space that can be accessed. In this work, we introduce Atomic GFlowNets (A-GFNs), a foundational generative model that uses individual atoms as building blocks to explore drug-like chemical space more comprehensively. We propose an unsupervised pre-training approach using offline drug-like molecule datasets, which conditions A-GFNs on inexpensive yet informative molecular descriptors such as drug-likeness, topological polar surface area, and synthetic accessibility scores. These properties serve as proxy rewards, guiding A-GFNs towards regions of chemical space that exhibit desirable pharmacological properties. We further extend our method with a goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for specific target properties. We pretrain A-GFN on the ZINC15 offline dataset and employ robust evaluation metrics to show the effectiveness of our approach compared to other relevant baseline methods in drug design.
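To make the idea of inexpensive descriptors acting as proxy rewards concrete, here is a minimal sketch of how several descriptor values might be folded into one unnormalized scalar reward. It is not the authors' reward function: the names `TARGET_RANGES`, `range_reward`, and `proxy_reward`, the target ranges, the Gaussian decay, and the product aggregation are all assumptions made for illustration. Descriptor values are assumed to be precomputed elsewhere, for example with a cheminformatics toolkit.

```python
import math

# Hypothetical target ranges per descriptor. These numbers are illustrative
# assumptions, not values taken from the paper.
TARGET_RANGES = {
    "qed": (0.65, 0.80),    # drug-likeness score, in [0, 1]
    "tpsa": (60.0, 100.0),  # topological polar surface area
    "sas": (1.0, 3.0),      # synthetic accessibility score, in [1, 10]
}


def range_reward(value, low, high, width):
    """Return 1.0 inside [low, high], with Gaussian decay outside the range."""
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return math.exp(-0.5 * (distance / width) ** 2)


def proxy_reward(descriptors, target_ranges=TARGET_RANGES):
    """Multiply the per-descriptor rewards into one unnormalized scalar."""
    reward = 1.0
    for name, (low, high) in target_ranges.items():
        width = 0.5 * (high - low)  # assumed decay scale: half the range width
        reward *= range_reward(descriptors[name], low, high, width)
    return reward


# A molecule whose descriptors all fall inside the target ranges scores 1.0;
# one with a low drug-likeness score is penalized smoothly rather than rejected.
in_range = proxy_reward({"qed": 0.70, "tpsa": 80.0, "sas": 2.0})
low_qed = proxy_reward({"qed": 0.50, "tpsa": 80.0, "sas": 2.0})
```

A product is used here so that missing any single target lowers the overall reward. Because a GFlowNet learns from unnormalized rewards, the scalar does not need to sum to one over molecules. Changing the ranges is one simple way to express a different goal for conditioning, again as an assumption rather than the paper's stated mechanism.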