GFlowNet Pretraining with Inexpensive Rewards
Mohit Pandey, Gopeshh Subbaraj, Emmanuel Bengio
arXiv - QuanBio - Biomolecules, arxiv-2409.09702, 2024-09-15

Generative Flow Networks (GFlowNets), a class of generative models, have
recently emerged as a suitable framework for generating diverse and
high-quality molecular structures by learning from unnormalized reward
distributions. Previous works in this direction often restrict exploration by
using predefined molecular fragments as building blocks, limiting the chemical
space that can be accessed. In this work, we introduce Atomic GFlowNets
(A-GFNs), a foundational generative model leveraging individual atoms as
building blocks to explore drug-like chemical space more comprehensively. We
propose an unsupervised pre-training approach using offline drug-like molecule
datasets, which conditions A-GFNs on inexpensive yet informative molecular
descriptors such as drug-likeness, topological polar surface area, and
synthetic accessibility scores. These properties serve as proxy rewards,
guiding A-GFNs towards regions of chemical space that exhibit desirable
pharmacological properties. We further extend our method with a
goal-conditioned fine-tuning process, which adapts A-GFNs to optimize for
specific target properties. We pretrain A-GFN on the ZINC15
offline dataset and employ robust evaluation metrics to show the effectiveness
of our approach when compared to other relevant baseline methods in drug
design.
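
To make the idea of inexpensive descriptors acting as a proxy reward more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the three descriptors named above (drug-likeness, topological polar surface area, synthetic accessibility) have already been computed for a molecule, for example with a cheminformatics toolkit, and combines them into a single non-negative score. The target ranges, decay widths, the names `property_score` and `proxy_reward`, and the multiplicative combination are all illustrative assumptions.

```python
import math

# Hypothetical target ranges for the descriptors named in the abstract.
# These numbers are illustrative assumptions, not values from the paper.
TARGET_RANGES = {
    "qed": (0.65, 0.80),    # drug-likeness, unitless, in [0, 1]
    "tpsa": (60.0, 100.0),  # topological polar surface area, square angstroms
    "sa": (1.0, 3.0),       # synthetic accessibility, 1 (easy) to 10 (hard)
}

# Assumed decay widths: how quickly the score falls off outside each range.
DECAY_WIDTHS = {"qed": 0.1, "tpsa": 20.0, "sa": 1.0}


def property_score(value, low, high, width):
    """Return 1.0 inside [low, high]; decay exponentially with distance outside."""
    if low <= value <= high:
        return 1.0
    distance = low - value if value < low else value - high
    return math.exp(-distance / width)


def proxy_reward(descriptors, target_ranges=TARGET_RANGES, widths=DECAY_WIDTHS):
    """Multiply per-property scores into one unnormalized reward in (0, 1]."""
    reward = 1.0
    for name, (low, high) in target_ranges.items():
        reward *= property_score(descriptors[name], low, high, widths[name])
    return reward


if __name__ == "__main__":
    in_range = {"qed": 0.70, "tpsa": 80.0, "sa": 2.0}
    off_target = {"qed": 0.50, "tpsa": 80.0, "sa": 2.0}
    print(proxy_reward(in_range))    # every property inside its range
    print(proxy_reward(off_target))  # penalized for low drug-likeness
```

Because a GFlowNet learns from an unnormalized reward, a score like this only needs to be non-negative and cheap to evaluate. Under this sketch, changing the entries of `TARGET_RANGES` is one way the conditioning goal could be varied between pre-training and fine-tuning.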