Any-Property-Conditional Molecule Generation with Self-Criticism using Spanning Trees
Alexia Jolicoeur-Martineau, Aristide Baratin, Kisoo Kwon, Boris Knyazev, Yan Zhang
arXiv - QuanBio - Biomolecules, 2024-07-12
DOI: arxiv-2407.09357 (https://doi.org/arxiv-2407.09357)
Abstract
Generating novel molecules is challenging, with most representations leading
to generative models producing many invalid molecules. Spanning Tree-based
Graph Generation (STGG) is a promising approach to ensure the generation of
valid molecules, outperforming state-of-the-art SMILES and graph diffusion
models for unconditional generation. In the real world, we want to be able to
generate molecules conditional on one or multiple desired properties rather
than unconditionally. Thus, in this work, we extend STGG to
multi-property-conditional generation. Our approach, STGG+, incorporates a
modern Transformer architecture, random masking of properties during training
(enabling conditioning on any subset of properties and classifier-free
guidance), an auxiliary property-prediction loss (allowing the model to
self-criticize molecules and select the best ones), and other improvements. We
show that STGG+ achieves state-of-the-art performance on in-distribution and
out-of-distribution conditional generation, and reward maximization.
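
As an illustration only, the three mechanisms named in the abstract (random masking of properties, classifier-free guidance, and self-criticism) can be sketched in a few lines of plain Python. This is not the authors' implementation: the function names, the use of `None` to mark a masked property, the standard guidance formula `uncond + w * (cond - uncond)`, and the squared-error selection rule are all assumptions made for this sketch, and the exact formulation in STGG+ may differ.

```python
import random


def mask_properties(props, p_mask, rng):
    """Hide each property independently with probability p_mask (None = masked).

    Training on randomly masked property vectors is what lets a model be
    conditioned on any subset of properties, including none at all.
    """
    return [None if rng.random() < p_mask else v for v in props]


def guided_logits(cond, uncond, w):
    """Classifier-free guidance in its common form (assumed here).

    w = 0 gives the unconditional logits, w = 1 the conditional ones,
    and w > 1 extrapolates past the conditional prediction.
    """
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


def select_best(candidates, predicted, target):
    """Self-criticism as best-of-k selection (hypothetical scoring rule).

    `predicted[i]` holds the model's own property predictions for
    `candidates[i]`; masked (None) targets are ignored in the error.
    """
    def error(pred):
        return sum((p - t) ** 2 for p, t in zip(pred, target) if t is not None)

    best = min(range(len(candidates)), key=lambda i: error(predicted[i]))
    return candidates[best]


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up property values and candidate strings, for illustration only.
    print(mask_properties([0.7, 2.1, 310.0], p_mask=0.5, rng=rng))
    print(guided_logits([2.0, 0.0], [1.0, 0.0], w=2.0))
    print(select_best(["CCO", "CCN"], [[1.0, 5.0], [0.9, 0.0]], [1.0, None]))
```

In this sketch the same masked-input convention serves both purposes mentioned in the abstract: a fully masked property vector yields the unconditional prediction needed for guidance, and a partially masked one expresses conditioning on a subset of properties.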