Stephen Zhewen Lu, Ziqing Lu, Ehsan Hajiramezanali, Tommaso Biancalani, Yoshua Bengio, Gabriele Scalia, Michał Koziarski
arXiv - QuanBio - Biomolecules, published 2024-08-09. DOI: arxiv-2408.05196 (https://doi.org/arxiv-2408.05196)
Cell Morphology-Guided Small Molecule Generation with GFlowNets
High-content phenotypic screening, including high-content imaging (HCI), has
gained popularity in the last few years for its ability to characterize novel
therapeutics without prior knowledge of the protein target. When combined with
deep learning techniques to predict and represent molecular-phenotype
interactions, these advancements hold the potential to significantly accelerate
and enhance drug discovery applications. This work focuses on the novel task of
HCI-guided molecular design. Generative models for molecule design could be
guided by HCI data, for example with a supervised model that links molecules to
phenotypes of interest as a reward function. However, limited labeled data,
combined with the high-dimensional readouts, can make training these methods
challenging and impractical. We consider an alternative approach in which we
leverage an unsupervised multimodal joint embedding to define a latent
similarity as a reward for GFlowNets. The proposed model learns to generate new
molecules that could produce phenotypic effects similar to those of the given
image target, without relying on pre-annotated phenotypic labels. We
demonstrate that the proposed method generates molecules with high
morphological and structural similarity to the target, increasing the
likelihood of similar biological activity, as confirmed by an independent
oracle model.
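
To make the idea of a latent-similarity reward concrete, here is a minimal sketch, not the authors' implementation. It assumes that a molecule and the target image have already been mapped into the same joint embedding space by some pretrained encoders (not shown), and that similarity is measured by cosine similarity. The names `cosine_similarity`, `latent_reward`, and the sharpness parameter `beta` are hypothetical. The exponential is one possible way to obtain the strictly positive reward that a GFlowNet needs, since it samples objects with probability proportional to reward.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_u * norm_v)


def latent_reward(
    mol_emb: Sequence[float],
    target_img_emb: Sequence[float],
    beta: float = 8.0,  # hypothetical sharpness parameter, not from the paper
) -> float:
    """Strictly positive reward that grows with latent similarity.

    Assumption: both embeddings live in the same joint space. The reward is
    exp(beta * cosine similarity), so it lies in [exp(-beta), exp(beta)].
    """
    return math.exp(beta * cosine_similarity(mol_emb, target_img_emb))


# Toy embeddings standing in for encoder outputs (illustrative values only).
target = [1.0, 0.0]
candidates = {
    "aligned": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}
rewards = {name: latent_reward(emb, target) for name, emb in candidates.items()}
ranking = sorted(rewards, key=rewards.get, reverse=True)
```

In this toy setup the candidate whose embedding points in the same direction as the target receives the largest reward, so a sampler trained on this reward would favor it. Note that no phenotypic labels enter the computation, only the two embeddings.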