Cell Morphology-Guided Small Molecule Generation with GFlowNets

Stephen Zhewen Lu, Ziqing Lu, Ehsan Hajiramezanali, Tommaso Biancalani, Yoshua Bengio, Gabriele Scalia, Michał Koziarski
arXiv - QuanBio - Biomolecules, published 2024-08-09. DOI: arxiv-2408.05196 (https://doi.org/arxiv-2408.05196)
Citations: 0

Abstract

High-content phenotypic screening, including high-content imaging (HCI), has gained popularity in the last few years for its ability to characterize novel therapeutics without prior knowledge of the protein target. When combined with deep learning techniques to predict and represent molecular-phenotype interactions, these advancements hold the potential to significantly accelerate and enhance drug discovery applications. This work focuses on the novel task of HCI-guided molecular design. Generative models for molecule design could be guided by HCI data, for example with a supervised model that links molecules to phenotypes of interest as a reward function. However, limited labeled data, combined with the high-dimensional readouts, can make training these methods challenging and impractical. We consider an alternative approach in which we leverage an unsupervised multimodal joint embedding to define a latent similarity as a reward for GFlowNets. The proposed model learns to generate new molecules that could produce phenotypic effects similar to those of the given image target, without relying on pre-annotated phenotypic labels. We demonstrate that the proposed method generates molecules with high morphological and structural similarity to the target, increasing the likelihood of similar biological activity, as confirmed by an independent oracle model.
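The abstract describes using similarity in a joint image-molecule embedding space as the GFlowNet reward. The following is a minimal sketch of that idea, not the authors' implementation. The toy embedding vectors, the function names, the exponential reward shaping, and the `beta` parameter are all hypothetical; in the paper the embeddings would come from the unsupervised multimodal model. The one general property relied on here is that a trained GFlowNet samples objects with probability proportional to their reward, so the reward must be positive.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two vectors; returns 0.0 if either is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)


def latent_similarity_reward(mol_emb, target_img_emb, beta=4.0):
    """Map latent similarity to a strictly positive reward.

    The exp(beta * similarity) shaping is an assumption made for this sketch;
    the abstract does not specify how the similarity is turned into a reward.
    """
    return float(np.exp(beta * cosine_similarity(mol_emb, target_img_emb)))


# Hypothetical embeddings standing in for the joint encoder's outputs.
target = np.array([1.0, 0.0, 0.0])           # embedding of the target HCI image
candidates = {
    "mol_a": np.array([0.9, 0.1, 0.0]),      # close to the target
    "mol_b": np.array([0.0, 1.0, 0.0]),      # orthogonal to the target
    "mol_c": np.array([-1.0, 0.0, 0.0]),     # opposite to the target
}

rewards = {name: latent_similarity_reward(emb, target)
           for name, emb in candidates.items()}

# A perfectly trained GFlowNet would sample each candidate with probability
# proportional to its reward; here that target distribution is computed directly.
total = sum(rewards.values())
sampling_probs = {name: r / total for name, r in rewards.items()}

for name in candidates:
    print(f"{name}: reward={rewards[name]:.3f}, p={sampling_probs[name]:.3f}")
```

Because the reward depends only on the two embeddings, no phenotypic labels are needed to score a candidate, which is the property the abstract emphasizes.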