PROFIS: Design of Target-Focused Libraries by Probing Continuous Fingerprint Space with Recurrent Neural Networks.

IF 5.6 | Zone 2, Chemistry | Q1, CHEMISTRY, MEDICINAL
Hubert Rybka, Tomasz Danel, Sabina Podlewska
Journal of Chemical Information and Modeling. Published 2025-04-28. DOI: 10.1021/acs.jcim.5c00698 (https://doi.org/10.1021/acs.jcim.5c00698)
Citations: 0

Abstract

This study introduces PROFIS, a new generative model for designing structurally novel, target-focused compound libraries. The model relies on a recurrent neural network trained to decode embedded molecular fingerprints into SMILES strings. To identify potential novel ligands, a biological activity predictor is first trained on the low-dimensional fingerprint embedding space, which makes it possible to identify high-activity subspaces for a given drug target. A Bayesian optimization algorithm then searches for latent representations that are expected to yield active structures when decoded to SMILES. We present the rationale for using SMILES as the output notation of the recurrent neural network and compare its performance with models trained to decode DeepSMILES and SELFIES strings. The paper demonstrates the protocol by generating candidate ligands of the dopamine D2 receptor. It also highlights the effectiveness of the approach in scaffold hopping, which is valuable for designing ligands outside the already explored chemical space. We show how passing engineered molecular fingerprints through the PROFIS network can generate diverse libraries of analogs for a drug molecule of choice. The protocol is versatile and can be applied to any biological target for which a dataset of known ligands is available. Scripts shared by the authors on GitHub support the wider use of PROFIS.
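The latent-space search described in the abstract can be illustrated with a minimal, standard-library-only sketch of Bayesian optimization. This is not the authors' implementation: the Gaussian-process surrogate, the upper-confidence-bound acquisition rule, the 2-dimensional search box, and the function `toy_activity` (a stand-in for a trained activity predictor) are all assumptions made for illustration. Decoding the selected latent vectors to SMILES with the trained recurrent network is not shown.

```python
import math
import random


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two latent vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y


def solve_upper(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit_gp(X, y, jitter=1e-4):
    """Factor the kernel matrix once so many candidates can be scored cheaply."""
    K = [[rbf(a, b) + (jitter if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, y))
    return L, alpha


def ucb(X, L, alpha, z, kappa):
    """Upper confidence bound: posterior mean plus kappa posterior std."""
    k_star = [rbf(x, z) for x in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 0.0)  # rbf(z, z) == 1
    return mean + kappa * math.sqrt(var)


def toy_activity(z, target=(0.5, -0.3)):
    """Hypothetical stand-in for a trained activity predictor on the latent space."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(z, target)))


def bayes_opt(objective, dim=2, n_init=5, n_iter=10, n_cand=200, kappa=1.0, seed=0):
    """Return all evaluated latent vectors and their predicted activities."""
    rng = random.Random(seed)

    def sample():
        return [rng.uniform(-2.0, 2.0) for _ in range(dim)]

    X = [sample() for _ in range(n_init)]
    y = [objective(z) for z in X]
    for _ in range(n_iter):
        L, alpha = fit_gp(X, y)
        candidates = [sample() for _ in range(n_cand)]
        best = max(candidates, key=lambda z: ucb(X, L, alpha, z, kappa))
        X.append(best)
        y.append(objective(best))
    return X, y
```

Calling `bayes_opt(toy_activity)` returns the evaluated latent vectors and their scores; in a real pipeline the highest-scoring vectors would be the ones passed on to the decoder. The acquisition rule and the random candidate sampling are the simplest choices that keep the sketch short, and a practical setup would more likely rely on an established optimization library.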
Source journal
CiteScore: 9.80
Self-citation rate: 10.70%
Articles published: 529
Review time: 1.4 months
Journal description: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.