Expanding the chemical space using a chemical reaction knowledge graph

Impact factor: 6.2 · JCR quartile: Q1 · Category: Chemistry, Multidisciplinary
Emma Rydholm, Tomas Bastys, Emma Svensson, Christos Kannas, Ola Engkvist and Thierry Kogej
Journal: Digital Discovery, volume 7, pages 1378-1388. Published: 2024-06-11. DOI: 10.1039/D3DD00230F
Article page: https://pubs.rsc.org/en/content/articlelanding/2024/dd/d3dd00230f
Citations: 0

Abstract

In this work, we present a new molecular de novo design approach that uses a knowledge graph encoding chemical reactions, extracted from the publicly available USPTO (United States Patent and Trademark Office) dataset. The method expands the chemical space through forward synthesis prediction: it finds new combinations of reactants in the knowledge graph and in this way generates libraries of de novo compounds along with a valid synthetic route. The forward synthesis prediction of novel compounds involves two steps. In the first step, a graph neural network-based link prediction model suggests pairs of existing reactant nodes in the graph that are likely to react. In the second step, a molecular transformer model predicts the potential products for the suggested reactant pairs. We achieve a ROC–AUC score of 0.861 for link prediction in the knowledge graph and a top-1 accuracy of 0.924 for product prediction. The method's utility is demonstrated by generating a set of de novo compounds by predicting high-probability reactions in the USPTO data. The generated compounds are diverse and many exhibit drug-like properties. A brief comparison with a template-based library design is provided. Furthermore, evaluation of potential activity with a quantitative structure–activity relationship (QSAR) model suggested the presence of potential dopamine receptor D2 (DRD2) modulators among the proposed compounds. In summary, our results suggest that the proposed method can expand the easily accessible chemical space by combining known compounds and can identify novel drug-like compounds for a specific target.
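
The two-step procedure and the two reported metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: a path-counting heuristic stands in for the graph neural network, `predict_product` is a hypothetical placeholder for the molecular transformer, and the reactant names and scores are invented rather than taken from the USPTO data.

```python
from itertools import combinations

# Toy reaction graph: an edge joins two reactants that co-occur in a known
# reaction. All node names are hypothetical placeholders, not USPTO entries.
KNOWN_PAIRS = [
    ("amine_A", "acid_X"), ("amine_A", "acid_Y"),
    ("amine_B", "acid_X"), ("amine_B", "acid_Y"),
    ("amine_C", "acid_X"), ("halide_H", "boronate_P"),
]


def build_adjacency(pairs):
    """Map each reactant to the set of partners it is known to react with."""
    adj = {}
    for u, v in pairs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def link_score(adj, u, v):
    """Step 1 stand-in: count length-3 paths u-a-b-v (NOT a graph neural network)."""
    return sum(1 for a in adj[u] for b in adj[v] if b in adj[a])


def suggest_pairs(adj, top_k=1):
    """Rank reactant pairs with no known reaction by descending link score."""
    candidates = [
        (link_score(adj, u, v), u, v)
        for u, v in combinations(sorted(adj), 2)
        if v not in adj[u]
    ]
    candidates.sort(key=lambda c: -c[0])
    return [(u, v, s) for s, u, v in candidates[:top_k]]


def propose_reactions(adj, predict_product, top_k=1):
    """Step 2: pass each suggested pair to a product predictor (any callable)."""
    return [(u, v, predict_product(u, v)) for u, v, _ in suggest_pairs(adj, top_k)]


def roc_auc(pos_scores, neg_scores):
    """Probability that a true link outscores a non-link; ties count as 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


def top1_accuracy(ranked_predictions, true_products):
    """Fraction of cases whose first-ranked prediction equals the true product."""
    hits = sum(r[0] == t for r, t in zip(ranked_predictions, true_products))
    return hits / len(true_products)


adj = build_adjacency(KNOWN_PAIRS)
# Placeholder predictor: a real one would return a product structure (e.g. SMILES).
routes = propose_reactions(adj, lambda u, v: f"product_of({u}, {v})")
print(routes)
print(roc_auc([0.9, 0.8], [0.1, 0.85]))
print(top1_accuracy([["P1", "P2"], ["P4", "P3"]], ["P1", "P3"]))
```

In this toy graph the heuristic suggests pairing `amine_C` with `acid_Y`, because the other amines that react with `acid_X` also react with `acid_Y`. A learned link predictor plays the same role in the paper, but scores pairs with a trained model rather than by path counting.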
