Journal of Cheminformatics: Latest Articles

Computer-aided pattern scoring (C@PS): a novel cheminformatic workflow to predict ligands with rare modes-of-action
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-09-23 DOI: 10.1186/s13321-024-00901-5
Sven Marcel Stefan, Katja Stefan, Vigneshwaran Namasivayam
Abstract: The identification, establishment, and exploration of potential pharmacological drug targets are major steps of the drug development pipeline. Target validation requires diverse chemical tools that come with a spectrum of functionality, e.g., inhibitors, activators, and other modulators. Particularly tools with rare modes-of-action allow for a proper kinetic and functional characterization of the targets-of-interest (e.g., channels, enzymes, receptors, or transporters). In addition, functional innovation is a prime criterion for patentability and commercial exploitation, which may lead to therapeutic benefit. Unfortunately, data on new, and thus, undruggable or barely druggable targets are scarce and mostly available for mainstream modes-of-action only (e.g., inhibition). Here we present a novel cheminformatic workflow, computer-aided pattern scoring (C@PS), which was specifically designed to project its prediction capabilities into an uncharted domain of applicability.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00901-5
Citations: 0
EC-Conf: A ultra-fast diffusion model for molecular conformation generation with equivariant consistency
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-09-03 DOI: 10.1186/s13321-024-00893-2
Zhiguang Fan, Yuedong Yang, Mingyuan Xu, Hongming Chen
Abstract: Despite recent advances in 3D molecular conformation generation driven by diffusion models, the high computational cost of the iterative diffusion/denoising process limits their application. Here, an equivariant consistency model (EC-Conf) is proposed as a fast diffusion method for low-energy conformation generation. In EC-Conf, a modified SE(3)-equivariant transformer model is used directly to encode the Cartesian molecular conformations, and a highly efficient consistency diffusion process is carried out to generate molecular conformations. With only one sampling step, it already achieves quality comparable to other diffusion-based models running with thousands of denoising steps. Its performance can be further improved with a few more sampling iterations. The performance of EC-Conf is evaluated on both the GEOM-QM9 and GEOM-Drugs sets. Our results demonstrate that the efficiency of EC-Conf for learning the distribution of low-energy molecular conformations is at least two orders of magnitude higher than that of current SOTA diffusion models, and that it could potentially become a useful tool for conformation generation and sampling.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00893-2
Citations: 0
RAIChU: automating the visualisation of natural product biosynthesis
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-09-03 DOI: 10.1186/s13321-024-00898-x
Barbara R. Terlouw, Friederike Biermann, Sophie P. J. M. Vromans, Elham Zamani, Eric J. N. Helfrich, Marnix H. Medema
Abstract: Natural products are molecules that fulfil a range of important ecological functions. Many natural products have been exploited for pharmaceutical and agricultural applications. In contrast to many other specialised metabolites, the products of modular nonribosomal peptide synthetase (NRPS) and polyketide synthase (PKS) systems can often (partially) be predicted from the DNA sequence of the biosynthetic gene clusters. This is because the biosynthetic pathways of NRPS and PKS systems adhere to consistent rulesets. These universal biosynthetic rules can be leveraged to generate biosynthetic models of biosynthetic pathways. While these principles have been largely deciphered, software that leverages these rules to automatically generate visualisations of biosynthetic models has not yet been developed. To enable high-quality automated visualisations of natural product biosynthetic pathways, we developed RAIChU (Reaction Analysis through Illustrating Chemical Units), which produces depictions of biosynthetic transformations of PKS, NRPS, and hybrid PKS/NRPS systems from predicted or experimentally verified module architectures and domain substrate specificities. RAIChU also boasts a library of functions to perform and visualise reactions and pathways whose specifics (e.g., regioselectivity, stereoselectivity) are still difficult to predict, including terpenes, ribosomally synthesised and posttranslationally modified peptides and alkaloids. Additionally, RAIChU includes 34 prevalent tailoring reactions to enable the visualisation of biosynthetic pathways of fully maturated natural products. RAIChU can be integrated into Python pipelines, allowing users to upload and edit results from antiSMASH, a widely used BGC detection and annotation tool, or to build biosynthetic PKS/NRPS systems from scratch. RAIChU’s cluster drawing correctness (100%) and drawing readability (97.66%) were validated on 5000 randomly generated PKS/NRPS systems, and on the MIBiG database. The automated visualisation of these pathways accelerates the generation of biosynthetic models, facilitates the analysis of large (meta-) genomic datasets and reduces human error. RAIChU is available at https://github.com/BTheDragonMaster/RAIChU and https://pypi.org/project/raichu.
Scientific contribution: RAIChU is the first software package capable of automating high-quality visualisations of natural product biosynthetic pathways. By leveraging universal biosynthetic rules, RAIChU enables the depiction of complex biosynthetic transformations for PKS, NRPS, ribosomally synthesised and posttranslationally modified peptide (RiPP), terpene and alkaloid systems, enhancing predictive and analytical capabilities. This innovation not only streamlines the creation of biosynthetic models, making the analysis of large genomic datasets more efficient and accurate, but also bridges a crucial gap in predicting and visualising the complexities of natural product biosynthesis.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00898-x
Citations: 0
Evaluating the generalizability of graph neural networks for predicting collision cross section
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-29 DOI: 10.1186/s13321-024-00899-w
Chloe Engler Hart, António José Preto, Shaurya Chanana, David Healey, Tobias Kind, Daniel Domingo-Fernández
Abstract: Ion Mobility coupled with Mass Spectrometry (IM-MS) is a promising analytical technique that enhances molecular characterization by measuring collision cross-section (CCS) values, which are indicative of the molecular size and shape. However, the effective application of CCS values in structural analysis is still constrained by the limited availability of experimental data, necessitating the development of accurate machine learning (ML) models for in silico predictions. In this study, we evaluated state-of-the-art Graph Neural Networks (GNNs), trained to predict CCS values using the largest publicly available dataset to date. Although our results confirm the high accuracy of these models within chemical spaces similar to their training environments, their performance significantly declines when applied to structurally novel regions. This discrepancy raises concerns about the reliability of in silico CCS predictions and underscores the need for releasing further publicly available CCS datasets. To mitigate this, we introduce Mol2CCS, which demonstrates how generalization can be partially improved by extending models to account for additional features such as molecular fingerprints, descriptors, and the molecule types. Lastly, we also show how confidence models can help by enhancing the reliability of the CCS estimates.
Scientific contribution: We have benchmarked state-of-the-art graph neural networks for predicting collision cross section. Our work highlights the accuracy of these models when trained and evaluated in similar chemical spaces, but also how their accuracy drops when evaluated in structurally novel regions. Lastly, we conclude by presenting potential approaches to mitigate this issue.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00899-w
Citations: 0
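The abstract above contrasts performance in chemical space similar to the training data with performance in structurally novel regions. One common way to make that distinction concrete is nearest-neighbour Tanimoto similarity to the training set; the sketch below illustrates the idea only and is not the protocol used in the paper. The fingerprint representation (sets of on-bit indices), the function names, and the 0.4 threshold are all assumptions chosen for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here: two empty fingerprints count as identical
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def split_by_novelty(train_fps, test_fps, threshold=0.4):
    """Partition test indices into (similar, novel).

    A test molecule is 'novel' when its most similar training molecule
    falls below the (hypothetical) similarity threshold.
    """
    similar, novel = [], []
    for i, fp in enumerate(test_fps):
        nearest = max((tanimoto(fp, t) for t in train_fps), default=0.0)
        (similar if nearest >= threshold else novel).append(i)
    return similar, novel


# Toy fingerprints, not real molecules.
train = [{1, 2, 3, 4}, {10, 11, 12}]
test = [{1, 2, 3, 5}, {20, 21, 22}]
similar, novel = split_by_novelty(train, test)
print(similar, novel)
```

Reporting a model's error separately on the two index lists is one simple way to expose the kind of accuracy drop the authors describe.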
BuildAMol: a versatile Python toolkit for fragment-based molecular design
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-25 DOI: 10.1186/s13321-024-00900-6
Noah Kleinschmidt, Thomas Lemmin
Abstract: In recent years computational methods for molecular modeling have become a prime focus of computational biology and cheminformatics. Many dedicated systems exist for modeling specific classes of molecules such as proteins or small drug-like ligands. These are often heavily tailored toward the automated generation of molecular structures based on some meta-input by the user and are not intended for expert-driven structure assembly. Dedicated manual or semi-automated assembly software tools exist for a variety of molecule classes but are limited in the scope of structures they can produce. In this work we present BuildAMol, a highly flexible and extendable, general-purpose fragment-based molecular assembly toolkit. Written in Python and featuring a well-documented, user-friendly API, BuildAMol empowers researchers with a framework for detailed manual or semi-automated construction of diverse molecular models. Unlike specialized software, BuildAMol caters to a broad range of applications. We demonstrate its versatility across various use cases, encompassing the generation of metal complexes, the modeling of dendrimers, and integration into a drug discovery pipeline. By providing a robust foundation for expert-driven model building, BuildAMol holds promise as a valuable tool for the continuous integration and advancement of powerful deep learning techniques.
Scientific contribution: BuildAMol introduces a cutting-edge framework for molecular modeling that seamlessly blends versatility with user-friendly accessibility. This innovative toolkit integrates modeling, modification, optimization, and visualization functions within a unified API, and facilitates collaboration with other cheminformatics libraries. BuildAMol, with its shallow learning curve, serves as a versatile tool for various molecular applications while also laying the groundwork for the development of specialized software tools, contributing to the progress of molecular research and innovation.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00900-6
Citations: 0
Deep learning of multimodal networks with topological regularization for drug repositioning
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-23 DOI: 10.1186/s13321-024-00897-y
Yuto Ohnuki, Manato Akiyama, Yasubumi Sakakibara
Abstract:
Motivation: Computational techniques for drug-disease prediction are essential in enhancing drug discovery and repositioning. While many methods utilize multimodal networks from various biological databases, few integrate comprehensive multi-omics data, including transcriptomes, proteomes, and metabolomes. We introduce STRGNN, a novel graph deep learning approach that predicts drug-disease relationships using extensive multimodal networks comprising proteins, RNAs, metabolites, and compounds. We have constructed a detailed dataset incorporating multi-omics data and developed a learning algorithm with topological regularization. This algorithm selectively leverages informative modalities while filtering out redundancies.
Results: STRGNN demonstrates superior accuracy compared to existing methods and has identified several novel drug effects, corroborating existing literature. STRGNN emerges as a powerful tool for drug prediction and discovery. The source code for STRGNN, along with the dataset for performance evaluation, is available at https://github.com/yuto-ohnuki/STRGNN.git.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00897-y
Citations: 0
Automatic molecular fragmentation by evolutionary optimisation
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-19 DOI: 10.1186/s13321-024-00896-z
Fiona C. Y. Yu, Jorge L. Gálvez Vallejo, Giuseppe M. J. Barca
Abstract: Molecular fragmentation is an effective suite of approaches to reduce the formal computational complexity of quantum chemistry calculations while enhancing their algorithmic parallelisability. However, the practical applicability of fragmentation techniques remains hindered by a dearth of automation and effective metrics to assess the quality of a fragmentation scheme. In this article, we present the Quick Fragmentation via Automated Genetic Search (QFRAGS), a novel automated fragmentation algorithm that uses a genetic optimisation procedure to generate molecular fragments that yield low energy errors when adopted in Many Body Expansions (MBEs). Benchmark testing of QFRAGS on protein systems with fewer than 500 atoms, using two-body (MBE2) and three-body (MBE3) MBE calculations at the HF/6-31G* level, reveals mean absolute energy errors (MAEE) of 20.6 and 2.2 kJ mol⁻¹, respectively. For larger protein systems exceeding 500 atoms, MAEEs are 181.5 kJ mol⁻¹ for MBE2 and 24.3 kJ mol⁻¹ for MBE3. Furthermore, when compared to three manual fragmentation schemes on a 40-protein dataset, using both MBE and Fragment Molecular Orbital techniques, QFRAGS achieves comparable or often lower MAEEs. When applied to a 10-lipoglycan/glycolipid dataset, MAEEs of 7.9 and 0.3 kJ mol⁻¹ were observed at the MBE2 and MBE3 levels, respectively.
Scientific contribution: This article presents the Quick Fragmentation via Automated Genetic Search (QFRAGS), an innovative molecular fragmentation algorithm that significantly improves upon existing molecular fragmentation approaches by specifically addressing their lack of automation and effective fragmentation quality metrics. With an evolutionary optimisation strategy, QFRAGS actively pursues high quality fragments, generating fragmentation schemes that exhibit minimal energy errors on systems with hundreds to thousands of atoms. The advent of QFRAGS represents a significant advancement in molecular fragmentation, greatly improving the accessibility and computational feasibility of accurate quantum chemistry calculations.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00896-z
Citations: 0
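The MBE2 and MBE3 energies quoted in the abstract above refer to a many-body expansion truncated at two-body and three-body terms. As a minimal sketch of the standard textbook expansion (not QFRAGS itself, and not the authors' code), the total energy is approximated by summing fragment energies plus pairwise and, optionally, three-body corrections obtained by inclusion-exclusion. The function name, the dictionary layout, and the toy energies below are assumptions made for illustration.

```python
from itertools import combinations


def mbe_energy(energies, n_fragments, order):
    """Many-body expansion truncated at `order`.

    `energies` maps a sorted tuple of fragment indices, e.g. (0, 2),
    to the energy of the sub-system formed by those fragments.
    """
    def correction(subset):
        # n-body correction: energy of the sub-system minus the
        # corrections of all of its proper, non-empty subsets.
        total = energies[subset]
        for k in range(1, len(subset)):
            for sub in combinations(subset, k):
                total -= correction(sub)
        return total

    return sum(
        correction(subset)
        for n in range(1, order + 1)
        for subset in combinations(range(n_fragments), n)
    )


# Toy energies in arbitrary units, chosen only to illustrate the arithmetic.
toy = {
    (0,): -1.0, (1,): -2.0, (2,): -3.0,
    (0, 1): -3.1, (0, 2): -4.2, (1, 2): -5.3,
    (0, 1, 2): -6.65,
}
mbe2 = mbe_energy(toy, 3, order=2)
mbe3 = mbe_energy(toy, 3, order=3)
error_mbe2 = abs(mbe2 - toy[(0, 1, 2)])
print(mbe2, mbe3, error_mbe2)
```

For a three-fragment system the expansion at third order is exact by construction, so the interesting quantity is the truncation error at lower order; averaging such absolute errors over a benchmark set gives a mean absolute energy error of the kind the abstract reports.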
Democratizing cheminformatics: interpretable chemical grouping using an automated KNIME workflow
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-16 DOI: 10.1186/s13321-024-00894-1
José T. Moreira-Filho, Dhruv Ranganath, Mike Conway, Charles Schmitt, Nicole Kleinstreuer, Kamel Mansouri
Abstract: With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for the analysis, exploration, visualization, and extraction of information from these data. One such technique is chemical grouping, where chemicals with common characteristics are categorized into distinct groups based on physicochemical properties, use, biological activity, or a combination. However, existing tools for chemical grouping often require specialized programming skills or the use of commercial software packages. To address these challenges, we developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low/no-code, data analytics platform. The workflow serves as an all-encompassing tool, expertly incorporating a range of processes such as molecular descriptor calculation, feature selection, dimensionality reduction, hyperparameter search, and supervised and unsupervised machine learning methods, enabling effective chemical grouping and visualization of results. Furthermore, we implemented tools for interpretation, identifying key molecular descriptors for the chemical groups, and using natural language summaries to clarify the rationale behind these groupings. The workflow was designed to run seamlessly in both the KNIME local desktop version and KNIME Server WebPortal as a web application. It incorporates interactive interfaces and guides to assist users in a step-by-step manner. We demonstrate the utility of this workflow through a case study using an eye irritation and corrosion dataset.
Scientific contributions: This work presents a novel, comprehensive chemical grouping workflow in KNIME, enhancing accessibility by integrating a user-friendly graphical interface that eliminates the need for extensive programming skills. This workflow uniquely combines several features such as automated molecular descriptor calculation, feature selection, dimensionality reduction, and machine learning algorithms (both supervised and unsupervised), with hyperparameter optimization to refine chemical grouping accuracy. Moreover, we have introduced an innovative interpretative step and natural language summaries to elucidate the underlying reasons for chemical groupings, significantly advancing the usability of the tool and interpretability of the results.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00894-1
Citations: 0
Metis: a python-based user interface to collect expert feedback for generative chemistry models
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-14 DOI: 10.1186/s13321-024-00892-3
Janosch Menke, Yasmine Nahal, Esben Jannik Bjerrum, Mikhail Kabeshov, Samuel Kaski, Ola Engkvist
Abstract: One challenge that current de novo drug design models face is a disparity between the user’s expectations and the actual output of the model in practical applications. Tailoring models to better align with chemists’ implicit knowledge, expectations and preferences is key to overcoming this obstacle effectively. While interest in preference-based and human-in-the-loop machine learning in chemistry is continuously increasing, no tool currently exists that enables the collection of standardized and chemistry-specific feedback. Metis is a Python-based open-source graphical user interface (GUI), designed to solve this and enable the collection of chemists’ detailed feedback on molecular structures. The GUI enables chemists to explore and evaluate molecules, offering a user-friendly interface for annotating preferences and specifying desired or undesired structural features. By giving chemists the opportunity to provide detailed feedback, Metis allows researchers to capture the chemist’s implicit knowledge and preferences more efficiently. This knowledge is crucial to align the chemist’s idea with the de novo design agents. The GUI aims to enhance this collaboration between the human and the “machine” by providing an intuitive platform where chemists can interactively provide feedback on molecular structures, aiding in preference learning and refining de novo design strategies. Metis integrates with the existing de novo framework REINVENT, creating a closed-loop system where human expertise can continuously inform and refine the generative models.
Scientific contribution: We introduce a novel graphical user interface that allows chemists/researchers to give detailed feedback on substructures and properties of small molecules. This tool can be used to learn the preferences of chemists in order to align de novo drug design models with the chemist’s ideas. The GUI can be customized to fit different needs and projects and enables direct integration into de novo REINVENT runs. We believe that Metis can facilitate the discussion and development of novel ways to integrate human feedback that go beyond binary decisions of liking or disliking a molecule.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00892-3
Citations: 0
Geometric deep learning for molecular property predictions with chemical accuracy across chemical space
IF 7.1 · Zone 2 · Chemistry
Journal of Cheminformatics Pub Date : 2024-08-13 DOI: 10.1186/s13321-024-00895-0
Maarten R. Dobbelaere, István Lengyel, Christian V. Stevens, Kevin M. Van Geem
Abstract: Chemical engineers heavily rely on precise knowledge of physicochemical properties to model chemical processes. Despite the growing popularity of deep learning, it is only rarely applied for property prediction due to data scarcity and limited accuracy for compounds in industrially-relevant areas of the chemical space. Herein, we present a geometric deep learning framework for predicting gas- and liquid-phase properties based on novel quantum chemical datasets comprising 124,000 molecules. Our findings reveal that the necessity for quantum-chemical information in deep learning models varies significantly depending on the modeled physicochemical property. Specifically, our top-performing geometric model meets the most stringent criteria for “chemically accurate” thermochemistry predictions. We also show that by carefully selecting the appropriate model featurization and evaluating prediction uncertainties, the reliability of the predictions can be strongly enhanced. These insights represent a crucial step towards establishing deep learning as the standard property prediction workflow in both industry and academia.
Scientific contribution: We propose a flexible property prediction tool that can handle two-dimensional and three-dimensional molecular information. A thermochemistry prediction methodology that achieves high-level quantum chemistry accuracy for a broad application range is presented. Trained deep learning models and large novel molecular databases of real-world molecules are provided to offer a directly usable and fast property prediction solution to practitioners.
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00895-0
Citations: 0
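The abstract above refers to “chemically accurate” thermochemistry predictions without stating a numeric criterion. A widely used convention treats an error of about 1 kcal/mol as chemical accuracy; the sketch below assumes that convention, and the paper's own "most stringent" criterion may differ. The function names, the default threshold, and the toy values are assumptions for illustration, not data from the paper.

```python
KCAL_TO_KJ = 4.184  # 1 thermochemical kcal = 4.184 kJ


def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equally long sequences of numbers."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def is_chemically_accurate(predicted_kj, reference_kj, threshold_kcal=1.0):
    """True if the MAE (inputs in kJ/mol) is within the assumed threshold (in kcal/mol)."""
    return mean_absolute_error(predicted_kj, reference_kj) <= threshold_kcal * KCAL_TO_KJ


# Toy values in kJ/mol, purely illustrative.
reference = [10.0, 20.0, 30.0]
predicted = [10.5, 18.9, 30.2]
mae = mean_absolute_error(predicted, reference)
print(mae, is_chemically_accurate(predicted, reference))
```

Passing a smaller `threshold_kcal` is how a stricter criterion would be expressed in this sketch.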