Journal of Cheminformatics: Latest Articles

Automatic molecular fragmentation by evolutionary optimisation
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-19 DOI: 10.1186/s13321-024-00896-z
Fiona C. Y. Yu, Jorge L. Gálvez Vallejo, Giuseppe M. J. Barca
Abstract: Molecular fragmentation is an effective suite of approaches to reduce the formal computational complexity of quantum chemistry calculations while enhancing their algorithmic parallelisability. However, the practical applicability of fragmentation techniques remains hindered by a dearth of automation and effective metrics to assess the quality of a fragmentation scheme. In this article, we present the Quick Fragmentation via Automated Genetic Search (QFRAGS), a novel automated fragmentation algorithm that uses a genetic optimisation procedure to generate molecular fragments that yield low energy errors when adopted in Many Body Expansions (MBEs). Benchmark testing of QFRAGS on protein systems with fewer than 500 atoms, using two-body (MBE2) and three-body (MBE3) MBE calculations at the HF/6-31G* level, reveals mean absolute energy errors (MAEEs) of 20.6 and 2.2 kJ mol⁻¹, respectively. For larger protein systems exceeding 500 atoms, MAEEs are 181.5 kJ mol⁻¹ for MBE2 and 24.3 kJ mol⁻¹ for MBE3. Furthermore, when compared to three manual fragmentation schemes on a 40-protein dataset, using both MBE and Fragment Molecular Orbital techniques, QFRAGS achieves comparable or often lower MAEEs. When applied to a 10-lipoglycan/glycolipid dataset, MAEs of 7.9 and 0.3 kJ mol⁻¹ were observed at the MBE2 and MBE3 levels, respectively.
Scientific contribution: This article presents the Quick Fragmentation via Automated Genetic Search (QFRAGS), an innovative molecular fragmentation algorithm that significantly improves upon existing molecular fragmentation approaches by specifically addressing their lack of automation and effective fragmentation quality metrics. With an evolutionary optimisation strategy, QFRAGS actively pursues high quality fragments, generating fragmentation schemes that exhibit minimal energy errors on systems with hundreds to thousands of atoms. The advent of QFRAGS represents a significant advancement in molecular fragmentation, greatly improving the accessibility and computational feasibility of accurate quantum chemistry calculations.
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00896-z
Citations: 0
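The MBE2 and MBE3 levels named in the QFRAGS abstract above refer to truncating the many-body expansion of the total energy after the two-body or three-body correction terms. The following is a minimal sketch of that truncation, not the QFRAGS code: `mbe_energy` and `toy_energy` are hypothetical names, and `toy_energy` is a made-up additive model standing in for a real quantum chemistry calculation on a union of fragments.

```python
from itertools import combinations


def mbe_energy(fragments, energy, order=2):
    """Total energy from a many-body expansion truncated at `order` (2 or 3).

    `fragments` is a list of fragment labels. `energy` is a callable that takes
    a tuple of labels and returns the energy of that fragment union; here it is
    a hypothetical stand-in for a quantum chemistry call.
    """
    if order not in (2, 3):
        raise ValueError("order must be 2 or 3")
    # One-body terms: E_I
    e1 = {f: energy((f,)) for f in fragments}
    total = sum(e1.values())
    # Two-body corrections: dE_IJ = E_IJ - E_I - E_J
    d2 = {}
    for i, j in combinations(fragments, 2):
        d2[(i, j)] = energy((i, j)) - e1[i] - e1[j]
    total += sum(d2.values())
    if order == 3:
        # Three-body corrections:
        # dE_IJK = E_IJK - dE_IJ - dE_IK - dE_JK - E_I - E_J - E_K
        for i, j, k in combinations(fragments, 3):
            total += (energy((i, j, k))
                      - d2[(i, j)] - d2[(i, k)] - d2[(j, k)]
                      - e1[i] - e1[j] - e1[k])
    return total


def toy_energy(group):
    """Made-up model energy with one-, two- and three-body contributions."""
    one = {"A": -1.0, "B": -2.0, "C": -3.0, "D": -4.0}
    pair = {frozenset("AB"): -0.5, frozenset("BC"): -0.25, frozenset("CD"): -0.125}
    triple = {frozenset("ABC"): 0.0625}
    members = set(group)
    e = sum(one[f] for f in group)
    e += sum(v for k, v in pair.items() if k <= members)
    e += sum(v for k, v in triple.items() if k <= members)
    return e


if __name__ == "__main__":
    frags = ["A", "B", "C", "D"]
    exact = toy_energy(tuple(frags))
    for order in (2, 3):
        err = mbe_energy(frags, toy_energy, order) - exact
        print(f"MBE{order} error: {err:+.4f}")
```

In this toy model the two-body truncation misses exactly the single three-body term, while the three-body truncation recovers the full energy. For real systems the error depends on where the fragment boundaries are drawn, which is the quantity the abstract says QFRAGS optimises.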
Democratizing cheminformatics: interpretable chemical grouping using an automated KNIME workflow
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-16 DOI: 10.1186/s13321-024-00894-1
José T. Moreira-Filho, Dhruv Ranganath, Mike Conway, Charles Schmitt, Nicole Kleinstreuer, Kamel Mansouri
Abstract: With the increased availability of chemical data in public databases, innovative techniques and algorithms have emerged for the analysis, exploration, visualization, and extraction of information from these data. One such technique is chemical grouping, where chemicals with common characteristics are categorized into distinct groups based on physicochemical properties, use, biological activity, or a combination. However, existing tools for chemical grouping often require specialized programming skills or the use of commercial software packages. To address these challenges, we developed a user-friendly chemical grouping workflow implemented in KNIME, a free, open-source, low/no-code, data analytics platform. The workflow serves as an all-encompassing tool, expertly incorporating a range of processes such as molecular descriptor calculation, feature selection, dimensionality reduction, hyperparameter search, and supervised and unsupervised machine learning methods, enabling effective chemical grouping and visualization of results. Furthermore, we implemented tools for interpretation, identifying key molecular descriptors for the chemical groups, and using natural language summaries to clarify the rationale behind these groupings. The workflow was designed to run seamlessly in both the KNIME local desktop version and KNIME Server WebPortal as a web application. It incorporates interactive interfaces and guides to assist users in a step-by-step manner. We demonstrate the utility of this workflow through a case study using an eye irritation and corrosion dataset.
Scientific contributions: This work presents a novel, comprehensive chemical grouping workflow in KNIME, enhancing accessibility by integrating a user-friendly graphical interface that eliminates the need for extensive programming skills. This workflow uniquely combines several features such as automated molecular descriptor calculation, feature selection, dimensionality reduction, and machine learning algorithms (both supervised and unsupervised), with hyperparameter optimization to refine chemical grouping accuracy. Moreover, we have introduced an innovative interpretative step and natural language summaries to elucidate the underlying reasons for chemical groupings, significantly advancing the usability of the tool and interpretability of the results.
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00894-1
Citations: 0
Metis: a python-based user interface to collect expert feedback for generative chemistry models
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-14 DOI: 10.1186/s13321-024-00892-3
Janosch Menke, Yasmine Nahal, Esben Jannik Bjerrum, Mikhail Kabeshov, Samuel Kaski, Ola Engkvist
Abstract: One challenge that current de novo drug design models face is a disparity between the user’s expectations and the actual output of the model in practical applications. Tailoring models to better align with chemists’ implicit knowledge, expectations and preferences is key to overcoming this obstacle effectively. While interest in preference-based and human-in-the-loop machine learning in chemistry is continuously increasing, no tool currently exists that enables the collection of standardized and chemistry-specific feedback. Metis is a Python-based open-source graphical user interface (GUI), designed to solve this and enable the collection of chemists’ detailed feedback on molecular structures. The GUI enables chemists to explore and evaluate molecules, offering a user-friendly interface for annotating preferences and specifying desired or undesired structural features. By giving chemists the opportunity to provide detailed feedback, it allows researchers to capture the chemist’s implicit knowledge and preferences more efficiently. This knowledge is crucial to align the chemist’s idea with the de novo design agents. The GUI aims to enhance this collaboration between the human and the “machine” by providing an intuitive platform where chemists can interactively provide feedback on molecular structures, aiding in preference learning and refining de novo design strategies. Metis integrates with the existing de novo framework REINVENT, creating a closed-loop system where human expertise can continuously inform and refine the generative models.
Scientific contribution: We introduce a novel graphical user interface that allows chemists and researchers to give detailed feedback on substructures and properties of small molecules. This tool can be used to learn the preferences of chemists in order to align de novo drug design models with the chemist’s ideas. The GUI can be customized to fit different needs and projects and enables direct integration into de novo REINVENT runs. We believe that Metis can facilitate the discussion and development of novel ways to integrate human feedback that goes beyond binary decisions of liking or disliking a molecule.
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00892-3
Citations: 0
Geometric deep learning for molecular property predictions with chemical accuracy across chemical space
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-13 DOI: 10.1186/s13321-024-00895-0
Maarten R. Dobbelaere, István Lengyel, Christian V. Stevens, Kevin M. Van Geem
Abstract: Chemical engineers heavily rely on precise knowledge of physicochemical properties to model chemical processes. Despite the growing popularity of deep learning, it is only rarely applied for property prediction due to data scarcity and limited accuracy for compounds in industrially-relevant areas of the chemical space. Herein, we present a geometric deep learning framework for predicting gas- and liquid-phase properties based on novel quantum chemical datasets comprising 124,000 molecules. Our findings reveal that the necessity for quantum-chemical information in deep learning models varies significantly depending on the modeled physicochemical property. Specifically, our top-performing geometric model meets the most stringent criteria for “chemically accurate” thermochemistry predictions. We also show that by carefully selecting the appropriate model featurization and evaluating prediction uncertainties, the reliability of the predictions can be strongly enhanced. These insights represent a crucial step towards establishing deep learning as the standard property prediction workflow in both industry and academia.
Scientific contribution: We propose a flexible property prediction tool that can handle two-dimensional and three-dimensional molecular information. A thermochemistry prediction methodology that achieves high-level quantum chemistry accuracy for a broad application range is presented. Trained deep learning models and large novel molecular databases of real-world molecules are provided to offer a directly usable and fast property prediction solution to practitioners.
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00895-0
Citations: 0
MolCompass: multi-tool for the navigation in chemical space and visual validation of QSAR/QSPR models
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-12 DOI: 10.1186/s13321-024-00888-z
Sergey Sosnin
Abstract: The exponential growth of data is challenging for humans because their ability to analyze data is limited. Especially in chemistry, there is a demand for tools that can visualize molecular datasets in a convenient graphical way. We propose a new, ready-to-use, multi-tool, and open-source framework for visualizing and navigating chemical space. This framework adheres to the low-code/no-code (LCNC) paradigm, providing a KNIME node, a web-based tool, and a Python package, making it accessible to a broad cheminformatics community. The core technique of the MolCompass framework employs a pre-trained parametric t-SNE model. We demonstrate how this framework can be adapted for the visualisation of chemical space and visual validation of binary classification QSAR/QSPR models, revealing their weaknesses and identifying model cliffs. All parts of the framework are publicly available on GitHub, providing accessibility to the broad scientific community.
Scientific contribution: We provide an open-source, ready-to-use set of tools for the visualization of chemical space. These tools can be insightful for chemists to analyze compound datasets and for the visual validation of QSAR/QSPR models.
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00888-z
Citations: 0
Building shape-focused pharmacophore models for effective docking screening
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-09 DOI: 10.1186/s13321-024-00857-6
Paola Moyano-Gómez, Jukka V. Lehtonen, Olli T. Pentikäinen, Pekka A. Postila
Abstract: The performance of molecular docking can be improved by comparing the shape similarity of the flexibly sampled poses against the target proteins’ inverted binding cavities. The effectiveness of these pseudo-ligands or negative image-based models in docking rescoring is boosted further by performing enrichment-driven optimization. Here, we introduce a novel shape-focused pharmacophore modeling algorithm O-LAP that generates a new class of cavity-filling models by clumping together overlapping atomic content via pairwise distance graph clustering. Top-ranked poses of flexibly docked active ligands were used as the modeling input and multiple alternative clustering settings were benchmark-tested thoroughly with five demanding drug targets using random training/test divisions. In docking rescoring, the O-LAP modeling typically improved massively on the default docking enrichment; furthermore, the results indicate that the clustered models work well in rigid docking. The C++/Qt5-based algorithm O-LAP is released under the GNU General Public License v3.0 via GitHub (https://github.com/jvlehtonen/overlap-toolkit).
Open-access PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-024-00857-6
Citations: 0
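The O-LAP abstract above describes clumping overlapping atoms by clustering a pairwise distance graph. As a rough illustration of that general idea only, the sketch below links any two points closer than a cutoff, takes the connected components of the resulting graph, and replaces each component with its centroid. It is written in Python rather than the C++/Qt5 of the released tool, it is not the O-LAP algorithm, and the function name `clump_atoms`, the single-linkage rule and the cutoff value are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def clump_atoms(coords, cutoff):
    """Merge points linked by distances below `cutoff` into centroids.

    Hypothetical single-linkage sketch: two points belong to the same clump
    if a chain of sub-cutoff distances connects them.
    """
    parent = list(range(len(coords)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Add an edge (union) for every pair closer than the cutoff.
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) < cutoff:
            parent[find(i)] = find(j)

    # Collect the members of each connected component.
    clusters = {}
    for i, point in enumerate(coords):
        clusters.setdefault(find(i), []).append(point)

    # Represent each component by the mean of its coordinates.
    return [tuple(sum(axis) / len(points) for axis in zip(*points))
            for points in clusters.values()]


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (0.4, 0.0, 0.0),   # overlapping pair
             (5.0, 0.0, 0.0), (5.0, 0.2, 0.0),   # second overlapping pair
             (10.0, 10.0, 10.0)]                 # isolated atom
    for centroid in clump_atoms(atoms, cutoff=1.0):
        print(centroid)
```

Per the abstract, the authors benchmark several alternative clustering settings, so the linkage rule and cutoff used here should be read as placeholders.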
Evaluation of reinforcement learning in transformer-based molecular design
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-08 DOI: 10.1186/s13321-024-00887-0
Jiazhen He, Alessandro Tibo, Jon Paul Janet, Eva Nittinger, Christian Tyrchan, Werngard Czechtizky, Ola Engkvist
Abstract: Designing compounds with a range of desirable properties is a fundamental challenge in drug discovery. In pre-clinical early drug discovery, novel compounds are often designed based on an already existing promising starting compound through structural modifications for further property optimization. Recently, transformer-based deep learning models have been explored for the task of molecular optimization by training on pairs of similar molecules. This provides a starting point for generating similar molecules to a given input molecule, but has limited flexibility regarding user-defined property profiles. Here, we evaluate the effect of reinforcement learning on transformer-based molecular generative models. The generative model can be considered as a pre-trained model with knowledge of the chemical space close to an input compound, while reinforcement learning can be viewed as a tuning phase, steering the model towards chemical space with user-specific desirable properties. The evaluation of two distinct tasks, molecular optimization and scaffold discovery, suggests that reinforcement learning could guide the transformer-based generative model towards the generation of more compounds of interest. Additionally, the impact of pre-trained models, learning steps and learning rates is investigated.
Scientific contribution: Our study investigates the effect of reinforcement learning on a transformer-based generative model initially trained for generating molecules similar to starting molecules. The reinforcement learning framework is applied to facilitate multiparameter optimisation of starting molecules. This approach allows for more flexibility for optimizing user-specific property profiles and helps find more ideas of interest.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11312936/pdf/
Citations: 0
An automated calculation pipeline for differential pair interaction energies with molecular force fields using the Tinker Molecular Modeling Package
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-08 DOI: 10.1186/s13321-024-00890-5
Felix Bänsch, Mirco Daniel, Harald Lanig, Christoph Steinbeck, Achim Zielesny
Abstract: An automated pipeline for comprehensive calculation of intermolecular interaction energies based on molecular force-fields using the Tinker molecular modelling package is presented. Starting with non-optimized chemically intuitive monomer structures, the pipeline allows the approximation of global minimum energy monomers and dimers, configuration sampling for various monomer–monomer distances, estimation of coordination numbers by molecular dynamics simulations, and the evaluation of differential pair interaction energies. The latter are used to derive Flory–Huggins parameters and isotropic particle–particle repulsions for Dissipative Particle Dynamics (DPD). The computational results for force fields MM3, MMFF94, OPLS-AA and AMOEBA09 are analyzed with Density Functional Theory (DFT) calculations and DPD simulations for a mixture of the non-ionic polyoxyethylene alkyl ether surfactant C₁₀E₄ with water to demonstrate the usefulness of the approach.
Scientific contribution: To our knowledge, there is currently no open computational pipeline for differential pair interaction energies at all. This work aims to contribute an (at least academically available, open) approach based on molecular force fields that provides a robust and efficient computational scheme for their automated calculation for small to medium-sized (organic) molecular dimers. The usefulness of the proposed new calculation scheme is demonstrated for the generation of mesoscopic particles with their mutual repulsive interactions.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11312682/pdf/
Citations: 0
Hamiltonian diversity: effectively measuring molecular diversity by shortest Hamiltonian circuits
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-07 DOI: 10.1186/s13321-024-00883-4
Xiuyuan Hu, Guoqing Liu, Quanming Yao, Yang Zhao, Hao Zhang
Abstract: In recent years, significant advancements have been made in molecular generation algorithms aimed at facilitating drug development, and molecular diversity holds paramount importance within the realm of molecular generation. Nonetheless, the effective quantification of molecular diversity remains an elusive challenge, as extant metrics exemplified by Richness and Internal Diversity fall short in concurrently encapsulating the two main aspects of such diversity: quantity and dissimilarity. To address this quandary, we propose Hamiltonian diversity, a novel molecular diversity metric predicated upon the shortest Hamiltonian circuit. This metric embodies both aspects of molecular diversity in principle, and we implement its calculation with high efficiency and accuracy. Furthermore, through empirical experiments we demonstrate the high consistency of Hamiltonian diversity with real-world chemical diversity, and substantiate its effects in promoting diversity of molecular generation algorithms. Our implementation of Hamiltonian diversity in Python is available at: https://github.com/HXYfighter/HamDiv.
Scientific contribution: We propose a more rational molecular diversity metric for the community of cheminformatics and drug development. This metric can be applied to evaluation of existing molecular generation methods and enhancing drug design algorithms.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11308660/pdf/
Citations: 0
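The Hamiltonian diversity abstract above defines the metric through the shortest Hamiltonian circuit over a set of molecules. A minimal sketch of that definition follows, under stated assumptions: it enumerates every circuit by brute force, which is only feasible for a handful of molecules, and it represents each molecule as a plain Python set of feature identifiers with a Tanimoto (Jaccard) distance. Neither choice is taken from the authors' implementation in the HamDiv repository; the function names are hypothetical.

```python
from itertools import permutations


def tanimoto_distance(a, b):
    """1 - |A & B| / |A | B| for two sets of feature identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def shortest_hamiltonian_circuit(dist):
    """Length of the shortest closed tour visiting every node exactly once.

    `dist` is a symmetric square matrix (list of lists). Brute force over all
    orderings, so the cost grows factorially with the number of nodes.
    """
    n = len(dist)
    if n < 2:
        return 0.0
    best = float("inf")
    # Fix node 0 as the start so rotations of the same tour are not repeated.
    for order in permutations(range(1, n)):
        tour = (0,) + order + (0,)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        best = min(best, length)
    return best


def hamiltonian_diversity(feature_sets):
    """Shortest-circuit length under the Tanimoto distance (illustrative)."""
    dist = [[tanimoto_distance(a, b) for b in feature_sets]
            for a in feature_sets]
    return shortest_hamiltonian_circuit(dist)


if __name__ == "__main__":
    # Toy "fingerprints": sets of made-up feature ids, not real molecules.
    mols = [{1, 2}, {2, 3}, {3, 4}]
    print(hamiltonian_diversity(mols))
    # Adding a dissimilar molecule cannot shorten the circuit.
    print(hamiltonian_diversity(mols + [{7, 8, 9}]))
```

Because the circuit must visit every molecule, its length grows with both the number of molecules and their mutual dissimilarity, which is how a single number can reflect the two aspects the abstract names. Exact enumeration is a travelling-salesman problem, so any practical implementation would need a faster solver or heuristic.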
Advancements in biotransformation pathway prediction: enhancements, datasets, and novel functionalities in enviPath
IF 7.1, Zone 2, Chemistry
Journal of Cheminformatics Pub Date: 2024-08-06 DOI: 10.1186/s13321-024-00881-6
Jasmin Hafner, Tim Lorsbach, Sebastian Schmidt, Liam Brydon, Katharina Dost, Kunyang Zhang, Kathrin Fenner, Jörg Wicker
Abstract: enviPath is a widely used database and prediction system for microbial biotransformation pathways of primarily xenobiotic compounds. The data and prediction system are freely available both via a web interface and a public REST API. Since its initial release in 2016, we extended the data available in enviPath and improved the performance of the prediction system and usability of the overall system. We now provide three diverse data sets, covering microbial biotransformation in different environments and under different experimental conditions. This also enabled developing a pathway prediction model that is applicable to a more diverse set of chemicals. In the prediction engine, we implemented a new evaluation tailored towards pathway prediction, which returns a more honest and holistic view on the performance. We also implemented a novel applicability domain algorithm, which allows the user to estimate how well the model will perform on their data. Finally, we improved the implementation to speed up the overall system and provide new functionality via a plugin system.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11304562/pdf/
Citations: 0