Journal of Cheminformatics: Latest Articles (Page 9)

Addressing standardization and semantics in an electronic lab notebook for multidisciplinary use: LabIMotion

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-14 DOI: 10.1186/s13321-025-01021-4

Chia-Lin Lin, Pei-Chi Huang, Christof Wöll, Patrick Théato, Christian Kübel, Lena Pilz, Nicole Jung, Stefan Bräse

Abstract: This work presents the LabIMotion extension for the Chemotion Electronic Lab Notebook (ELN), expanding its capabilities from organic chemistry to support interdisciplinary research and enabling the description of workflows. LabIMotion enhances documentation by introducing customizable components structured across three levels (Elements, Segments, and Datasets), enabling flexible, hierarchical organization and reuse of data. Through the integration of links to ontologies, the extension ensures precise, machine-readable data, promoting interoperability and adherence to FAIR principles. The extension features an intuitive, user-friendly interface that allows researchers to easily create new ELN content by leveraging a set of customizable, generic methods. Scientists can set up new data fields, link data fields, or establish workflows, and the extension translates those needs directly into usable functionality at their command. Through this high degree of flexibility, a wide range of specific research needs can be met. The LabIMotion Hub plays a crucial role in distributing and updating components, fostering standardization, and enabling collaborative development within scientific communities. These advancements significantly improve the ELN's adaptability, usability, and relevance across various research disciplines.

Scientific contribution: This work demonstrates how research data management systems can be designed to support discipline-specific requirements in chemistry research while offering high flexibility and interoperability to deal with interdisciplinary work. The developed software, LabIMotion, offers a versatile approach for integrating novel research aspects into a research data environment, fostering bottom-up processes for defining schemas and standardizing scientific workflows. In particular, the software's support for community-driven extensions, combined with a clear definition of content and its assignment to ontology terms, provides unique advantages for creating adaptable tools suited to the complexities of the scientific environment.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01021-4

Citations: 0

Generalizable, fast, and accurate DeepQSPR with fastprop

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-13 DOI: 10.1186/s13321-025-01013-4

Jackson W. Burns, William H. Green

Abstract: Quantitative Structure–Property Relationship studies (QSPR), often referred to interchangeably as QSAR, seek to establish a mapping between molecular structure and an arbitrary target property. Historically this was done on a target-by-target basis, with new descriptors being devised to map specifically to a given target. Today software packages exist that calculate thousands of these descriptors, enabling general modeling, typically with classical and machine learning methods. Also present today are learned representation methods, in which deep learning models generate a target-specific representation during training. The former requires less training data and offers improved speed and interpretability, the latter offers excellent generality, and the intersection of the two remains under-explored. This paper introduces fastprop, a software package and general Deep-QSPR framework that combines a cogent set of molecular descriptors with deep learning to achieve state-of-the-art performance on datasets ranging from tens to tens of thousands of molecules. fastprop provides both a user-friendly Command Line Interface and a highly interoperable set of Python modules for the training and deployment of feedforward neural networks for property prediction. This approach yields improvements in speed and interpretability over existing methods while statistically equaling or exceeding their performance across most of the tested benchmarks. fastprop is designed with Research Software Engineering best practices and is free and open source, hosted at github.com/jacksonburns/fastprop.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01013-4

Citations: 0

Generating diversity and securing completeness in algorithmic retrosynthesis

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-13 DOI: 10.1186/s13321-025-00981-x

Florian Mrugalla, Christopher Franz, Yannic Alber, Georg Mogk, Martín Villalba, Thomas Mrziglod, Kevin Schewior

Abstract: Chemical synthesis planning has considerably benefited from advances in the field of machine learning. Neural networks can reliably and accurately predict reactions leading to a given, possibly complex, molecule. In this work we focus on algorithms for assembling such predictions into a full synthesis plan that, starting from simple building blocks, produces a given target molecule, a procedure known as retrosynthesis. Objective functions for this task are hard to define and context-specific. In order to generate a diverse set of synthesis plans for chemists to select from, we capture the concept of diversity in a novel chemical diversity score (CDS). Our experiments show that our algorithm outperforms the algorithm predominantly employed in this domain, Monte-Carlo Tree Search, with respect to diversity in terms of our score as well as time efficiency.

We adapt Depth-First Proof-Number Search (DFPN) and its variants, which have been applied to retrosynthesis before, to produce a set of solutions, with an explicit focus on diversity. The accompanying source code is available at https://github.com/Bayer-Group/bayer-retrosynthesis-search. We also make progress on understanding DFPN in terms of completeness, i.e., the ability to find a solution whenever there exists one. DFPN is known to be incomplete, for which we provide a much cleaner example, but we also show that it is complete when reinforced with a threshold-controlling routine from the literature.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-00981-x
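For readers unfamiliar with proof-number search, the sketch below illustrates the standard proof/disproof-number bookkeeping that DFPN-style algorithms rely on, applied to a toy AND/OR tree. It is not taken from the paper or its repository. The mapping of molecules to OR nodes (any one reaction suffices) and reactions to AND nodes (all precursors are needed), the `Node` class, and the example tree are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from math import inf
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical AND/OR tree node (illustrative, not the paper's data structure)."""
    kind: str                               # "OR" = molecule, "AND" = reaction
    children: List["Node"] = field(default_factory=list)
    solved: Optional[bool] = None           # leaf status: True, False, or None (unknown)


def proof_numbers(node):
    """Return (pn, dn), the proof and disproof numbers of `node`."""
    if not node.children:
        if node.solved is True:             # e.g. a purchasable building block
            return 0, inf
        if node.solved is False:            # known dead end
            return inf, 0
        return 1, 1                         # unexpanded, status unknown
    pairs = [proof_numbers(child) for child in node.children]
    pns = [pn for pn, _ in pairs]
    dns = [dn for _, dn in pairs]
    if node.kind == "OR":
        # One provable child proves the node; all children must fail to disprove it.
        return min(pns), sum(dns)
    # AND node: all children must be proven; one disproven child disproves it.
    return sum(pns), min(dns)


# Toy tree: the target can be made by reaction A (one purchasable precursor,
# one unexplored precursor) or by reaction B (two unexplored precursors).
reaction_a = Node("AND", [Node("OR", solved=True), Node("OR")])
reaction_b = Node("AND", [Node("OR"), Node("OR")])
target = Node("OR", [reaction_a, reaction_b])

print(proof_numbers(target))  # (1, 2)
```

A search guided by these numbers would expand reaction A first, because it has the smaller proof number. How the paper extends this to return a diverse set of routes is not shown here.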

Citations: 0

The published role of artificial intelligence in drug discovery and development: a bibliometric and social network analysis from 1990 to 2023

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-08 DOI: 10.1186/s13321-025-00988-4

Murat Koçak, Zafer Akçalı

Abstract: Today, drug discovery and development is one of the fields where Artificial Intelligence (AI) is used extensively. This study therefore aims to systematically analyze the scientific literature on the application of AI in drug discovery and development to understand the evolution, trends, and key contributors within this rapidly growing field. By leveraging various bibliometric indicators and visualization techniques, we seek to explore the growth patterns, influential authors and institutions, collaboration networks, and emerging research trends within this domain. Bibliometric and network analysis methods (co-occurrence, co-authorship, collaboration, etc.) were used to achieve this goal. Bibliometric visualization tools such as the Bibliometrix R package, VOSviewer, and Litmaps were used for comprehensive data analysis. Scientific publications on AI in drug discovery and development were retrieved from the Web of Science Core Collection (WoS CC) database covering 1990–2023. In addition to visualization programs, the InCites database was also used for analysis and visualization. A total of 4059 scientific publications written by 13,932 authors and published in 1071 journals were included in the analysis.

The results reveal that the most prolific authors are Ekins (n = 67), Schneider (n = 52), Hou Tj (n = 43), and Cao Ds (n = 34), while the most active institutions are the “Chinese Academy of Science” and “University of California.” The leading scientific journals are “Journal of Chemical Information and Modelling,” “Briefings in Bioinformatics,” and “Journal of Cheminformatics.” The most frequently used author keywords include “protein folding,” “QSAR,” “gene expression data,” “coronavirus,” and “genome rearrangement.” The average number of citations per scientific publication is 28.62, indicating a high impact of research in this field. A significant increase in publications was observed after 2014, with a peak in 2022, followed by a slight decline. International collaboration accounts for 28.06% of the publications, with the USA and China leading in both productivity and influence. The study also identifies key funding organizations, such as the National Natural Science Foundation of China (NSFC) and the United States Department of Health & Human Services, which have significantly supported advancements in this field. In conclusion, this study highlights the transformative role of AI in drug discovery and development, showcasing its potential to accelerate innovation and improve efficiency. The findings provide valuable insights into the current state of research, emerging trends, and future directions, offering a roadmap for researchers, industry professionals, and policymakers to further explore and leverage AI technologies in this domain.

Scientific contribution: This study provides a comprehensive bibliometric analysis of 4,059 scientific publications (1990–2023) to map the evolution, trends, and key contrib…

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-00988-4

Citations: 0

Predicting inhibitors of OATP1B1 via heterogeneous OATP-ligand interaction graph neural network (HOLIgraph)

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-05 DOI: 10.1186/s13321-025-01020-5

Mehrsa Mardikoraem, Joelle N. Eaves, Theodore Belecciu, Nathaniel Pascual, Alexander Aljets, Bruno Hagenbuch, Erik M. Shapiro, Benjamin J. Orlando, Daniel R. Woldring

Abstract: Organic anion transporting polypeptides (OATPs) are membrane transporters crucial for drug uptake and distribution in the human body. OATPs can mediate drug-drug interactions (DDIs) in which the interaction of one drug with an OATP impairs the uptake of another drug, resulting in potentially fatal pharmacological effects. Predicting OATP-mediated DDIs is challenging, due to limited information on OATP inhibition mechanisms and inconsistent experimental OATP inhibition data across different studies. This study introduces Heterogeneous OATP-Ligand Interaction Graph Neural Network (HOLIgraph), a novel computational model that integrates molecular modeling with a graph neural network to enhance the prediction of drug-induced OATP inhibition. By combining ligand (i.e., drug) molecular features with protein-ligand interaction data from rigorous docking simulations, HOLIgraph outperforms traditional DDI prediction models which rely solely on ligand molecular features. HOLIgraph achieved a median balanced accuracy of over 90 percent when predicting inhibitors for OATP1B1, significantly outperforming purely ligand-based models. Beyond improving inhibition prediction, the data used to train HOLIgraph can enable the characterization of protein residues involved in inhibitory drug-OATP interactions. We identified certain OATP1B1 residues that preferentially interact with inhibitors, including I46 and K49. We anticipate such interaction information will be valuable to future structural and mechanistic investigations of OATP1B1.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01020-5
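The abstract reports balanced accuracy but does not define it. The sketch below assumes the usual binary definition, the mean of sensitivity and specificity, and uses made-up labels to show why the metric is stricter than plain accuracy on imbalanced inhibitor data. The function name and the example data are illustrative and are not drawn from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = inhibitor)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)   # fraction of true inhibitors recovered
    specificity = tn / (tn + fp)   # fraction of non-inhibitors recovered
    return (sensitivity + specificity) / 2


# Made-up example: 2 inhibitors among 10 compounds, one inhibitor missed.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

Plain accuracy rewards the model for the many easy non-inhibitors, while balanced accuracy weights both classes equally.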

Citations: 0

Application of 3D atom pair map in an attention model for enhanced drug virtual screening

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-05 DOI: 10.1186/s13321-025-01023-2

Gina Ryu, Wankyu Kim

Citations: 0

Prediction of blood–brain barrier and Caco-2 permeability through the Enalos Cloud Platform: combining contrastive learning and atom-attention message passing neural networks

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-05 DOI: 10.1186/s13321-025-01007-2

Nikoletta-Maria Koutroumpa, Andreas Tsoumanis, Haralambos Sarimveis, Iseult Lynch, Georgia Melagraki, Antreas Afantitis

Abstract: In this study, we introduce a novel approach for predicting two key drug properties: blood–brain barrier (BBB) permeability and human intestinal absorption via Caco-2 permeability. Our methodology centers around a specialized neural network, the atom transformer-based Message Passing Neural Network (MPNN), which we have combined with contrastive learning techniques to enhance the process of representing and embedding molecular structures for more accurate property prediction. These models focus on predicting BBB and Caco-2 permeability (two critical factors in drug absorption and distribution), which fall under the broader scope of ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties. The models are readily accessible online through the Enalos Cloud Platform, which offers a user-friendly, AI-powered, ready-to-use web service that significantly streamlines the drug design process, enabling users to easily predict and understand the behavior of potential drug compounds within the human body.

Scientific contribution: Our study combines an atom-attention Message Passing Neural Network (AA-MPNN) with contrastive learning (CL), which significantly improves predictive accuracy. Our model leverages self-supervised learning to expand the chemical space used in training and self-attention mechanisms to focus on critical molecular features, enhancing both model accuracy and interpretability. Additionally, the ready-to-use web service based on our model democratizes access to predictive tools for the scientific and regulatory communities.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01007-2

Citations: 0

Leveraging AI to explore structural contexts of post-translational modifications in drug binding

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-05-04 DOI: 10.1186/s13321-025-01019-y

Kirill E. Medvedev, R. Dustin Schaeffer, Nick V. Grishin

Abstract: Post-translational modifications (PTMs) play a crucial role in allowing cells to expand the functionality of their proteins and adaptively regulate their signaling pathways. Defects in PTMs have been linked to numerous developmental disorders and human diseases, including cancer, diabetes, heart, neurodegenerative and metabolic diseases. PTMs are important targets in drug discovery, as they can significantly influence various aspects of drug interactions including binding affinity. The structural consequences of PTMs, such as phosphorylation-induced conformational changes or their effects on ligand binding affinity, have historically been challenging to study on a large scale, primarily due to reliance on experimental methods. Recent advancements in computational power and artificial intelligence, particularly in deep learning algorithms and protein structure prediction tools like AlphaFold3, have opened new possibilities for exploring the structural context of interactions between PTMs and drugs. These AI-driven methods enable accurate modeling of protein structures including prediction of PTM-modified regions and simulation of ligand-binding dynamics on a large scale. In this work, we identified small molecule binding-associated PTMs that can influence drug binding across all human proteins listed as small molecule targets in the DrugDomain database, which we developed recently. The 6,131 identified PTMs were mapped to structural domains from the Evolutionary Classification of Protein Domains (ECOD) database.

Scientific contribution: Using recent AI-based approaches for protein structure prediction (AlphaFold3, RoseTTAFold All-Atom, Chai-1), we generated 14,178 models of PTM-modified human proteins with docked ligands. Our results demonstrate that these methods can predict PTM effects on small molecule binding, but precise evaluation of their accuracy requires a much larger benchmarking set. We also found that phosphorylation of NADPH-Cytochrome P450 Reductase, observed in cervical and lung cancer, causes significant structural disruption in the binding pocket, potentially impairing protein function. All data and generated models are available from DrugDomain database v1.1 (http://prodata.swmed.edu/DrugDomain/) and GitHub (https://github.com/kirmedvedev/DrugDomain). To our knowledge, this resource is the first to offer structural context for small molecule binding-associated PTMs on a large scale.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01019-y

Citations: 0

Improving the accuracy of prediction models for small datasets of Cytochrome P450 inhibition with deep learning

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-04-30 DOI: 10.1186/s13321-025-01015-2

Elpri Eka Permadi, Reiko Watanabe, Kenji Mizuguchi

Abstract: The cytochrome P450 (CYP) superfamily metabolises a wide range of compounds; however, drug-induced CYP inhibition can lead to adverse interactions. Identifying potential CYP inhibitors is crucial for safe drug administration. This study investigated the application of deep learning techniques to the prediction of CYP inhibition, focusing on the challenges posed by limited datasets for the CYP2B6 and CYP2C8 isoforms. To tackle these limitations, we leveraged larger datasets for related CYP isoforms, compiling comprehensive data from public databases containing IC50 values for 12,369 compounds that target seven CYP isoforms. We constructed single-task, fine-tuning, and multitask models, as well as multitask models incorporating data imputation for the missing values. Notably, the multitask models with data imputation demonstrated significant improvement in CYP inhibition prediction over the single-task models. Using the most accurate prediction models, we evaluated the inhibitory activity of approved drugs against CYP2B6 and CYP2C8. Among the 1,808 approved drugs analysed, our multitask models with data imputation identified 161 and 154 potential inhibitors of CYP2B6 and CYP2C8, respectively. This study underscores the significant potential of multitask deep learning, particularly when utilising a graph convolutional network with data imputation, to enhance the accuracy of CYP inhibition predictions under conditions of limited data availability.

Scientific contribution: This study demonstrates that even with small datasets, accurate prediction models can be constructed by utilising related data effectively. In addition, our imputation techniques for the missing values significantly improved the prediction accuracy of CYP2B6 and CYP2C8 inhibition.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01015-2
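The abstract does not say which imputation technique was used, so the following is only a generic sketch of the two common ways to handle a sparse compound-by-isoform label matrix in multitask training: masking missing labels out of the loss, or filling them in before training. Column-mean imputation here is a stand-in chosen for simplicity, and the function names and the toy values are hypothetical.

```python
def masked_mse(y_true, y_pred):
    """Mean squared error over observed (non-None) labels only."""
    total, count = 0.0, 0
    for row_true, row_pred in zip(y_true, y_pred):
        for t, p in zip(row_true, row_pred):
            if t is not None:          # skip compound/isoform pairs with no measurement
                total += (t - p) ** 2
                count += 1
    return total / count


def impute_column_mean(y):
    """Fill each missing label with the mean of the observed labels for that task."""
    n_tasks = len(y[0])
    means = []
    for j in range(n_tasks):
        observed = [row[j] for row in y if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in y]


# Toy label matrix: rows are compounds, columns are two isoforms; None = not measured.
labels = [[5.0, None],
          [6.0, 4.0],
          [None, 6.0]]
preds = [[5.0, 5.0],
         [5.0, 4.0],
         [5.0, 5.0]]

print(masked_mse(labels, preds))   # 0.5
print(impute_column_mean(labels))  # [[5.0, 5.0], [6.0, 4.0], [5.5, 6.0]]
```

Masking trains only on measured values, whereas imputation gives every task a dense target at the cost of introducing estimated labels.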

Citations: 0

LAGNet: better electron density prediction for LCAO-based data and drug-like substances

IF 7.1 | Zone 2, Chemistry

Journal of Cheminformatics Pub Date : 2025-04-29 DOI: 10.1186/s13321-025-01010-7

Konstantin Ushenin, Kuzma Khrabrov, Artem Tsypin, Anton Ber, Egor Rumiantsev, Artur Kadurin

Abstract: The electron density is an important object in quantum chemistry that is crucial for many downstream tasks in drug design. Recent deep learning approaches predict the electron density around a molecule from atom types and atom positions. Most of these methods use the plane wave (PW) numerical method as a source of ground-truth training data. However, the drug design field mostly uses the Linear Combination of Atomic Orbitals (LCAO) for computation of quantum properties. In this study, we focus on prediction of the electron density for drug-like substances and on training neural networks with LCAO-based datasets. Our experiments show that proper handling of large amplitudes of core orbitals is crucial for training on LCAO-based data. We propose to store the electron density on the standard grids instead of the uniform grid. This allowed us to reduce the number of probing points per molecule by 43 times and reduce storage space requirements by 8 times. Finally, we propose a novel architecture based on the DeepDFT model that we name LAGNet. It is specifically designed and tuned for drug-like substances and the ∇²DFT dataset.

PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01010-7

Citations: 0