LAGNet: better electron density prediction for LCAO-based data and drug-like substances

Impact factor: 7.1; Zone 2 (Chemistry); JCR Q1 (Chemistry, Multidisciplinary)
Konstantin Ushenin, Kuzma Khrabrov, Artem Tsypin, Anton Ber, Egor Rumiantsev, Artur Kadurin
Journal of Cheminformatics 17 (1). Published 2025-04-29. Journal Article.
DOI: 10.1186/s13321-025-01010-7
Article: https://link.springer.com/article/10.1186/s13321-025-01010-7
PDF: https://jcheminf.biomedcentral.com/counter/pdf/10.1186/s13321-025-01010-7

Abstract

The electron density is a central object in quantum chemistry and is crucial for many downstream tasks in drug design. Recent deep learning approaches predict the electron density around a molecule from atom types and atom positions. Most of these methods use the plane wave (PW) numerical method as the source of ground-truth training data. However, the drug design field mostly uses the Linear Combination of Atomic Orbitals (LCAO) method to compute quantum properties. In this study, we focus on predicting the electron density of drug-like substances and on training neural networks with LCAO-based datasets. Our experiments show that proper handling of the large amplitudes of core orbitals is crucial for training on LCAO-based data. We propose storing the electron density on standard grids instead of a uniform grid. This reduced the number of probing points per molecule by a factor of 43 and the storage space requirements by a factor of 8. Finally, we propose a novel architecture based on the DeepDFT model, which we name LAGNet. It is specifically designed and tuned for drug-like substances and the \(\nabla^2\)DFT dataset.
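To make the grid comparison concrete, here is a minimal sketch of how the number of probing points scales for the two storage schemes. It assumes that "standard grids" means atom-centered radial-by-angular quadrature grids of the kind commonly used in DFT codes; the abstract does not define the term, so this reading is an assumption. The function names, padding, spacing, and shell sizes are hypothetical illustration values, not the settings used in the paper, and the sketch is not expected to reproduce the reported factor of 43.

```python
import math


def uniform_grid_points(coords, padding=2.0, spacing=0.1):
    """Count points of a Cartesian box grid that encloses all atoms.

    coords  : list of (x, y, z) atom positions
    padding : margin added around the molecule on every side (hypothetical)
    spacing : distance between neighbouring grid points (hypothetical)
    """
    n_total = 1
    for axis in range(3):
        lo = min(c[axis] for c in coords) - padding
        hi = max(c[axis] for c in coords) + padding
        # Points along this axis, including both ends of the box.
        n_total *= int(math.floor((hi - lo) / spacing)) + 1
    return n_total


def atom_centered_grid_points(coords, n_radial=50, n_angular=110):
    """Count points if every atom carries its own radial x angular grid.

    n_radial and n_angular are hypothetical shell sizes; real standard
    grids are usually pruned, so they hold fewer points than this product.
    """
    return len(coords) * n_radial * n_angular


if __name__ == "__main__":
    # Two atoms one length unit apart, purely for illustration.
    atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    n_uniform = uniform_grid_points(atoms)
    n_atomic = atom_centered_grid_points(atoms)
    print("uniform grid points:      ", n_uniform)
    print("atom-centered grid points:", n_atomic)
    print("ratio uniform / atomic:   ", n_uniform / n_atomic)
```

The uniform count grows with the volume of the bounding box, while the atom-centered count grows only with the number of atoms. Under the stated assumptions, this is why a per-atom grid can need far fewer points for an extended molecule.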

We propose a core suppression model for proper handling of core orbitals and for training neural networks on LCAO-based data containing atoms of periods 3 and 4. We show that using standard grids instead of a uniform grid greatly reduces the number of electron density probing points and the data storage requirements. Finally, we propose the LAGNet model, which achieves better results on drug-like substances than the equivariant DeepDFT model.
Source journal: Journal of Cheminformatics
Categories: Chemistry, Multidisciplinary; Computer Science, Information Systems
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases; molecular modelling; chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases; computer and molecular graphics; computer-aided molecular design; expert systems; QSAR; and data mining techniques.