Which molecules can challenge density-functional tight-binding methods in evaluating the energies of conformers? Investigation with machine-learning toolset

IF 0.8 · Zone 4 (Physics and Astrophysics) · Q4 (PHYSICS, APPLIED)

Low Temperature Physics · Pub Date: 2024-03-26 · DOI: 10.1063/10.0024962

Andrii Terets, Tymofii Nikolaienko

Citations: 0

Abstract

Large organic molecules and biomolecules can adopt multiple conformations, whose populations are determined by their relative energies. Identifying the energetically most favorable conformations is crucial, especially when interpreting spectroscopic experiments conducted under cryogenic conditions. When the effects of an irregular surrounding medium, such as a noble gas matrix, on the vibrational properties of molecules become important, semi-empirical (SE) quantum-chemical methods are often employed for computational simulations. Although SE methods are computationally more efficient than first-principles quantum-chemical methods, they can be inaccurate in determining conformer energies for some molecules while displaying good accuracy for others. In this study, we employ a combination of advanced machine learning techniques, such as graph neural networks, to identify molecules with the highest errors in the relative energies of conformers computed by the semi-empirical tight-binding method GFN1-xTB. The performance of three different machine learning models is assessed by comparing their predicted errors with the actual errors in conformer energies obtained via the GFN1-xTB method. We further apply the ensemble machine-learning model to a larger collection of molecules from the ChEMBL database and identify a set of molecules that are challenging for the GFN1-xTB method. These molecules can guide further improvement of the GFN1-xTB method and showcase the capability of machine learning models to identify molecules that challenge its physical model.
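The quantities the abstract refers to can be illustrated with a minimal sketch. The abstract does not state the units, the error metric, or how the ensemble combines its three models, so everything here is an assumption made for illustration: energies are taken to be in kcal/mol, the per-molecule error is taken as the root-mean-square deviation between mean-centered conformer energies from a reference method and from the semi-empirical method, and the ensemble prediction is taken as a plain average. All function names are hypothetical.

```python
import math

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption.
K_B_KCAL = 0.0019872041


def boltzmann_populations(energies, temperature):
    """Fractional conformer populations at `temperature` (in K)."""
    e_min = min(energies)
    # Shift by the minimum so the exponent is never positive.
    weights = [math.exp(-(e - e_min) / (K_B_KCAL * temperature))
               for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def conformer_energy_error(e_ref, e_se):
    """Assumed error metric: RMS deviation between mean-centered energies.

    Centering removes the constant offset between the two methods, so only
    differences in *relative* conformer energies contribute.
    """
    n = len(e_ref)
    mean_ref = sum(e_ref) / n
    mean_se = sum(e_se) / n
    squared = [((a - mean_ref) - (b - mean_se)) ** 2
               for a, b in zip(e_ref, e_se)]
    return math.sqrt(sum(squared) / n)


def rank_by_ensemble_error(predictions, top_k):
    """Rank molecules by the mean of per-model predicted errors.

    `predictions` maps a molecule label to a list of predicted errors,
    one per model. Averaging is an assumed way of forming the ensemble.
    """
    def ensemble(label):
        values = predictions[label]
        return sum(values) / len(values)

    return sorted(predictions, key=ensemble, reverse=True)[:top_k]


if __name__ == "__main__":
    # Made-up energies for three conformers of one molecule.
    reference = [0.0, 1.0, 2.0]
    semi_empirical = [0.0, 2.0, 4.0]
    print(boltzmann_populations(reference, 10.0))
    print(conformer_energy_error(reference, semi_empirical))
    print(rank_by_ensemble_error(
        {"A": [0.1, 0.2, 0.3], "B": [1.0, 1.2, 0.8], "C": [0.5, 0.5, 0.5]},
        top_k=2,
    ))
```

At cryogenic temperatures the Boltzmann weights concentrate almost entirely on the lowest-energy conformer, which is why an error of even 1 kcal/mol in relative energies can change which conformer a calculation predicts to be observed.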

Source journal

Low Temperature Physics (Physics: Applied)

CiteScore

1.20

Self-citation rate

25.00%

Articles published

138

Review time

3 months

Journal description: Guided by an international editorial board, Low Temperature Physics (LTP) communicates the results of important experimental and theoretical studies conducted at low temperatures. LTP offers key work in such areas as superconductivity, magnetism, lattice dynamics, quantum liquids and crystals, cryocrystals, low-dimensional and disordered systems, electronic properties of normal metals and alloys, and critical phenomena. The journal publishes original articles on new experimental and theoretical results as well as review articles, brief communications, memoirs, and biographies. Low Temperature Physics, a translation of the copyrighted journal FIZIKA NIZKIKH TEMPERATUR, is a monthly journal containing English reports of current research in the field of low temperature physics. The translation began with the 1975 issues. One volume is published annually, beginning with the January issue.