Comprehensive evaluation of end-point free energy techniques in carboxylated-pillar[6]arene host–guest binding: II. regression and dielectric constant

IF 3 · Region 3 (Biology) · JCR Q3 (Biochemistry & Molecular Biology)

Journal of Computer-Aided Molecular Design Pub Date : 2022-11-17 DOI:10.1007/s10822-022-00487-w

Xiao Liu, Lei Zheng, Yalong Cong, Zhihao Gong, Zhixiang Yin, John Z. H. Zhang, Zhirong Liu, Zhaoxi Sun

Journal of Computer-Aided Molecular Design, volume 36, issue 12, pages 879–894. Publication type: Journal Article. Link: https://link.springer.com/article/10.1007/s10822-022-00487-w

Citations: 6

Abstract

End-point free energy calculations are a powerful tool that has been widely applied to protein–ligand and protein–protein interactions. These end-point techniques are generally regarded as an option of intermediate accuracy and computational cost, lying between more rigorous statistical-mechanical models (e.g., alchemical transformation) and coarser molecular docking. However, this intermediate level of accuracy does not hold in relatively simple and prototypical host–guest systems. Specifically, in our previous work on a set of carboxylated-pillar[6]arene host–guest complexes, end-point methods gave free energy estimates that deviated significantly from the experimental reference, and the ranking of binding affinities was also computed incorrectly. These observations suggest that standard end-point free energy techniques are unsuitable for host–guest systems and that modification and further development are required to make them practically usable.

In this work, we consider two ways to improve the performance of end-point techniques. The first is the PBSA_E regression, which varies the weights of the different free energy terms in the end-point calculation procedure; the second treats the interior dielectric constant as an additional variable in the end-point equation. By detailed investigation of the calculation procedure and the simulation outcome, we show that these two treatments (i.e., regression and dielectric constant) manipulate the end-point equation in a somewhat similar way, namely by weakening the electrostatic contribution and strengthening the non-polar terms, although the two methods still differ in many details. With the trained end-point scheme, the RMSE of the computed affinities is reduced from ~12 kcal/mol with the standard scheme to ~2.4 kcal/mol, which is comparable to another altered end-point method (ELIE) trained with system-specific data. Tuning the PBSA_E weighting factors with host-specific data further decreases the prediction error to ~2.1 kcal/mol.

These observations, along with the extremely efficient optimized-structure computation procedure, suggest that the regression (i.e., PBSA_E as well as its GBSA_E extension) is a practically applicable solution that brings end-point methods back into the library of usable tools for host–guest binding. The dielectric-constant-variable scheme, by contrast, cannot effectively reduce the experiment–calculation discrepancy for absolute binding affinities, but it does improve the calculation of affinity ranks. This behavior differs somewhat from the protein–ligand case and points to a difference between host–guest and biomacromolecular (protein–ligand and protein–protein) systems. Therefore, the spectrum of tools usable for protein–ligand complexes could be unsuitable for host–guest binding, and numerical validation is necessary to identify the solutions that actually work in these ‘prototypical’ situations.
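The idea of re-weighting free energy terms by regression, and of judging the result by RMSE and by affinity ranking, can be illustrated with a minimal sketch. This is not the authors' PBSA_E parameterization: the abstract does not give its functional form, so the code below assumes a plain linear least-squares fit with an intercept. The three energy-term columns, the weights used to generate the data, and all numbers are synthetic and hypothetical. The helper names (`fit_term_weights`, `predict`, `rmse`, `kendall_tau`) are introduced here for illustration only.

```python
import numpy as np


def fit_term_weights(terms, dg_ref):
    """Least-squares weights (one per energy term, plus an intercept)."""
    X = np.column_stack([terms, np.ones(len(terms))])
    coef, *_ = np.linalg.lstsq(X, dg_ref, rcond=None)
    return coef


def predict(terms, coef):
    """Apply fitted weights and intercept to a table of energy terms."""
    X = np.column_stack([terms, np.ones(len(terms))])
    return X @ coef


def rmse(pred, ref):
    """Root-mean-square error between predictions and reference values."""
    diff = np.asarray(pred) - np.asarray(ref)
    return float(np.sqrt(np.mean(diff ** 2)))


def kendall_tau(a, b):
    """Kendall tau-a: concordant minus discordant pairs over all pairs."""
    a, b = np.asarray(a), np.asarray(b)
    i, j = np.triu_indices(len(a), k=1)
    s = np.sign(a[i] - a[j]) * np.sign(b[i] - b[j])
    return float(s.sum() / len(i))


# Synthetic, hypothetical data: 12 complexes, 3 energy terms (kcal/mol).
rng = np.random.default_rng(0)
n = 12
terms = np.column_stack([
    rng.normal(-40.0, 10.0, n),  # electrostatic-like term (assumed)
    rng.normal(-20.0, 5.0, n),   # van der Waals-like term (assumed)
    rng.normal(-3.0, 1.0, n),    # non-polar solvation-like term (assumed)
])
# Made-up generating weights that down-weight the electrostatic-like term.
dg_ref = terms @ np.array([0.05, 0.3, 1.0]) + rng.normal(0.0, 0.5, n)

# Unit weights on every term versus regression-fitted weights.
dg_unit = terms.sum(axis=1)
coef = fit_term_weights(terms, dg_ref)
dg_fit = predict(terms, coef)

rmse_unit, rmse_fit = rmse(dg_unit, dg_ref), rmse(dg_fit, dg_ref)
tau_unit, tau_fit = kendall_tau(dg_unit, dg_ref), kendall_tau(dg_fit, dg_ref)
print(f"RMSE unit weights: {rmse_unit:.2f}  fitted: {rmse_fit:.2f}")
print(f"Kendall tau unit weights: {tau_unit:.2f}  fitted: {tau_fit:.2f}")
```

Because the unit-weight sum is one of the linear combinations the fit can choose, the fitted RMSE on the training set cannot exceed the unit-weight RMSE. Performance on unseen hosts or guests is a separate question and would need held-out data.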




Source journal

Journal of Computer-Aided Molecular Design (Biology; Computer Science: Interdisciplinary Applications)

CiteScore

8.00

Self-citation rate

8.60%

Articles published

Review time

3 months

About the journal: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers that report new and original research and applications in the following areas: theoretical chemistry; computational chemistry; computer and molecular graphics; molecular modeling; protein engineering; drug design; expert systems; general structure-property relationships; molecular dynamics; chemical database development and usage.