Jia-Xi Chang, Jian-Wei Zou, Chao-Yuan Lou, Jia-Xin Ye, Rui Feng, Zi-Yuan Li, Gui-Xiang Hu
Molecular Informatics, published 2023-06-01. DOI: 10.1002/minf.202200223 (https://doi.org/10.1002/minf.202200223). Impact factor: 2.8; JCR: Q3 (Chemistry, Medicinal).
Citations: 0
Gas-to-ionic liquid partition: QSPR modeling and mechanistic interpretation.
The present work was devoted to exploring the quantitative structure-property relationships for gas-to-ionic liquid partition coefficients (log $K_{ILA}$). A series of linear models was first established for the representative dataset (IL01). The optimal model was a four-parameter equation (1Ed) consisting of two electrostatic potential-based descriptors ($\Sigma V_{s,ind}^{-}$ and $V_{s,max}$), one 2D matrix-based descriptor (J_D/Dt), and the dipole moment (μ). All four descriptors in the model have corresponding parameters, directly or indirectly, in Abraham's linear solvation energy relationship (LSER) or its theoretical alternatives, which endows the model with good interpretability. A Gaussian process was used to build the nonlinear model. Systematic validation, including 5-fold cross-validation for the training set, validation on the test set, and a more rigorous Monte Carlo cross-validation, was performed to verify the reliability of the constructed models. The applicability domain of the model was evaluated, and the Williams plot revealed that the model can be used to predict the log $K_{ILA}$ values of structurally diverse solutes. The other 13 datasets were processed in the same way, and linear models with expressions similar to equation 1Ed were obtained for all of them. These models, whether linear or nonlinear, give satisfactory statistical results, which confirms the universality of the method adopted in this study for QSPR modeling of gas-to-IL partition.
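The validation workflow named in the abstract (a four-descriptor linear model, 5-fold cross-validation, Monte Carlo cross-validation, and a leverage-based applicability domain of the kind shown on a Williams plot) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the four columns merely stand in for the paper's descriptors, the coefficients are invented, and the split sizes and round counts are arbitrary. It does not reproduce equation 1Ed or any reported statistic, and the Gaussian process step is left out.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return coef[0] + X @ coef[1:]

def kfold_q2(X, y, k=5, seed=0):
    """Pooled cross-validated R^2 (Q^2) over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    y_pred = np.empty_like(y)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # all indices not in this fold
        coef = fit_linear(X[train], y[train])
        y_pred[fold] = predict_linear(coef, X[fold])
    press = np.sum((y - y_pred) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def monte_carlo_q2(X, y, n_rounds=200, test_frac=0.3, seed=0):
    """Mean held-out R^2 over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * len(y)))
    scores = []
    for _ in range(n_rounds):
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        pred = predict_linear(fit_linear(X[train], y[train]), X[test])
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[train].mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot)
    return float(np.mean(scores))

def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (A^T A)^-1 x_i, the x-axis of a Williams plot."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    G = np.linalg.inv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, G, Q)

# Synthetic stand-in data: 120 "solutes", 4 placeholder descriptors (hypothetical).
rng = np.random.default_rng(42)
n, p = 120, 4
X = rng.normal(size=(n, p))
true_coef = np.array([0.8, -0.5, 0.3, 0.6])      # invented, not from the paper
y = 1.0 + X @ true_coef + rng.normal(scale=0.1, size=n)

q2_cv = kfold_q2(X, y, k=5)
q2_mc = monte_carlo_q2(X, y)
h = leverages(X, X)
h_star = 3 * (p + 1) / n                         # common warning leverage threshold
outside_domain = np.flatnonzero(h > h_star)

print(f"5-fold Q2: {q2_cv:.3f}, Monte Carlo Q2: {q2_mc:.3f}")
print(f"h* = {h_star:.3f}, solutes above h*: {len(outside_domain)}")
```

In a Williams plot, these leverages are plotted against standardized residuals; points beyond the warning leverage h* = 3(p+1)/n are conventionally treated as lying outside the model's applicability domain.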
About the journal:
Molecular Informatics is a peer-reviewed, international forum for publication of high-quality, interdisciplinary research on all molecular aspects of bio/cheminformatics and computer-assisted molecular design. Molecular Informatics succeeded QSAR & Combinatorial Science in 2010.
Molecular Informatics presents methodological innovations that will lead to a deeper understanding of ligand-receptor interactions, macromolecular complexes, molecular networks, design concepts and processes that demonstrate how ideas and design concepts lead to molecules with a desired structure or function, preferably including experimental validation.
The journal's scope includes but is not limited to the fields of drug discovery and chemical biology, protein and nucleic acid engineering and design, the design of nanomolecular structures, strategies for modeling of macromolecular assemblies, molecular networks and systems, pharmaco- and chemogenomics, computer-assisted screening strategies, as well as novel technologies for the de novo design of biologically active molecules. As a unique feature, Molecular Informatics publishes so-called "Methods Corner" review-type articles, which feature important technological concepts and advances within the scope of the journal.