Machine learning-based activity prediction of phenoxy-imine catalysts and its structure-activity relationship study.

IF 3.9 | Zone 2, Chemistry | JCR Q2, CHEMISTRY, APPLIED
Xiaoke Zhou, Sisi He, Min Xiao, Jing He, Yuan Wang, Yuanqin Zhu, Haixiang He
Journal: Molecular Diversity
Published: 2025-03-07 (Journal Article)
DOI: 10.1007/s11030-025-11147-0 (https://doi.org/10.1007/s11030-025-11147-0)
Citations: 0

Machine learning-based activity prediction of phenoxy-imine catalysts and its structure-activity relationship study.

This study systematically investigates the structure-activity relationships of 30 Ti-phenoxy-imine (FI-Ti) catalysts using machine learning (ML) approaches. Among the tested algorithms, XGBoost demonstrated superior predictive performance, achieving R² values of 0.998 (training set) and 0.859 (test set), with a cross-validated Q² of 0.617. Feature importance analysis identified three composite descriptors (ODI_HOMO_1_Neg_Average GGI2, ALIEmax GATS8d, and Mol_Size_L) as critical contributors, collectively accounting for more than 63% of the model's predictive power. Polynomial feature expansion effectively captured nonlinear interactions between descriptors, while SHAP and ICE analyses enhanced interpretability, revealing threshold effects and descriptor-specific trends. However, the model's generalizability may be constrained by the limited dataset size (30 samples) and reliance on density functional theory (DFT)-derived descriptors, necessitating experimental validation. Additionally, the study focused solely on ethylene polymerization at 40 °C; broader applicability to diverse catalytic systems or reaction conditions requires further validation. These findings provide a data-driven framework for catalyst design, though future work should integrate experimental validation and expand datasets to refine predictive robustness.
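The abstract reports a training-set R² well above the cross-validated Q², which is typical for a 30-sample dataset. The sketch below illustrates how the two statistics differ. Everything in it is an assumption made for illustration: the data are synthetic, a one-variable least-squares line stands in for XGBoost, and leave-one-out is used because the abstract does not state the cross-validation scheme. The helper names `fit_line`, `r_squared`, and `loo_q_squared` are hypothetical and do not come from the paper.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loo_q_squared(xs, ys):
    """Leave-one-out Q^2: each sample is predicted by a model fit without it.

    Assumption: Q^2 = 1 - PRESS / SS_tot, with SS_tot taken about the
    mean of all observations (other conventions exist).
    """
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a * xs[i] + b)
    return r_squared(ys, preds)


# Synthetic stand-in for a 30-sample dataset (NOT the paper's data).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(30)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 3.0) for x in xs]

a, b = fit_line(xs, ys)
r2_fit = r_squared(ys, [a * x + b for x in xs])
q2 = loo_q_squared(xs, ys)
print(f"R^2 (fit) = {r2_fit:.3f}, Q^2 (leave-one-out) = {q2:.3f}")
```

For a least-squares fit, Q² computed this way can never exceed the fitted R², because every held-out prediction is made without seeing the sample it predicts. A flexible model such as a boosted tree ensemble can show a much wider gap, which is why the abstract reports both figures.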

Source journal: Molecular Diversity (Chemistry: Chemistry, Multidisciplinary)
CiteScore: 7.30
Self-citation rate: 7.90%
Articles published: 219
Review time: 2.7 months
About the journal: Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including: combinatorial chemistry and parallel synthesis; small molecule libraries; microwave synthesis; flow synthesis; fluorous synthesis; diversity oriented synthesis (DOS); nanoreactors; click chemistry; multiplex technologies; fragment- and ligand-based design; structure/function/SAR; computational chemistry and molecular design; chemoinformatics; screening techniques and screening interfaces; analytical and purification methods; robotics, automation and miniaturization; targeted libraries; display libraries; peptides and peptoids; proteins; oligonucleotides; carbohydrates; natural diversity; new methods of library formulation and deconvolution; directed evolution, origin of life and recombination; search techniques, landscapes, random chemistry and more.