Xiaoke Zhou, Sisi He, Min Xiao, Jing He, Yuan Wang, Yuanqin Zhu, Haixiang He
Title: Machine learning-based activity prediction of phenoxy-imine catalysts and its structure-activity relationship study
Journal: Molecular Diversity (impact factor 3.9; JCR Q2, Chemistry, Applied)
Publication date: 2025-03-07
Publication type: Journal Article
DOI: 10.1007/s11030-025-11147-0
URL: https://doi.org/10.1007/s11030-025-11147-0
Citations: 0
Abstract
This study systematically investigates the structure-activity relationships of 30 Ti-phenoxy-imine (FI-Ti) catalysts using machine learning (ML) approaches. Among the tested algorithms, XGBoost demonstrated superior predictive performance, achieving R² values of 0.998 (training set) and 0.859 (test set), with a cross-validated Q² of 0.617. Feature importance analysis identified three composite descriptors (ODI_HOMO_1_Neg_Average GGI2, ALIEmax GATS8d, and Mol_Size_L) as critical contributors, collectively accounting for more than 63% of the model's predictive power. Polynomial feature expansion effectively captured nonlinear interactions between descriptors, while SHAP and ICE analyses enhanced interpretability, revealing threshold effects and descriptor-specific trends. However, the model's generalizability may be constrained by the limited dataset size (30 samples) and its reliance on density functional theory (DFT)-derived descriptors, so experimental validation is needed. Additionally, the study focused solely on ethylene polymerization at 40 °C; broader applicability to other catalytic systems or reaction conditions requires further validation. These findings provide a data-driven framework for catalyst design, though future work should integrate experimental validation and expand the dataset to improve predictive robustness.
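The gap between the training R² and the cross-validated Q² quoted above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, a one-variable least-squares line stands in for the XGBoost model, and leave-one-out is used as the cross-validation scheme (the abstract does not say which scheme the authors used). The function names `r_squared`, `fit_line`, and `q_squared_loo` are hypothetical.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: every sample is predicted by a model that never saw it."""
    held_out_preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        held_out_preds.append(intercept + slope * xs[i])
    # Q2 = 1 - PRESS / SS_tot, with SS_tot taken about the overall mean of y.
    return r_squared(ys, held_out_preds)


# Synthetic, made-up data: NOT taken from the paper.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]

intercept, slope = fit_line(xs, ys)
r2_train = r_squared(ys, [intercept + slope * x for x in xs])
q2_loo = q_squared_loo(xs, ys)

print(f"training R2 = {r2_train:.4f}")
print(f"leave-one-out Q2 = {q2_loo:.4f}")
```

Because each held-out prediction comes from a model fitted without that sample, Q² is never higher than the training R² for a least-squares fit. With a flexible model and only 30 samples, a much wider gap such as the one reported (0.998 versus 0.617) is consistent with the authors' own caution about generalizability.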
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.