Personalized machine learning models of terminal olefin hydroformylation for regioselectivity prediction

Impact factor: 11.5, JCR Q1 (Chemistry, Physical)

Chem Catalysis, Pub Date: 2024-08-20, DOI: 10.1016/j.checat.2024.101079

Hao Wang, Yuzhuo Chen, Hang Yu, Menghui Qi, De Xia, Minkai Qin, XuCheng Lv, Bing Lu, Ruiliang Gao, Yong Wang, Shanjun Mao


Citations: 0

Abstract



The integration of machine learning into hydroformylation processes represents a pivotal advancement in high-throughput screening within the chemical industry. This study employs a data-driven approach to develop predictive models for terminal olefin reactions. Using a database of 1,167 entries, we merged reaction embeddings with corresponding labels. The well-trained extreme gradient boosting model achieves a test set coefficient of determination (R²) score of 0.897. However, when applied to specific-olefin tasks, the model shows suboptimal performance. Therefore, tailored models for specific olefins like 1-octene and styrene are developed, achieving improved test set R² scores of 0.850 and 0.789, respectively, compared to the general-olefin task. Interpretability findings highlight the significance of high-temperature, low-pressure, and low-concentration metals in enhancing linear regioselectivity and providing chemical insights. This study underscores the transformative potential of machine learning as a surrogate model in advancing high-throughput screening and optimizing chemical processes in the industry.
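The gap the abstract reports between the general-olefin score and the specific-olefin tasks can be illustrated with a toy calculation. The sketch below is not the authors' code or data: the selectivity values are hypothetical, and `r2_score` and `r2_by_group` are illustrative helpers that apply the standard definition R² = 1 - SS_res / SS_tot. It shows one possible way a pooled R² can be high while the per-olefin R² is poor: the pooled score rewards a model for separating substrates, even if it explains none of the variation within a single substrate.

```python
from collections import defaultdict


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


def r2_by_group(groups, y_true, y_pred):
    """Compute R2 separately for each group label (here: each olefin)."""
    buckets = defaultdict(lambda: ([], []))
    for g, t, p in zip(groups, y_true, y_pred):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: r2_score(t, p) for g, (t, p) in buckets.items()}


# Hypothetical linear-product fractions, invented for illustration only.
groups = ["1-octene"] * 4 + ["styrene"] * 4
y_true = [0.90, 0.92, 0.94, 0.96, 0.20, 0.22, 0.24, 0.26]
# A model that only learns each olefin's average selectivity.
y_pred = [0.93, 0.93, 0.93, 0.93, 0.23, 0.23, 0.23, 0.23]

overall = r2_score(y_true, y_pred)
per_olefin = r2_by_group(groups, y_true, y_pred)

print("pooled R2:", round(overall, 3))
for olefin, score in per_olefin.items():
    print(olefin, "R2:", round(score, 3))
```

In this toy case the pooled score is close to 1 while each per-olefin score is close to 0, which is why evaluating (and possibly training) per substrate can be informative. Whether this mechanism explains the paper's result is an assumption, not something the abstract states.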


Source journal

Chem Catalysis

CiteScore: 10.50

Self-citation rate: 6.40%

Journal description: Chem Catalysis is a monthly journal that publishes innovative research on fundamental and applied catalysis, providing a platform for researchers across chemistry, chemical engineering, and related fields. It serves as a premier resource for scientists and engineers in academia and industry, covering heterogeneous, homogeneous, and biocatalysis. Emphasizing transformative methods and technologies, the journal aims to advance understanding, introduce novel catalysts, and connect fundamental insights to real-world applications for societal benefit.