Developing machine learning for heterogeneous catalysis with experimental and computational data
Carlota Bozal-Ginesta, Sergio Pablo-García, Changhyeok Choi, Albert Tarancón, Alán Aspuru-Guzik
Nature Reviews Chemistry, published 2025-07-18
DOI: 10.1038/s41570-025-00740-4 (https://doi.org/10.1038/s41570-025-00740-4)
Abstract
Machine learning techniques have emerged as a useful tool for identifying complex patterns and correlations in large datasets, such as associating catalyst performance with its physicochemical properties. In the heterogeneous catalysis communities, machine learning models have mostly been developed using high-throughput quantum chemistry calculations, with only a few case studies resulting in experimentally validated catalyst improvements. This limited success may be due to the use of simplified catalyst structures in computational studies and the lack of comprehensive experimental datasets. In this Review, we bring together studies integrating high-throughput approaches and machine learning for the advancement of solid heterogeneous catalysis, leveraging both experimental and computational data. We systematically analyse trends in the field, based on the descriptors used as model input and output; the materials, devices, or reactions investigated; the dataset size; and the overall achievements. Furthermore, for models reporting unitless R² values, we compare their performance across these trends.
About the journal:
Nature Reviews Chemistry is an online-only journal that publishes Reviews, Perspectives, and Comments on various disciplines within chemistry. The Reviews aim to offer balanced and objective analyses of selected topics, providing clear descriptions of relevant scientific literature. The content is designed to be accessible to recent graduates in any chemistry-related discipline while also offering insights for principal investigators and industry-based research scientists. Additionally, Reviews should provide the authors' perspectives on future directions and opinions regarding the major challenges faced by researchers in the field.