Large Language Models for Heterogeneous Catalysis
Yiwen Yao, Jinbo Zhu, Yan Liu, Guanpeng Ren, Xiao-Yan Li, Pengfei Ou
Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 15, no. 5. Published 2025-09-18. DOI: 10.1002/wcms.70046
https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.70046
Heterogeneous catalysis has a wide range of applications in chemical manufacturing and sustainable technologies. It uses solid catalysts to enable efficient chemical transformations. Traditional research on active sites and reaction mechanisms relies heavily on experiments and computational methods, such as density functional theory calculations. However, the volume of scientific literature and data is growing fast, which makes it increasingly difficult to capture, process, and act on emerging insights systematically. Recently, large language models (LLMs) have emerged as powerful tools to support various stages of catalysis research. Their ability to understand and generate natural language helps them extract useful information from vast amounts of text, assist in catalyst design, aid in planning experiments, and clarify complex descriptors. In this advanced review, we first analyze recent progress in applying LLMs to heterogeneous catalysis, focusing on four key areas: literature mining and knowledge extraction, catalyst design and screening, experiment automation and workflow optimization, and the interpretation of high-dimensional descriptors. We then highlight the challenges that remain despite these advances, most notably the need for domain-specific fine-tuning and for better molecular representations. We conclude by discussing future opportunities for integrating LLMs with complementary machine learning approaches and expert-in-the-loop systems, toward accelerating the rational discovery of next-generation catalysts.
About the journal:
Computational molecular sciences harness the power of rigorous chemical and physical theories, employing computer-based modeling, specialized hardware, software development, algorithm design, and database management to explore and illuminate every facet of molecular sciences. These interdisciplinary approaches form a bridge between chemistry, biology, and materials sciences, establishing connections with adjacent application-driven fields in both chemistry and biology. WIREs Computational Molecular Science stands as a platform to comprehensively review and spotlight research from these dynamic and interconnected fields.