Materials-discovery workflow guided by symbolic regression for identifying acid-stable oxides for electrocatalysis

IF 9.4; Zone 1, Materials Science; JCR Q1, Chemistry, Physical
Akhil S. Nair, Lucas Foppa, Matthias Scheffler
npj Computational Materials. Published 2025-05-26. DOI: 10.1038/s41524-025-01596-4 (https://doi.org/10.1038/s41524-025-01596-4)

Abstract

The efficiency of active learning (AL) approaches to identify materials with desired properties relies on the knowledge of a few parameters describing the property. However, these parameters are often unknown when the property is governed by an intricate interplay of many atomistic processes. Here, we develop an AL workflow based on the sure-independence screening and sparsifying operator (SISSO) symbolic regression approach. SISSO identifies analytical expressions correlated with a given materials property. These expressions depend on a few key physical parameters, selected from many offered primary features. Crucially, we train ensembles of SISSO models in order to quantify mean predictions and their uncertainty, enabling the use of SISSO in AL. We combine bootstrap sampling with Monte-Carlo dropout of primary features to obtain different datasets, which are used to train the multiple SISSO models of the ensembles. The ensemble strategy improves the model performance, and the feature dropout procedure alleviates the overconfidence issues observed for the widely used bagging ensemble approach. We demonstrate the SISSO-guided AL workflow by identifying acid-stable oxides for water splitting using high-quality DFT-HSE06 calculations. From a pool of 1470 materials, 12 acid-stable materials are identified in only 30 AL iterations. The materials-property maps provided by SISSO, along with the uncertainty estimates, reduce the risk of missing promising portions of the materials space that were overlooked in the initial, possibly biased dataset.
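
The ensemble idea described in the abstract, bootstrap resampling combined with random dropout of primary features, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: an ordinary least-squares fit stands in for SISSO, and the function names (`fit_member`, `train_ensemble`, `predict_ensemble`, `select_next`), the dropout fraction, and the lower-confidence-bound selection rule are all assumptions made for the example.

```python
import numpy as np


def fit_member(X, y, rng, drop_frac=0.2):
    """Train one ensemble member on a bootstrap resample with some features dropped.

    A least-squares linear model is used here as a stand-in for a SISSO fit.
    """
    n_samples, n_features = X.shape
    # Bootstrap: draw training rows with replacement.
    rows = rng.integers(0, n_samples, size=n_samples)
    # Feature dropout: keep a random subset of the primary features.
    n_keep = max(1, int(round(n_features * (1.0 - drop_frac))))
    cols = np.sort(rng.choice(n_features, size=n_keep, replace=False))
    # Append a column of ones so the model has an intercept.
    A = np.column_stack([X[rows][:, cols], np.ones(n_samples)])
    coef, *_ = np.linalg.lstsq(A, y[rows], rcond=None)
    return cols, coef


def train_ensemble(X, y, n_models=20, drop_frac=0.2, seed=0):
    """Train n_models members, each on its own resampled, feature-reduced dataset."""
    rng = np.random.default_rng(seed)
    return [fit_member(X, y, rng, drop_frac) for _ in range(n_models)]


def predict_ensemble(members, X):
    """Return the ensemble mean and standard deviation for each row of X."""
    preds = np.array([
        np.column_stack([X[:, cols], np.ones(len(X))]) @ coef
        for cols, coef in members
    ])
    return preds.mean(axis=0), preds.std(axis=0)


def select_next(mean, std, kappa=1.0):
    """Pick the candidate with the lowest mean - kappa * std.

    Hypothetical acquisition rule, assuming a lower target value is better.
    """
    return int(np.argmin(mean - kappa * std))


if __name__ == "__main__":
    # Synthetic data only; no values from the paper are used here.
    rng = np.random.default_rng(1)
    X_train = rng.normal(size=(40, 5))
    y_train = 2.0 * X_train[:, 0] - X_train[:, 1] + 0.1 * rng.normal(size=40)
    X_pool = rng.normal(size=(100, 5))

    members = train_ensemble(X_train, y_train, n_models=10)
    mean, std = predict_ensemble(members, X_pool)
    print("next candidate index:", select_next(mean, std))
```

The spread of predictions across members serves as the uncertainty estimate. In an AL loop, the selected candidate would be evaluated with the reference method, added to the training set, and the ensemble retrained.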

Source journal
npj Computational Materials (Mathematics: Modeling and Simulation)
CiteScore: 15.30
Self-citation rate: 5.20%
Articles published: 229
Review time: 6 weeks
About the journal: npj Computational Materials is a high-quality open access journal from Nature Research that publishes research papers applying computational approaches for the design of new materials and enhancing our understanding of existing ones. The journal also welcomes papers on new computational techniques and the refinement of current approaches that support these aims, as well as experimental papers that complement computational findings. Some key features of npj Computational Materials include a 2-year impact factor of 12.241 (2021), article downloads of 1,138,590 (2021), and a fast turnaround time of 11 days from submission to the first editorial decision. The journal is indexed in various databases and services, including Chemical Abstracts Service (CAS), Astrophysics Data System (ADS), Current Contents/Physical, Chemical and Earth Sciences, Journal Citation Reports/Science Edition, SCOPUS, EI Compendex, INSPEC, Google Scholar, SCImago, DOAJ, CNKI, and Science Citation Index Expanded (SCIE), among others.