Benchmarking automated feature engineering in oxidative coupling of methane and the impact of domain knowledge

Impact factor: 4.2; JCR: Q2 (Chemistry, Multidisciplinary)
Jun Maki, Hiromasa Kaneko
DOI: 10.1016/j.rechem.2025.102730
Journal: Results in Chemistry, volume 18, Article 102730
Published: 2025-09-19
Publication type: Journal Article
Source: https://www.sciencedirect.com/science/article/pii/S2211715625007131
Citations: 0

Abstract


In materials informatics, feature engineering is an essential process that improves both model accuracy and model interpretability. This study investigated efficient methods for feature engineering. Four existing libraries that perform feature engineering automatically or semi-automatically (TPOT, autofeat, Feature-engine, and xfeat) were used for feature creation and feature selection. Each library has a distinct focus: some emphasize automation, while others emphasize high-performance feature selection. The target data were for the oxidative coupling of methane (OCM) reaction and were downloaded from the catalytic reaction database. Feature engineering based on domain knowledge was also performed, and its results were compared with those obtained using each library. The libraries showed practical performance, especially in feature selection, whereas the domain knowledge-based method was more effective for feature creation. To construct higher-performance models efficiently, it is therefore effective to combine feature creation using domain knowledge with feature selection using libraries. Adsorption of methane or oxygen on the catalyst surface is an essential factor in the OCM reaction, and the metal elements selected as important features in this study may have contributed to the reaction in this way.
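The combination recommended in the abstract (feature creation from domain knowledge, followed by automated feature selection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the elements A, B, C, their property values, the synthetic target, and the correlation-based filter are hypothetical and are not taken from the paper. The sketch does not use TPOT, autofeat, Feature-engine, or xfeat; a simple Pearson-correlation filter stands in for library-based selection.

```python
import math
import random

# Hypothetical elemental descriptor values (illustrative only, not real data).
TOY_PROPERTY = {"A": 1.0, "B": 2.5, "C": 4.0}


def weighted_property(row):
    """Composition-weighted mean of the toy elemental property."""
    return sum(row[el] * TOY_PROPERTY[el] for el in TOY_PROPERTY)


def make_toy_dataset(n_samples=200, seed=0):
    """Synthetic 'catalysts': mole fractions of A, B, C plus an irrelevant noise column."""
    rng = random.Random(seed)
    rows, target = [], []
    for _ in range(n_samples):
        raw = [rng.random() for _ in TOY_PROPERTY]
        total = sum(raw)
        row = {el: r / total for el, r in zip(TOY_PROPERTY, raw)}
        row["noise"] = rng.random()
        rows.append(row)
        # Assumed toy target: a nonlinear function of the weighted property plus noise.
        target.append(weighted_property(row) ** 2 + rng.gauss(0.0, 0.05))
    return rows, target


def add_domain_feature(rows):
    """Feature creation step: add a descriptor motivated by (assumed) domain knowledge."""
    for row in rows:
        row["weighted_property"] = weighted_property(row)
    return rows


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_features(rows, target, k=2):
    """Feature selection step: keep the k features with the largest |r| to the target."""
    names = list(rows[0])
    scores = {name: abs(pearson([r[name] for r in rows], target)) for name in names}
    ranked = sorted(names, key=scores.get, reverse=True)
    return ranked[:k], scores


rows, target = make_toy_dataset()
rows = add_domain_feature(rows)
selected, scores = select_features(rows, target, k=2)
print(selected)
```

In this toy setting the created descriptor ranks first because the target was constructed from it; with real OCM data the ranking would depend on the chemistry, and the filter could be swapped for any of the libraries named in the abstract.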
Source journal: Results in Chemistry
Subject area: Chemistry (all)
CiteScore: 2.70
Self-citation rate: 8.70%
Articles published: 380
Review time: 56 days