Benchmarking automated feature engineering in oxidative coupling of methane and the impact of domain knowledge
Jun Maki, Hiromasa Kaneko
Results in Chemistry, vol. 18, Article 102730. Published 2025-09-19. DOI: 10.1016/j.rechem.2025.102730
https://www.sciencedirect.com/science/article/pii/S2211715625007131
In materials informatics, feature engineering is an essential step that can improve both model accuracy and model interpretability. This study investigated efficient methods for feature engineering. Four existing libraries that perform feature engineering automatically or semi-automatically (TPOT, autofeat, Feature-engine, and xfeat) were used for feature creation and feature selection. Each library has a different emphasis: some focus on automation, others on high-performance feature selection. The target dataset was oxidative coupling of methane (OCM) reaction data downloaded from the catalytic reaction database. Feature engineering based on domain knowledge was also performed, and its results were compared with those obtained using each library. The libraries showed practical performance, especially in feature selection, whereas the domain knowledge-based method was more effective for feature creation. To construct higher-performing models efficiently, it is therefore effective to combine feature creation based on domain knowledge with feature selection using the libraries. Adsorption of methane or oxygen on the catalyst surface is an essential factor in the OCM reaction, and the metal elements selected as important features in this study may have contributed to the reaction through such adsorption.
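The combination recommended above (domain-knowledge feature creation followed by automated feature selection) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' method: the abstract does not say which domain-knowledge features were built, so a composition-weighted mean electronegativity is used as one plausible descriptor; the catalyst rows and yields are synthetic toy values, not the paper's data; and a plain Pearson-correlation ranking stands in for the selection step that the study delegates to TPOT, autofeat, Feature-engine, or xfeat. The function names `build_features` and `select_features` are hypothetical.

```python
import math

# Pauling electronegativities (standard tabulated values).
ELECTRONEGATIVITY = {
    "Li": 0.98, "Na": 0.93, "Mg": 1.31, "Mn": 1.55, "La": 1.10, "W": 2.36,
}

# Synthetic toy rows (NOT from the paper's dataset):
# metal molar fractions and a made-up C2 yield in percent.
CATALYSTS = [
    ({"Mn": 0.5, "Na": 0.3, "W": 0.2}, 18.0),
    ({"Li": 0.2, "Mg": 0.8}, 12.5),
    ({"La": 0.7, "Li": 0.3}, 10.0),
    ({"Mn": 0.4, "Na": 0.2, "W": 0.4}, 16.0),
    ({"Mg": 0.6, "La": 0.4}, 9.0),
]


def weighted_mean(composition, table):
    """Composition-weighted mean of an elemental property."""
    total = sum(composition.values())
    return sum(frac * table[el] for el, frac in composition.items()) / total


def build_features(composition):
    """Feature creation: raw fractions plus one domain-knowledge descriptor."""
    feats = {f"frac_{el}": composition.get(el, 0.0) for el in ELECTRONEGATIVITY}
    feats["mean_electronegativity"] = weighted_mean(composition, ELECTRONEGATIVITY)
    return feats


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def select_features(rows, target, k):
    """Feature selection stand-in: keep the k features with largest |r|."""
    scores = {}
    for name in rows[0]:
        column = [row[name] for row in rows]
        if len(set(column)) < 2:
            continue  # constant column: correlation is undefined, so skip it
        scores[name] = abs(pearson(column, target))
    return sorted(scores, key=scores.get, reverse=True)[:k]


rows = [build_features(comp) for comp, _ in CATALYSTS]
yields = [y for _, y in CATALYSTS]
selected = select_features(rows, yields, k=3)
print(selected)
```

In a real workflow the `select_features` step would be replaced by one of the benchmarked libraries, and the selected features would be evaluated by cross-validated model performance rather than by a univariate correlation with the target.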