Structure-based machine learning with standardized feature selection for screening MOFs with high ammonia capture capacity
Haokun Ti, Letian Yang, Weihao Yan, Yongdong Chen, Fu Yang, Yujing Yue, Yixue Sun, Xia Li, Yechao Niu, Shi Li
Process Safety and Environmental Protection, vol. 203, Article 107955. Published 2025-09-30. DOI: 10.1016/j.psep.2025.107955
https://www.sciencedirect.com/science/article/pii/S0957582025012224
Citations: 0
Abstract
It is essential that ammonia, a corrosive and hazardous gas widely used as an industrial feedstock and emerging hydrogen carrier, be effectively captured to safeguard process safety and reduce air pollution in industrial operations. Metal–organic frameworks (MOFs), with tunable porosity and surface chemistries, are promising candidates for NH3 capture. To accelerate the identification of high-performance MOFs, we develop a standardized machine learning framework integrating diverse structural descriptors with a multi-step feature selection strategy. A 198-dimensional feature space is constructed by combining conventional geometrical descriptors with multiple categories of RDKit-derived features. Feature selection involves four steps: variance thresholding, LightGBM-based importance ranking, Pearson correlation filtering, and forward feature selection, yielding a compact and informative subset that is used to construct a machine learning model. To enhance model interpretability, a multilevel interpretability analysis is further conducted to quantify the contributions of selected features and reveal their structure-performance relationships. Beyond model construction, an integrated assessment for engineering application is performed, including transferability validation across databases, identification of design windows, volumetric capacity ranking, breakthrough time estimation, and preliminary considerations of stability and regeneration. This study offers an efficient approach to accelerate the identification of high-performance MOFs with reduced experimental burden, and its engineering-oriented assessments consolidate the framework’s applicability to process-level practice.
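The four-step feature selection named in the abstract (variance thresholding, importance ranking, Pearson correlation filtering, forward selection) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation. The feature names, thresholds, and toy data are hypothetical, the importance scores are hand-set placeholders standing in for LightGBM feature importances, and the forward-selection score uses a leave-one-out 1-nearest-neighbour regressor as a stand-in for whatever model the paper actually trains.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(sxx * syy)


def variance_threshold(columns, threshold=0.0):
    """Step 1: keep features whose population variance exceeds the threshold."""
    return [name for name, values in columns.items() if pvariance(values) > threshold]


def rank_by_importance(names, importance):
    """Step 2: order features from most to least important."""
    return sorted(names, key=lambda name: importance[name], reverse=True)


def correlation_filter(columns, ranked_names, max_abs_r=0.9):
    """Step 3: walk the ranking and drop any feature that is highly
    correlated with a more important feature that was already kept."""
    kept = []
    for name in ranked_names:
        if all(abs(pearson(columns[name], columns[k])) <= max_abs_r for k in kept):
            kept.append(name)
    return kept


def loo_1nn_score(columns, names, target):
    """Negative leave-one-out mean absolute error of a 1-nearest-neighbour
    regressor (a toy stand-in for a real model's validation score)."""
    rows = list(zip(*(columns[n] for n in names)))
    errors = []
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(others, key=lambda j: sum((a - b) ** 2 for a, b in zip(row, rows[j])))
        errors.append(abs(target[i] - target[nearest]))
    return -mean(errors)


def forward_selection(columns, candidates, target):
    """Step 4: greedily add the feature that most improves the score,
    stopping when no candidate gives a strict improvement."""
    selected, best_score = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        scored = [(loo_1nn_score(columns, selected + [n], target), n) for n in remaining]
        score, name = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break
        selected.append(name)
        remaining.remove(name)
        best_score = score
    return selected


# Hypothetical toy data: four descriptors for five structures.
columns = {
    "pore_diameter": [4.0, 6.0, 8.0, 10.0, 12.0],
    "pore_diameter_nm": [0.4, 0.6, 0.8, 1.0, 1.2],  # same quantity, other unit
    "void_fraction": [0.5, 0.3, 0.6, 0.2, 0.7],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0],      # carries no information
}
# Placeholder scores; in the paper the ranking comes from LightGBM.
importance = {"pore_diameter": 0.5, "pore_diameter_nm": 0.3,
              "void_fraction": 0.2, "constant_flag": 0.0}
target = [2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical uptake values

step1 = variance_threshold(columns)
step2 = rank_by_importance(step1, importance)
step3 = correlation_filter(columns, step2)
step4 = forward_selection(columns, step3, target)
print(step1, step3, step4)
```

In this toy run the constant column is removed in step 1, the unit-converted duplicate is removed in step 3 because it is perfectly correlated with a higher-ranked feature, and step 4 keeps only the features that strictly improve the stand-in score. Running the correlation filter in importance order is one reasonable design choice, since it guarantees that the more informative member of each correlated pair is the one retained.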
About the journal:
Process Safety and Environmental Protection (PSEP) is a leading international journal that publishes high-quality, original engineering research on the safety of industrial processes and environmental protection. The journal encourages submissions that present new developments in safety and environmental aspects, particularly those that show how research findings can be applied in process engineering design and practice.
PSEP is particularly interested in research that brings fresh perspectives to established engineering principles, identifies unsolved problems, or suggests directions for future research. The journal also values contributions that push the boundaries of traditional engineering and welcomes multidisciplinary papers.
PSEP's articles are abstracted and indexed by a range of databases and services, which helps to ensure that the journal's research is accessible and recognized in the academic and professional communities. These databases include ANTE, Chemical Abstracts, Chemical Hazards in Industry, Current Contents, Elsevier Engineering Information database, Pascal Francis, Web of Science, Scopus, Engineering Information Database EnCompass LIT (Elsevier), and INSPEC. This wide coverage facilitates the dissemination of the journal's content to a global audience interested in process safety and environmental engineering.