Shuofan Li, Zhongyu Zhang, Zukui Li, Guangqing Cai, Linzhou Zhang, Quan Shi
Chemical Engineering Science. Published 2025-09-26. DOI: 10.1016/j.ces.2025.122655 (https://doi.org/10.1016/j.ces.2025.122655)
Molecular composition reconstruction of naphtha fractions through data-driven modeling and interpretable optimization
Molecular composition reconstruction models for petroleum fractions are important for process upgrading in modern refineries. In the traditional data-driven reconstruction framework, two challenges remain prominent: the mapping from low-dimensional bulk properties to high-dimensional molecular compositions is underdetermined, and interpretability is limited. In this study, a novel strategy is proposed for molecular composition reconstruction of naphtha fractions by integrating data-driven modeling with interpretable optimization. A feedforward neural network (FNN) is pre-trained to map molecular composition to bulk properties. For each sample to be reconstructed, the parameterized FNN is then embedded into an inverse optimization problem to reconstruct the molecular composition. Furthermore, a sample-specific reference composition is constructed from the database to guide the reconstruction, along with carbon number distribution and structural distribution profiles, ensuring chemical plausibility and similarity to realistic petroleum compositions. The reconstructed compositions demonstrate high accuracy. Model interpretability analysis using SHapley Additive exPlanations (SHAP) demonstrates that selecting key bulk properties facilitates the establishment of a robust and unbiased mapping between composition and properties. Furthermore, a broader set of bulk properties is predicted by a supplementary extreme gradient boosting model that takes a limited set of key properties and the reconstructed composition as inputs.
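The inverse step described in the abstract (a fixed forward model embedded in an optimization that is guided by a reference composition) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the forward network uses random weights in place of a trained FNN, the composition is kept on the simplex with a softmax parameterization, the reference guidance is a squared-distance penalty weighted by a hypothetical `lam`, and the solver is finite-difference gradient descent with a simple accept/reject step rule. The carbon number and structural distribution profiles mentioned in the abstract are omitted.

```python
import math
import random

def forward(x, params):
    """Toy surrogate for the pre-trained FNN: composition -> bulk properties."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]

def softmax(z):
    """Map unconstrained variables to fractions that are non-negative and sum to 1."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def objective(z, target, x_ref, params, lam):
    """Property mismatch plus a penalty on distance from the reference composition."""
    x = softmax(z)
    y = forward(x, params)
    fit = sum((a - b) ** 2 for a, b in zip(y, target))
    prior = sum((a - b) ** 2 for a, b in zip(x, x_ref))
    return fit + lam * prior

def reconstruct(target, x_ref, params, lam=0.1, steps=300, eps=1e-6):
    """Minimize the objective over z, starting from the reference composition."""
    z = [math.log(v) for v in x_ref]
    lr = 1.0
    best = objective(z, target, x_ref, params, lam)
    for _ in range(steps):
        # Forward-difference gradient; the network weights stay fixed throughout.
        grad = []
        for i in range(len(z)):
            zp = list(z)
            zp[i] += eps
            grad.append((objective(zp, target, x_ref, params, lam) - best) / eps)
        trial = [zi - lr * g for zi, g in zip(z, grad)]
        value = objective(trial, target, x_ref, params, lam)
        if value < best:   # accept only improving steps
            z, best = trial, value
            lr *= 1.2
        else:              # otherwise shrink the step and retry
            lr *= 0.5
    return softmax(z), best

# Hypothetical sizes: 6 pseudo-molecules, 4 hidden units, 2 bulk properties.
rng = random.Random(0)
n_mol, n_hidden, n_prop = 6, 4, 2
params = (
    [[rng.uniform(-1, 1) for _ in range(n_mol)] for _ in range(n_hidden)],
    [rng.uniform(-1, 1) for _ in range(n_hidden)],
    [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_prop)],
    [rng.uniform(-1, 1) for _ in range(n_prop)],
)
x_true = softmax([rng.uniform(-1, 1) for _ in range(n_mol)])
target = forward(x_true, params)       # "measured" bulk properties
x_ref = [1.0 / n_mol] * n_mol          # stand-in for a database-derived reference
start = objective([math.log(v) for v in x_ref], target, x_ref, params, 0.1)
x_hat, final = reconstruct(target, x_ref, params)
print("reconstructed composition:", [round(v, 4) for v in x_hat])
print("objective before/after:", start, final)
```

Because two bulk properties cannot pin down six fractions, many compositions reproduce the target equally well. In this sketch the reference penalty is what selects one of them, which mirrors the role the abstract assigns to the sample-specific reference composition.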
About the journal:
Chemical engineering enables the transformation of natural resources and energy into useful products for society. It draws on and applies natural sciences, mathematics and economics, and has developed fundamental engineering science that underpins the discipline.
Chemical Engineering Science (CES) has been publishing papers on the fundamentals of chemical engineering since 1951. Since then, CES has been the platform where the most significant advances in the discipline have been published. Chemical Engineering Science has accompanied and sustained chemical engineering through its development into the vibrant and broad scientific discipline it is today.