Data-driven initialization of evolutionary methods for process synthesis considering centrality and diversity criteria
Authors: Jean-Marc Commenge, Andres Piña-Martinez
Journal: Computers & Chemical Engineering, volume 204, Article 109416
Published: 2025-09-23
DOI: 10.1016/j.compchemeng.2025.109416
URL: https://www.sciencedirect.com/science/article/pii/S0098135425004193
Process synthesis with evolutionary methods, which iteratively apply mutation operators, requires initializing the method with one or more process flowsheets. Appropriate initialization may reduce computation time by providing starting proposals that need fewer mutations to reach optimal structures, in terms of both units and connectivity. This work shows how to identify, within a given database of flowsheets, those that may play a pivotal role in the subsequent evolutionary synthesis. An in-house database of over 2000 flowsheets, digitized from 800 recent scientific publications, is used; it covers a wide variety of structures, from single distillation columns to biorefinery layouts. The flowsheets selected for initialization should ensure diversity in structures and units while minimizing the number of mutations needed to evolve into any other process flowsheet. A distance function is defined as the minimum number of mutations required to transform one flowsheet into another. It is computed for all pairs of flowsheets in the database, which makes it possible to compare their topologies and analyze the population quantitatively. Four sampling strategies are compared: centrality-based selection, sampling from groups of similar structures, random sampling, and k-medoids clustering. For each strategy, the distribution of distances from the selected structures to the database population and the diversity of the selected structures are compared. Centrality-based selection minimizes the required number of mutations but yields poor unit diversity. Selection from distinct groups of similar structures improves performance only for distant flowsheets. Random sampling ensures diversity but performs poorly in reducing the required mutations. In contrast, k-medoids sampling performs well on both the number of required mutations and the diversity of the selected flowsheets, making it a balanced method for flowsheet sampling. The initialization strategies are applied to a benzene chlorination case study, and their fitness and diversity are monitored over the generations of the evolutionary synthesis.
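To make two of the sampling strategies named in the abstract more concrete, the following Python sketch selects starting flowsheets from a precomputed pairwise distance matrix. It is only an illustration under stated assumptions, not the authors' implementation. The matrix `TOY` is invented, and the function names, the "smallest total distance" reading of centrality, the deterministic farthest-first start, and the alternating k-medoids update are all hypothetical choices. The mutation operators needed to compute real flowsheet distances are not given in the abstract and are not modelled here.

```python
# Hypothetical toy data: TOY[i][j] stands in for the minimum number of
# mutations between flowsheets i and j. Items 0-3 form one group of
# similar structures and items 4-5 form another.
TOY = [
    [0, 1, 1, 1, 5, 5],
    [1, 0, 2, 2, 5, 5],
    [1, 2, 0, 2, 5, 5],
    [1, 2, 2, 0, 5, 5],
    [5, 5, 5, 5, 0, 1],
    [5, 5, 5, 5, 1, 0],
]


def total_distance(dist, i, members):
    """Sum of distances from item i to every item in members."""
    return sum(dist[i][j] for j in members)


def select_central(dist, k):
    """Return the k items with the smallest total distance to the population.

    Assumption: 'centrality' is read here as minimal total distance.
    """
    everyone = range(len(dist))
    ranked = sorted(everyone, key=lambda i: (total_distance(dist, i, everyone), i))
    return sorted(ranked[:k])


def k_medoids(dist, k, max_iter=100):
    """Alternating k-medoids on a precomputed distance matrix (assumes k <= n)."""
    n = len(dist)
    # Deterministic start: the most central item, then repeatedly the item
    # farthest from the medoids chosen so far (ties go to the lower index).
    medoids = select_central(dist, 1)
    while len(medoids) < k:
        rest = [i for i in range(n) if i not in medoids]
        medoids.append(max(rest, key=lambda i: (min(dist[i][m] for m in medoids), -i)))
    medoids.sort()
    for _ in range(max_iter):
        # Assignment step: each item joins its nearest medoid; a medoid
        # always stays in its own cluster, so no cluster is ever empty.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            home = i if i in clusters else min(medoids, key=lambda m: (dist[i][m], m))
            clusters[home].append(i)
        # Update step: the new medoid of a cluster is the member with the
        # smallest total distance to the other members.
        updated = sorted(
            min(members, key=lambda c: (total_distance(dist, c, members), c))
            for members in clusters.values()
        )
        if updated == medoids:
            break
        medoids = updated
    return medoids


def mean_distance_to_population(dist, selected):
    """Average distance from the selected items to every item in the database."""
    n = len(dist)
    return sum(dist[s][j] for s in selected for j in range(n)) / (len(selected) * n)


central = select_central(TOY, 2)
medoid_pick = k_medoids(TOY, 2)
print("centrality:", central, round(mean_distance_to_population(TOY, central), 3))
print("k-medoids: ", medoid_pick, round(mean_distance_to_population(TOY, medoid_pick), 3))
```

On this invented matrix, the centrality-style selection takes both items from the larger group, while the k-medoids sketch takes one item from each group at the cost of a higher average distance to the population. This only illustrates the kind of trade-off the abstract describes; it says nothing about the results on the real database.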
About the journal:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.