Yifan Wang, Lorenz Fleitmann, Lukas Raßpe-Lange, Niklas von der Assen, André Bardow, Kai Leonhard
{"title":"Fine-Tuning a Genetic Algorithm for CAMD: A Screening-Guided Warm Start.","authors":"Yifan Wang, Lorenz Fleitmann, Lukas Raßpe-Lange, Niklas von der Assen, André Bardow, Kai Leonhard","doi":"10.1021/acs.jcim.4c02038","DOIUrl":null,"url":null,"abstract":"<p><p>More sustainable chemical processes require the selection of suitable molecules, which can be supported by computer-aided molecular design (CAMD). CAMD often generates and evaluates molecular structures using genetic algorithms. However, genetic algorithms can suffer from slow convergence, and might yield suboptimal solutions. In response to these challenges, this work presents a method to fine-tune a genetic algorithm for CAMD. The proposed method builds on the COSMO-CAMD framework that utilizes a genetic algorithm for solving optimization-based molecular design problems and COSMO-RS for predicting physical properties of molecules. The key idea of the proposed method is to integrate results from a fast large-scale molecular screening into the molecular design framework through an automated fragmentation procedure. By generating a promising initial population and constructing a tailored fragment library, our method enables a targeted initialization of the genetic algorithm, referred to as warm-start. The proposed method is applied in two case studies to design solvents for extracting γ-valerolactone and phenol, respectively, from aqueous solutions. Compared to the benchmark method, the warm-started COSMO-CAMD framework achieves a 70% faster convergence, discovers 4-fold more top-performing candidate molecules, and identifies seven tailored molecular fragments, culminating in the discovery of two novel solvents specifically for the phenol case. The optimal solvent is found in all computational runs. Overall, the warm-started COSMO-CAMD framework significantly improves efficiency, effectiveness, and robustness of molecular design.</p>","PeriodicalId":44,"journal":{"name":"Journal of Chemical Information and Modeling ","volume":" ","pages":""},"PeriodicalIF":5.6000,"publicationDate":"2025-02-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chemical Information and Modeling ","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1021/acs.jcim.4c02038","RegionNum":2,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"CHEMISTRY, MEDICINAL","Score":null,"Total":0}
引用次数: 0
Abstract
More sustainable chemical processes require the selection of suitable molecules, which can be supported by computer-aided molecular design (CAMD). CAMD often generates and evaluates molecular structures using genetic algorithms. However, genetic algorithms can suffer from slow convergence, and might yield suboptimal solutions. In response to these challenges, this work presents a method to fine-tune a genetic algorithm for CAMD. The proposed method builds on the COSMO-CAMD framework that utilizes a genetic algorithm for solving optimization-based molecular design problems and COSMO-RS for predicting physical properties of molecules. The key idea of the proposed method is to integrate results from a fast large-scale molecular screening into the molecular design framework through an automated fragmentation procedure. By generating a promising initial population and constructing a tailored fragment library, our method enables a targeted initialization of the genetic algorithm, referred to as warm-start. The proposed method is applied in two case studies to design solvents for extracting γ-valerolactone and phenol, respectively, from aqueous solutions. Compared to the benchmark method, the warm-started COSMO-CAMD framework achieves a 70% faster convergence, discovers 4-fold more top-performing candidate molecules, and identifies seven tailored molecular fragments, culminating in the discovery of two novel solvents specifically for the phenol case. The optimal solvent is found in all computational runs. Overall, the warm-started COSMO-CAMD framework significantly improves efficiency, effectiveness, and robustness of molecular design.
期刊介绍:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.