Predicting the temperature-dependent CMC of surfactant mixtures with graph neural networks
Christoforos Brozos, Jan G. Rittig, Elie Akanny, Sandip Bhattacharya, Christina Kohlmann, Alexander Mitsos
Computers & Chemical Engineering, Volume 198, Article 109085. Published 2025-03-10. DOI: 10.1016/j.compchemeng.2025.109085
https://www.sciencedirect.com/science/article/pii/S0098135425000894
Abstract
Surfactants are key ingredients in industries such as personal and home care, and their critical micelle concentration (CMC) is a property of major interest. Predictive models for the CMC of pure surfactants have been developed with recent machine learning (ML) methods; in practice, however, surfactant mixtures are typically used for performance, environmental, and cost reasons. Herein, we develop a graph neural network (GNN) framework that predicts the temperature-dependent CMC of surfactant mixtures. We collect data for 108 binary surfactant mixtures, to which we add data for pure species from our previous work (Brozos et al., 2024). We then develop and train GNNs and evaluate their accuracy across different test scenarios for binary mixtures that are relevant to practical applications. The final GNN models demonstrate very high predictive performance when interpolating between different mixture compositions and when predicting new binary mixtures of known species. Extrapolation to binary surfactant mixtures in which one or both surfactant species have not been seen before yields accurate results for the majority of surfactant systems. We further find that the GNN is more accurate than a semi-empirical model based on activity coefficients, which has been widely used to date. We then explore whether GNN models trained solely on binary mixture and pure species data can also accurately predict the CMCs of ternary mixtures. Finally, we experimentally measure the CMC of four commercial surfactants that contain up to four species and of industrially relevant mixtures, and find very good agreement between measured and predicted CMC values.
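The test scenarios named in the abstract (interpolation between compositions, new mixtures of known species, and mixtures with one or both species unseen) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the scenario labels, and the representation of a binary mixture as a `frozenset` of two species identifiers are assumptions made purely for illustration, and the species labels "A" to "E" are placeholders rather than real surfactants.

```python
from typing import FrozenSet, Iterable, Set


def seen_species(train_mixtures: Iterable[FrozenSet[str]],
                 pure_species: Iterable[str]) -> Set[str]:
    """Collect every species present in the training data (pure or mixed)."""
    species = set(pure_species)
    for mixture in train_mixtures:
        species |= mixture
    return species


def classify_scenario(test_mixture: FrozenSet[str],
                      train_mixtures: Iterable[FrozenSet[str]],
                      pure_species: Iterable[str]) -> str:
    """Assign a binary test mixture to a (hypothetical) scenario label."""
    train_set = set(train_mixtures)
    # Same pair of species in training: only the composition is new.
    if test_mixture in train_set:
        return "composition interpolation"
    known = seen_species(train_set, pure_species)
    # Count how many species of the test mixture never appear in training.
    n_unseen = len(test_mixture - known)
    if n_unseen == 0:
        return "new mixture, known species"
    if n_unseen == 1:
        return "one unseen species"
    return "both species unseen"


# Placeholder training data: two binary mixtures and three pure species.
train = [frozenset({"A", "B"}), frozenset({"B", "C"})]
pure = ["A", "B", "C"]

for pair in [{"A", "B"}, {"A", "C"}, {"A", "D"}, {"D", "E"}]:
    print(sorted(pair), "->", classify_scenario(frozenset(pair), train, pure))
```

Labeling each test mixture in this way makes it possible to report accuracy separately for each scenario, which is how the abstract contrasts interpolation with extrapolation.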
About the journal:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.