Machine Learning Prediction of Critical Micellar Concentration Using Electrostatic and Structural Properties as Descriptors
Authors: Gabriel D. Barbosa* and Alberto Striolo
Journal of Chemical & Engineering Data, vol. 70, no. 10, pp. 4019–4030. Published 2025-07-29.
DOI: 10.1021/acs.jced.5c00388 (https://pubs.acs.org/doi/10.1021/acs.jced.5c00388)
Understanding and predicting surfactants’ critical micelle concentration (CMC) remains a key challenge for the rational design of functional amphiphiles. In this work, we develop a deep learning framework to predict CMCs using quantum chemically derived descriptors, focusing on electrostatic surface potential (ESP) and structural features. We employ a comprehensive temperature-dependent data set comprising over 1300 CMC values across diverse surfactant classes. Fourteen molecular descriptors are extracted via density functional theory (DFT) calculations and used as input, alongside temperature. A fully connected neural network trained on these features yields accurate predictions, achieving performance comparable to previously published graph-based models. To support model interpretability, we explicitly assessed ESP distributions for representative surfactants. SHapley Additive exPlanations (SHAP) and partial dependence analyses reveal that molecular volume, ESP variance, and solvation free energy are the dominant predictors, aligning with established thermodynamic theories. These results demonstrate that DFT-derived electrostatic and geometric descriptors can enable robust and interpretable CMC prediction, offering a physically grounded alternative to black-box models. The methodology and insights presented here may also inform the design of nanostructured soft materials, including surfactant-assisted platforms for hydrogen storage.
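The SHAP analysis mentioned in the abstract attributes a model's prediction to its individual input features. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values by enumerating feature coalitions. Everything in it is an assumption made for illustration: the function `toy_log_cmc`, its coefficients, the three feature names, and the zero baseline are hypothetical and are not the authors' network, descriptors, or data.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_log_cmc(features):
    """Hypothetical stand-in model, NOT the paper's neural network."""
    volume, esp_variance, solvation_energy = features
    return (-0.8 * volume
            + 0.3 * esp_variance
            + 0.5 * solvation_energy
            + 0.1 * volume * esp_variance)


# Made-up, standardized feature values for one imaginary surfactant
x = [1.5, 0.4, -0.7]
baseline = [0.0, 0.0, 0.0]

phi = shapley_values(toy_log_cmc, x, baseline)
for name, value in zip(["volume", "esp_variance", "solvation_energy"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at `baseline`, and the interaction term is split equally between the two features involved. Practical SHAP implementations approximate this enumeration, since the exact calculation becomes infeasible for the fifteen inputs (fourteen descriptors plus temperature) described in the abstract.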
About the journal:
The Journal of Chemical & Engineering Data is a monthly journal devoted to the publication of data obtained from both experiment and computation, which are viewed as complementary. It is the only American Chemical Society journal primarily concerned with articles containing data on the phase behavior and the physical, thermodynamic, and transport properties of well-defined materials, including complex mixtures of known compositions. While environmental and biological samples are of interest, their compositions must be known and reproducible. As a result, adsorption on natural product materials does not generally fit within the scope of Journal of Chemical & Engineering Data.