A Machine Learning Approach for Predicting the Pure-Component Surface Tension of Atmospherically Relevant Organic Compounds

ACS ES&T Air. Pub Date: 2025-04-08; eCollection Date: 2025-05-09; DOI: 10.1021/acsestair.4c00291
Ryan Schmedding, Mees Franssen, Andreas Zuend
ACS ES&T Air 2(5), 808-823. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12071373/pdf/
Citations: 0

Abstract

A Machine Learning Approach for Predicting the Pure-Component Surface Tension of Atmospherically Relevant Organic Compounds.

Atmospheric aerosols are complex mixtures of highly functionalized organic compounds, water, inorganic electrolytes, metals, and carbonaceous species. The surface properties of atmospheric aerosol particles can influence several of their chemical and physical impacts, including their hygroscopic growth, aerosol-cloud interactions, and heterogeneous chemical reactions. The effects of the various compounds within a particle on its surface tension depend in part on the pure-component surface tensions. For many of the myriad organic compounds of interest, experimental pure-component surface tension data at tropospheric temperatures are lacking, thus requiring the development and application of property estimation methods. In this work, a compiled database of experimental pure-component surface tension data, covering a wide range of organic compound classes and temperatures, is used to train four different types of machine learning models to predict the temperature-dependent pure-component surface tensions of atmospherically relevant organic compounds. The trained models process input information about the temperature and the molecular structure of an organic compound, initially in the form of a Simplified Molecular Input Line Entry System (SMILES) string, to enable predictions. Our quantitative model assessment shows that extreme gradient-boosted descent along with Molecular ACCess System (MACCS) key descriptors of molecular structure provided the best balance of derived input complexity and model performance, resulting in a root-mean-square error (RMSE) of ∼1 mJ m⁻² in pure-component surface tension. Additionally, a simplified model based on molar mass, elemental ratios, and temperature as inputs was developed for use in applications for which molecular structure information is incomplete (RMSE of ∼2 mJ m⁻²). We demonstrate that including predicted pure-component surface tension values in thermodynamically rigorous bulk-surface partitioning calculations may substantially modify the critical supersaturations necessary for aerosol activation into cloud droplets.
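As a rough illustration of the inputs the simplified model is described as using (molar mass, elemental ratios, and temperature), the sketch below derives such a feature set from a molecular formula. It is not the authors' code: the abstract does not say which elemental ratios are used, so O:C and H:C are assumptions here, and the names `parse_formula` and `simplified_features` are hypothetical. No trained model is included; the output would only be the input to one.

```python
import re

# Standard atomic masses in g/mol (rounded). Only elements common in
# atmospheric organics are covered in this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007, "S": 32.06}


def parse_formula(formula):
    """Count atoms in a simple molecular formula such as 'C9H14O4'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unsupported element: {symbol}")
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def simplified_features(formula, temperature_k):
    """Build an assumed feature set: molar mass, O:C, H:C, temperature."""
    counts = parse_formula(formula)
    n_carbon = counts.get("C", 0)
    if n_carbon == 0:
        raise ValueError("formula must contain carbon")
    molar_mass = sum(ATOMIC_MASS[el] * n for el, n in counts.items())
    return {
        "molar_mass_g_mol": molar_mass,
        "O_to_C": counts.get("O", 0) / n_carbon,  # assumed ratio choice
        "H_to_C": counts.get("H", 0) / n_carbon,  # assumed ratio choice
        "temperature_K": temperature_k,
    }


# Example: a C9H14O4 compound at 298.15 K.
features = simplified_features("C9H14O4", 298.15)
print(features)
```

A feature set of this kind needs only bulk composition, which is why such a model could be applied when the full molecular structure (and hence a SMILES string) is unavailable.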
