Prediction of the density of aqueous electrolyte solutions with matrix completion methods

Impact factor 2.8; Zone 3 (Engineering and Technology); JCR Q3 (Chemistry, Physical)
Maximilian Kohns, Pascal Zittlau, Fabian Jirasek
Fluid Phase Equilibria, volume 597, Article 114454. Published 2025-05-16.
DOI: 10.1016/j.fluid.2025.114454
https://www.sciencedirect.com/science/article/pii/S0378381225001244

Abstract

Information on the density of electrolyte solutions is important for many processes in chemistry and chemical engineering. However, experimental data are scarce, and broadly applicable prediction methods that can extrapolate to unstudied electrolytes have been unavailable until now. In the present work, we introduce a novel approach for predicting the densities of aqueous solutions of 720 single electrolytes at 298.15 K based on the machine-learning concept of matrix completion. The studied electrolytes belong to the valency classes 1:1, 2:1, 1:2, 3:1, 2:2, and 3:2; individual ion concentrations up to 0.1 mol/mol are considered. We arrange the available density data for these electrolytes composed of 40 cations and 18 anions in a matrix, where the columns and rows denote the cations and anions, respectively. In the literature, experimental data are available for only 181 of all 720 electrolytes. This makes the prediction for the other electrolytes a matrix completion problem, which we address using probabilistic matrix factorization. To account for the concentration dependence of the density, a dimensionality reduction is carried out by representing the density as a linear function of the mole fraction-based ionic strength, a correlation found to be very accurate for all considered electrolytes. As a result, a sparse matrix containing the scalar slope of that linear function is obtained. Two matrix completion methods (MCMs) are introduced: a purely data-driven one trained only on the available density data and a hierarchical model that includes the ions’ valencies as side information. The performance of both models is evaluated on unseen test data, with the hierarchical MCM providing very accurate predictions: when averaging the relative deviations for all density data points for a certain electrolyte, an average deviation of 0.96% is obtained. Moreover, we show that the MCM parameters learned during training are physically interpretable, as their values align with descriptors such as an ion’s charge density.
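
The workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the function names are hypothetical, the intercept of the linear fit is taken to be the density of pure water at 298.15 K (997.05 kg/m^3), the ionic strength uses the usual definition 0.5 * sum(x_i * z_i^2), the relative deviation is taken as an absolute value, and a MAP estimate found by alternating ridge regression stands in for full probabilistic inference. Only the purely data-driven variant is sketched (no valency side information), and the slope data are synthetic.

```python
import numpy as np

RHO_WATER = 997.05  # kg/m^3 at 298.15 K; assumed intercept of the linear fit


def ionic_strength_x(x_ions, z_ions):
    """Mole-fraction-based ionic strength, 0.5 * sum_i x_i * z_i**2."""
    x = np.asarray(x_ions, dtype=float)
    z = np.asarray(z_ions, dtype=float)
    return 0.5 * float(np.sum(x * z**2))


def fit_slope(i_x, density, rho0=RHO_WATER):
    """Least-squares slope of density = rho0 + slope * I_x, intercept fixed."""
    i_x = np.asarray(i_x, dtype=float)
    excess = np.asarray(density, dtype=float) - rho0
    return float(i_x @ excess / (i_x @ i_x))


def complete_matrix(slopes, observed, rank=2, reg=0.1, n_iter=200, seed=0):
    """Fill a sparse slope matrix (anions x cations) with a low-rank model.

    Alternating ridge regression converges to a (local) MAP estimate of a
    matrix factorization with Gaussian priors on the ion features. Entries
    of `slopes` where `observed` is False are ignored. An ion without any
    observed salt keeps zero features, i.e. the prior mean.
    """
    rng = np.random.default_rng(seed)
    n_anions, n_cations = slopes.shape
    U = rng.normal(scale=0.1, size=(n_anions, rank))   # anion features
    V = rng.normal(scale=0.1, size=(n_cations, rank))  # cation features
    ridge = reg * np.eye(rank)
    for _ in range(n_iter):
        for a in range(n_anions):  # refit each anion on its observed salts
            Vo = V[observed[a]]
            U[a] = np.linalg.solve(Vo.T @ Vo + ridge,
                                   Vo.T @ slopes[a, observed[a]])
        for c in range(n_cations):  # refit each cation likewise
            Uo = U[observed[:, c]]
            V[c] = np.linalg.solve(Uo.T @ Uo + ridge,
                                   Uo.T @ slopes[observed[:, c], c])
    return U @ V.T


def predict_density(slope, i_x, rho0=RHO_WATER):
    """Density from a (measured or completed) slope and an ionic strength."""
    return rho0 + slope * np.asarray(i_x, dtype=float)


def mean_relative_deviation(rho_pred, rho_exp):
    """Mean of |pred - exp| / exp over the data points of one electrolyte."""
    p = np.asarray(rho_pred, dtype=float)
    e = np.asarray(rho_exp, dtype=float)
    return float(np.mean(np.abs(p - e) / e))


# Synthetic demo: 18 anions x 40 cations, roughly a quarter of entries observed.
rng = np.random.default_rng(1)
true_slopes = rng.normal(size=(18, 2)) @ rng.normal(size=(40, 2)).T
observed = rng.random((18, 40)) < 0.25
filled = complete_matrix(np.where(observed, true_slopes, 0.0), observed)
```

The dimensionality reduction is what makes a plain matrix sufficient here: each electrolyte contributes one scalar slope instead of a full density curve, so the completed matrix `filled` can be passed to `predict_density` to recover densities at any ionic strength within the fitted range.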
Source journal: Fluid Phase Equilibria (Engineering and Technology: Chemical Engineering)
CiteScore: 5.30
Self-citation rate: 15.40%
Articles published: 223
Review time: 53 days
About the journal: Fluid Phase Equilibria publishes high-quality papers dealing with experimental, theoretical, and applied research related to equilibrium and transport properties of fluids, solids, and interfaces. Subjects of interest include physical/phase and chemical equilibria; equilibrium and nonequilibrium thermophysical properties; fundamental thermodynamic relations; and stability. The systems central to the journal include pure substances and mixtures of organic and inorganic materials, including polymers, biochemicals, and surfactants with sufficient characterization of composition and purity for the results to be reproduced. Alloys are of interest only when thermodynamic studies are included; purely material studies will not be considered. In all cases, authors are expected to provide physical or chemical interpretations of the results. Experimental research can include measurements under all conditions of temperature, pressure, and composition, including critical and supercritical. Measurements are to be associated with systems and conditions of fundamental or applied interest, and may not be only a collection of routine data, such as physical property or solubility measurements at limited pressures and temperatures close to ambient, or surfactant studies focussed strictly on micellisation or micelle structure. Papers reporting common data must be accompanied by new physical insights and/or contemporary or new theory or techniques.