Ensemble machine learning to accelerate industrial decarbonization: Prediction of Hansen solubility parameters for streamlined chemical solvent selection

Impact factor: 3.0; JCR quartile: Q2 (Engineering, Chemical)
Eslam G. Al-Sakkari, Ahmed Ragab, Mostafa Amer, Olumoye Ajao, Marzouk Benali, Daria C. Boffito, Hanane Dagdougui, Mouloud Amazouz
Journal: Digital Chemical Engineering, volume 14, Article 100207
Published: 2024-12-13
DOI: 10.1016/j.dche.2024.100207
URL: https://www.sciencedirect.com/science/article/pii/S2772508124000693

Abstract

Several processes and strategies have been developed to promote lignin utilization and to facilitate its market adoption across a broad range of applications in the expanding lignin bioeconomy. However, the inherent variability of lignin properties, which results from diverse feedstock sources and from varied recovery and downstream processing methods, remains a significant challenge. Because most lignin valorization pathways involve mixing, blending, or solubilization, there is a critical need to investigate lignin's miscibility and reactivity with polymers and solvents. Accurate estimation of Hansen solubility parameters (HSP) is crucial for solvent selection in fields such as polymer science, coatings, adhesives, lignin-based biorefineries, and solvent-based carbon capture. Traditional methods for predicting HSP are time-consuming and involve complex experiments, especially in applications dealing with carbon dioxide and lignin solubility. This paper introduces a novel ensemble modeling methodology based on machine learning (ML) techniques for accurate HSP prediction, using Simplified Molecular Input Line Entry System (SMILES) codes as inputs. The methodology integrates different ML approaches, including deep and shallow learning, to enhance prediction accuracy. The outputs of the individual ML models are combined by decision fusion, using a hybrid approach that pairs non-learnable and learnable methods and thereby reduces prediction errors. The results highlight the effectiveness of the ensemble-based methodology, which achieved 99% accuracy in predicting dispersion solubility parameters and outperformed the individual ML techniques. The proposed generic methodology, which runs from data preprocessing to decision fusion across diverse ML algorithms, can be applied to chemical analytics tasks beyond HSP prediction.
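The abstract does not say how the hybrid decision fusion is implemented, so the following is only a minimal sketch of one way such a scheme could look. It assumes a non-learnable step that takes the unweighted mean of the base-model predictions, a learnable step that weights each model by the inverse of its validation mean squared error (a simple stand-in for a trained meta-learner), and a hybrid that averages the two fused outputs. All function names and all numbers are hypothetical illustrations and are not taken from the paper.

```python
from statistics import fmean


def mean_fusion(preds):
    """Non-learnable fusion: unweighted mean of the base-model predictions."""
    return fmean(preds)


def fit_inverse_mse_weights(val_preds, val_targets):
    """Learnable fusion (assumed stand-in): weight each model by 1 / validation MSE.

    val_preds[m][i] is model m's prediction for validation sample i.
    Returns one weight per model; the weights sum to 1.
    """
    eps = 1e-12  # guards against division by zero for a perfect model
    inverse_errors = []
    for model_preds in val_preds:
        mse = fmean([(p - t) ** 2 for p, t in zip(model_preds, val_targets)])
        inverse_errors.append(1.0 / (mse + eps))
    total = sum(inverse_errors)
    return [w / total for w in inverse_errors]


def weighted_fusion(preds, weights):
    """Combine one prediction per model using the fitted weights."""
    return sum(w * p for w, p in zip(weights, preds))


def hybrid_fusion(preds, weights):
    """Hybrid (assumed): average of the non-learnable and learnable outputs."""
    return 0.5 * (mean_fusion(preds) + weighted_fusion(preds, weights))


# Made-up validation data for a dispersion parameter (MPa^0.5), three base models.
val_targets = [15.8, 18.0, 17.4]
val_preds = [
    [15.9, 18.1, 17.3],  # hypothetical model A
    [16.3, 17.5, 17.9],  # hypothetical model B
    [15.6, 18.2, 17.6],  # hypothetical model C
]
weights = fit_inverse_mse_weights(val_preds, val_targets)

# Fuse the three models' predictions for one new (made-up) molecule.
new_preds = [16.1, 16.6, 16.0]
print(weights, hybrid_fusion(new_preds, weights))
```

With this design the model with the smallest validation error dominates the learnable term, while the plain mean keeps the fused value from depending entirely on weights fitted to a small validation set. A trained meta-learner such as a stacked regressor could replace the inverse-error weighting without changing the overall structure.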
