Employing Auto-Machine Learning Algorithms for Predicting the Cold Filter Plugging and Kinematic Viscosity at 40 ºC in Biodiesel Blends using Vibrational Spectroscopy Data

IF 1.1 Q4 CHEMISTRY, ANALYTICAL
A. Luna, A. Torres, Camilla L Cunha, I. Lima, Luis Nonato
Brazilian Journal of Analytical Chemistry, journal article, published 2022-08-31
DOI: 10.30744/brjac.2179-3425.ar-30-2022 (https://doi.org/10.30744/brjac.2179-3425.ar-30-2022)

Abstract

This work develops an automated machine learning (AutoML) method that uses mid-infrared (MIR) spectroscopy data to determine the cold filter plugging point (CFPP) and the kinematic viscosity at 40 °C of biodiesel, diesel, and their blends. The biodiesel was obtained by transesterification and subsequently purified. The first dataset comprised 108 binary, ternary, and quaternary blends of biodiesel obtained from different biomass sources such as soy, corn, sunflower, and canola. The second dataset comprised 227 diesel-biodiesel and diesel-biodiesel-ethanol blends. CFPP and kinematic viscosity were measured according to ABNT NBR 14747 and ABNT NBR 10441, respectively. The MIR spectra were acquired from 7,800 to 450 cm⁻¹ at a resolution of 4 cm⁻¹ with 20 scans. Baseline alignment of the spectra was carried out using the asymmetric least squares method, and the spectra were then smoothed with a Savitzky–Golay filter using a first-order polynomial and the zeroth derivative. The dataset was split into training (70%) and test (30%) sets using the function createDataPartition from the R package caret. The models were trained with the open-source Python library LazyPredict, which returns the trained models together with their performance metrics. Different AutoML algorithms could model the kinematic viscosity at 40 °C of the biodiesel samples and their blends from the MIR spectroscopy dataset, with a root mean square error of prediction (RMSEP ≤ 0.02 mm² s⁻¹) similar to the experimental error obtained after log transformation. The CFPP of the biodiesel samples and their blends could likewise be modeled from the MIR spectroscopy dataset by different AutoML algorithms, with an RMSEP (≤ 1.6 °C) similar to the experimental error of the traditional methodology. Because the different algorithms reached the same performance in terms of RMSEP and coefficient of determination (R²), the lower computational time favors Ridge or Ridge cross-validation regression models for both physical properties using MIR spectroscopy data.
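The workflow summarized in the abstract (smoothing, a 70/30 split, a ridge model, and evaluation by RMSEP and R²) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration and not the authors' code: the function names are hypothetical, the data are synthetic, the split is a plain random shuffle standing in for caret's createDataPartition, and the ridge fit uses a single feature in closed form instead of the full spectrum and LazyPredict. The baseline-correction step is omitted. One detail worth noting is that a Savitzky–Golay filter with a first-order polynomial and zeroth derivative, evaluated at the center of a symmetric window, reduces to a centered moving average.

```python
import math
import random


def smooth_sg_order1(y, window=5):
    """Savitzky-Golay smoothing for polyorder=1, deriv=0.

    On a symmetric window the least-squares line evaluated at the centre
    equals the window mean, so this case is a centred moving average.
    Edge points without a full window are left unchanged (a simplification).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    out = list(y)
    for i in range(half, len(y) - half):
        out[i] = sum(y[i - half:i + half + 1]) / window
    return out


def split_indices(n, train_frac=0.7, seed=0):
    """Random train/test split of sample indices (plain shuffle)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(train_frac * n)
    return idx[:cut], idx[cut:]


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one feature with an unpenalised intercept."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    slope = sxy / (sxx + alpha)  # alpha shrinks the slope towards zero
    return slope, y_mean - slope * x_mean


def rmsep(y_true, y_pred):
    """Root mean square error of prediction on held-out samples."""
    sq = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sq / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def demo(n_samples=108, seed=42):
    """Synthetic example: one 'absorbance' feature predicting a property."""
    rng = random.Random(seed)
    x = [rng.uniform(0.1, 1.0) for _ in range(n_samples)]
    y = [2.0 + 3.0 * xi + rng.gauss(0.0, 0.05) for xi in x]  # made-up relation
    train, test = split_indices(n_samples, 0.7, seed)
    slope, intercept = fit_ridge_1d([x[i] for i in train], [y[i] for i in train])
    y_pred = [intercept + slope * x[i] for i in test]
    y_test = [y[i] for i in test]
    return rmsep(y_test, y_pred), r2(y_test, y_pred)


if __name__ == "__main__":
    noisy = [0.0, 1.2, 1.9, 3.1, 4.0, 5.2, 5.9]  # toy spectrum segment
    print("smoothed:", smooth_sg_order1(noisy, window=3))
    print("RMSEP, R2:", demo())
```

With real spectra, each sample would contribute hundreds of absorbance values rather than one feature, and a library implementation of ridge regression with cross-validated regularization would replace `fit_ridge_1d`; the evaluation functions would stay the same.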
About the journal: BrJAC is dedicated to the diffusion of significant and original knowledge in all branches of Analytical Chemistry, and is addressed to professionals involved in science, technology and innovation projects at universities, research centers and in industry.