A. Luna, A. Torres, Camilla L Cunha, I. Lima, Luis Nonato
Brazilian Journal of Analytical Chemistry. Published 2022-08-31. DOI: 10.30744/brjac.2179-3425.ar-30-2022 (https://doi.org/10.30744/brjac.2179-3425.ar-30-2022)
Employing Auto-Machine Learning Algorithms for Predicting the Cold Filter Plugging and Kinematic Viscosity at 40 ºC in Biodiesel Blends using Vibrational Spectroscopy Data
This work develops an automated machine learning method that uses mid-infrared (MIR) spectroscopy data to determine the cold filter plugging point (CFPP) and the kinematic viscosity at 40 °C of biodiesel, diesel, and their mixtures. The biodiesel was obtained by transesterification and subsequently purified. The first dataset comprised 108 binary, ternary, and quaternary blends of biodiesel obtained from different biomass sources such as soy, corn, sunflower, and canola. The second dataset comprised 227 diesel-biodiesel and diesel-biodiesel-ethanol blends. The physical properties of the samples (CFPP and kinematic viscosity) were measured according to ABNT NBR 14747 and ABNT NBR 10441, respectively. The MIR spectra were acquired from 7,800 to 450 cm⁻¹ with a resolution of 4 cm⁻¹ and 20 scans. The baseline of the spectra was corrected using the asymmetric least squares method, and the spectra were then smoothed with a Savitzky–Golay filter using a first-order polynomial and the zeroth derivative. The data were split into training (70%) and test (30%) sets using the createDataPartition function from the caret package. Model training was carried out with the open-source Python library LazyPredict, which returns the trained models together with their performance metrics. The kinematic viscosity at 40 °C of the biodiesel samples and their blends could be modeled from the MIR spectra by different automated machine learning algorithms; the root mean square error of prediction (RMSEP ≤ 0.02 mm² s⁻¹) was similar to the experimental error obtained after log transformation. The CFPP of the biodiesel samples and their blends could likewise be modeled from the MIR spectra by different algorithms, with an RMSEP (≤ 1.6 °C) similar to the experimental error of the traditional methodology. Because the different algorithms gave equivalent RMSEP and R² (coefficient of determination) values, and the Ridge models required less computational time, Ridge or Ridge cross-validation regression is recommended for predicting both physical properties from MIR spectroscopy data.
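The processing chain described in the abstract (baseline correction, smoothing, a 70/30 split, and ridge regression scored by RMSEP) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it uses NumPy only, the function names and the parameters `lam`, `p`, `window`, and `alpha` are hypothetical choices, the split is a plain random split rather than caret's outcome-stratified partition, and the demo data are synthetic. The smoothing step relies on the fact that a Savitzky–Golay filter with a first-order polynomial and zeroth derivative reduces to a centered moving average.

```python
import numpy as np


def als_baseline(y, lam=1e4, p=0.01, n_iter=10):
    """Asymmetric least squares baseline estimate (dense, small-n version).

    lam (smoothness) and p (asymmetry) are illustrative defaults, not values
    taken from the paper.
    """
    n = y.size
    D = np.diff(np.eye(n), 2, axis=0)        # second-difference operator, (n-2, n)
    penalty = lam * D.T @ D
    w = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        w = np.where(y > z, p, 1.0 - p)      # points above the baseline get low weight
    return z


def smooth_sg_linear(y, window=5):
    """Savitzky-Golay smoothing with polynomial order 1 and derivative 0.

    For an odd window this equals a centered moving average; the ends are
    handled here by edge padding, which is a simplification.
    """
    pad = window // 2
    padded = np.pad(y, pad, mode="edge")
    return np.convolve(padded, np.ones(window) / window, mode="valid")


def split_70_30(n_samples, seed=0):
    """Plain random 70/30 split of sample indices (no stratification)."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    cut = int(round(0.7 * n_samples))
    return idx[:cut], idx[cut:]


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    return coef, y_mean - x_mean @ coef


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


if __name__ == "__main__":
    # Synthetic stand-in data only; these are NOT the paper's spectra or results.
    rng = np.random.default_rng(0)
    n_samples, n_points = 60, 120
    target = rng.uniform(2.0, 6.0, n_samples)
    axis = np.linspace(0.0, 1.0, n_points)
    band = np.exp(-((axis - 0.5) ** 2) / 0.002)   # one Gaussian absorption band
    noise = rng.normal(0.0, 0.01, (n_samples, n_points))
    spectra = target[:, None] * band + 0.3 * axis + noise
    X = np.array([smooth_sg_linear(s - als_baseline(s)) for s in spectra])
    train, test = split_70_30(n_samples)
    coef, intercept = ridge_fit(X[train], target[train], alpha=1e-3)
    print("RMSEP on synthetic data:", rmsep(target[test], X[test] @ coef + intercept))
```

In practice, `scipy.signal.savgol_filter` and scikit-learn's `Ridge` or `RidgeCV` would be the usual choices for the smoothing and regression steps; the NumPy versions are used here only to keep the sketch self-contained.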
About the journal:
BrJAC is dedicated to the diffusion of significant and original knowledge in all branches of Analytical Chemistry, and is addressed to professionals involved in science, technology and innovation projects at universities, research centers and in industry.