Improvement in classification capabilities of surface water samples based on analysis of multidimensional data from gas sensor array.

Annals of agricultural and environmental medicine : AAEM. Pub Date: 2025-06-27. Epub Date: 2025-06-25. DOI: 10.26444/aaem/206945

Magdalena Piłat-Rożek, Grzegorz Łagód

Volume 32, issue 2, pages 222-229. Publication type: Journal Article.


Abstract

Introduction and objective: Electronic noses (e-noses) have been shown to differentiate successfully between drainage and river water samples. However, the classification accuracy reported in the previous article in this series appeared to leave room for improvement. The aim of this article was to improve the classification accuracy of surface water samples analyzed with a gas sensor array.

Material and methods: The multidimensional data used to train the machine learning models came from river water, drainage water, and synthetic air samples measured with an array of 17 gas sensors. The unsupervised methods t-SNE and k-medians were used for dimensionality reduction, visualization on a two-dimensional plane, and clustering. Subsequently, the supervised classifiers XGBoost and AdaBoost.M1 were trained and compared in terms of how well they assigned observations to the correct classes.
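To illustrate the clustering step named above, the following is a minimal sketch of k-medians in plain Python. It is not the authors' implementation: the function names (`k_medians`, `manhattan`, `make_toy_data`), the random initialisation, and the synthetic 17-dimensional toy data are all assumptions made for illustration, and the abstract does not state which distance or software was used. The sketch uses the common formulation with Manhattan (L1) distance and coordinate-wise medians.

```python
import random
from statistics import median


def manhattan(a, b):
    """L1 distance, the metric that coordinate-wise medians minimise."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, k, n_iter=100, seed=0):
    """Cluster `points` (a list of equal-length numeric tuples) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: start from k random observations
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center under L1 distance.
        labels = [min(range(k), key=lambda j: manhattan(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the coordinate-wise median of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:  # an empty cluster keeps its previous center
                new_centers.append(centers[j])
                continue
            new_centers.append(tuple(median(col) for col in zip(*members)))
        if new_centers == centers:  # no center moved, so the labels are stable
            break
        centers = new_centers
    return labels, centers


def make_toy_data(seed=1):
    """Hypothetical stand-in for sensor readings: 3 groups x 20 rows x 17 features."""
    rng = random.Random(seed)
    data = []
    for offset in (0.0, 10.0, 20.0):
        for _ in range(20):
            data.append(tuple(offset + rng.gauss(0, 0.5) for _ in range(17)))
    return data


if __name__ == "__main__":
    labels, centers = k_medians(make_toy_data(), k=3)
    print({j: labels.count(j) for j in range(3)})  # cluster sizes
```

Like k-means, this procedure can settle in a local optimum that depends on the initial centers, so in practice it is usually run from several seeds. The median update makes the centers less sensitive to outlying readings than a mean would be.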

Results: Visualization with t-SNE and clustering with k-medians clearly distinguished the observations from the water sample and the different drainage samples. On the test set, the supervised models achieved 88.8% (XGBoost) and 89.2% (AdaBoost.M1) correct classifications.
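As a rough illustration of how a boosted classifier of this kind is built and scored, the sketch below implements the AdaBoost.M1 weighting scheme (Freund and Schapire) with one-feature decision stumps as the weak learner, plus a helper that computes the fraction of correct classifications. It is a sketch under stated assumptions, not the authors' pipeline: the choice of stumps, all function names, and the one-dimensional toy data are hypothetical, and the reported 88.8% and 89.2% figures cannot be reproduced from it.

```python
import math
from collections import defaultdict


def weighted_majority(labels, weights):
    """Label with the largest total weight, or None for an empty group."""
    totals = defaultdict(float)
    for y, w in zip(labels, weights):
        totals[y] += w
    return max(totals, key=totals.get) if totals else None


def fit_stump(X, y, w):
    """Weak learner (assumed here): best single-feature threshold split under weights w."""
    default = weighted_majority(y, w)
    best = None  # (weighted error, feature, threshold, left label, right label)
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= thr]
            right = [i for i, row in enumerate(X) if row[f] > thr]
            l_lab = weighted_majority([y[i] for i in left], [w[i] for i in left])
            r_lab = weighted_majority([y[i] for i in right], [w[i] for i in right])
            if r_lab is None:  # nothing on the right at the largest threshold
                r_lab = default
            err = (sum(w[i] for i in left if y[i] != l_lab)
                   + sum(w[i] for i in right if y[i] != r_lab))
            if best is None or err < best[0]:
                best = (err, f, thr, l_lab, r_lab)
    return best[1:]


def stump_predict(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab


def adaboost_m1(X, y, n_rounds=10):
    """Return a list of (stump, vote weight) pairs trained with AdaBoost.M1."""
    m = len(X)
    w = [1.0 / m] * m  # start from uniform observation weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        miss = [stump_predict(stump, row) != yi for row, yi in zip(X, y)]
        eps = sum(wi for wi, bad in zip(w, miss) if bad)  # weighted error
        if eps >= 0.5:  # M1 requires a weak learner better than 1/2
            break
        eps = max(eps, 1e-10)  # guard against log(1/0) for a perfect stump
        beta = eps / (1.0 - eps)
        ensemble.append((stump, math.log(1.0 / beta)))
        # Shrink the weights of correctly classified observations, then renormalise.
        w = [wi * (1.0 if bad else beta) for wi, bad in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """Weighted vote over all stumps; None if the ensemble is empty."""
    votes = defaultdict(float)
    for stump, alpha in ensemble:
        votes[stump_predict(stump, row)] += alpha
    return max(votes, key=votes.get) if votes else None


def accuracy(ensemble, X, y):
    """Fraction of observations assigned to the correct class."""
    return sum(predict(ensemble, row) == yi for row, yi in zip(X, y)) / len(y)
```

A single stump can only output two labels, so on three classes it necessarily misclassifies at least one group; the reweighting lets later stumps concentrate on the observations that earlier ones missed. A figure such as the reported test-set accuracy corresponds to calling `accuracy` on observations that were held out from training.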

Conclusions: Although most of the multiple comparisons between sample groups showed no statistically significant differences in medians across all the classical indicators, the electronic nose makes it possible to differentiate and correctly classify surface water samples with high accuracy.

