Comparative Analysis of Factors Influencing PM2.5 Using Sentinel-5P and CPCB Data by Machine Learning Techniques: Case Study of Gurugram City (2019–2023)
Authors: Shilpa Mahajan, Pankaj Rathi, Duiena Rai, Tripti Sharma, Avi Aneja, Avni Jettley
Journal: MAPAN, vol. 41, no. 1, pp. 285–300 (published 2026-03-02)
DOI: 10.1007/s12647-026-00907-4
URL: https://link.springer.com/article/10.1007/s12647-026-00907-4
Citations: 0
Abstract
Fine particulate matter (PM2.5) poses a serious threat to public health, particularly in cities such as Gurugram, India. This study investigates the spatiotemporal variation of PM2.5 concentration in Gurugram from 2019 to 2023 by integrating satellite and surface-level data. Meteorological and ground-level air quality data were obtained from the Central Pollution Control Board (CPCB), and spatial patterns of pollution were derived from Sentinel-5P satellite data accessed through Google Earth Engine. Supervised machine learning algorithms were then used to predict PM2.5 concentrations and identify the key parameters influencing pollution. Among the evaluated models, the Random Forest algorithm performed best, achieving a coefficient of determination (R²) of 0.912, a mean absolute error of 2.946, a root mean square error of 5.013, and a mean square error of 25.13. The analysis revealed that ground-based predictors exhibited a stronger linear association with PM2.5, whereas satellite-derived predictors captured broader regional trends. The accuracy and precision of the satellite-retrieved data were tested rigorously by comparison with ground measurement datasets. Temporal analysis indicated strong seasonal variation, with elevated PM2.5 recorded during the winter months, whereas spatial analysis using satellite data revealed high pollutant levels in densely populated areas and transportation-dominated zones. The findings demonstrate that combining satellite-based atmospheric indicators with ground measurements improves the spatiotemporal characterisation of air quality in cities.
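The four scores quoted in the abstract are standard regression metrics. The following is a minimal sketch of how they are usually computed and how they relate to one another. It is not the authors' code, the function name `regression_metrics` is hypothetical, and the sample values are invented for illustration rather than taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for paired observations and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean square error
    rmse = math.sqrt(mse)                   # root mean square error

    # R^2 compares the residual sum of squares with the total variance
    # of the observations around their mean.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are identical")
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical PM2.5 values for illustration only; NOT data from the study.
observed = [40.0, 55.0, 120.0, 80.0, 30.0]
predicted = [42.0, 50.0, 110.0, 85.0, 33.0]
metrics = regression_metrics(observed, predicted)

# Consistency check on the figures reported in the abstract:
# RMSE squared should equal MSE up to rounding (5.013^2 is about 25.13).
reported_rmse, reported_mse = 5.013, 25.13
rmse_mse_consistent = math.isclose(reported_rmse ** 2, reported_mse, rel_tol=1e-3)
```

Because RMSE is the square root of MSE, the two reported values carry the same information. MAE is the more useful second number because it is less sensitive to a few large errors.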
About the journal:
MAPAN-Journal of Metrology Society of India is a quarterly publication devoted exclusively to metrology (scientific, industrial or legal). It serves metrologists, and quality practitioners in particular, by publishing articles on scientific, industrial and legal metrology.
The journal publishes research communications and technical articles of current interest in measurement science; original work, tutorials and survey papers in any metrology-related area; reviews and analytical studies in metrology; case studies on reliability and uncertainty in measurements; and reports and results of intercomparison and proficiency testing.