Authors: Minxing Si, Brett M Wiens, Ke Du
Journal: Environmental Management (Impact Factor 2.7; JCR Q3, Environmental Sciences)
Publication date: 2024-09-30
Publication type: Journal Article
DOI: 10.1007/s00267-024-02057-2 (https://doi.org/10.1007/s00267-024-02057-2)
Citations: 0
Long-term Evaluation of Machine Learning Based Methods for Air Emission Monitoring
Abstract
Machine learning (ML) techniques have been researched and used in various environmental monitoring applications. Few studies have reported the long-term evaluation of such applications. Discussions regarding the risks and regulatory frameworks of ML applications in environmental monitoring have been rare. We monitored the performance of six predictive models developed using ML and statistical methods for 28 months. The six models used to predict NOx emissions were developed using six different algorithms. The model developed with a moderate-complexity algorithm, adaptive boosting, had the best performance in long-term monitoring, with a root mean square error (RMSE) of 0.48 kg/hr in the 28-month monitoring period, and passed two of the three relative accuracy test audits. High-complexity models based on gradient boosting and neural network algorithms had the best training performance, with minimum RMSEs of 0.23 kg/hr and 0.26 kg/hr, respectively, but also had the worst RMSE scores, of 0.51 kg/hr and 0.57 kg/hr, during the monitoring period. In addition, all six models failed all three relative accuracy test audits. The following problems were observed: (1) Complex ML models tended to have overfitting problems, thus indicating the importance of the trade-off between model accuracy and complexity. (2) Drift in model input sensors, or input values falling outside the ranges that occurred most frequently in the training data, resulted in inaccurate predictions or an accuracy lower than the minimum allowed by regulators. (3) Existing regulatory frameworks must be modernized to keep pace with current machine learning practices. Some statistical tests are unsuitable for applications developed using ML methods.
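The two quantities the abstract relies on, RMSE and the idea of inputs leaving the range seen in training, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the nearest-rank percentile method, and the 1st/99th percentile bounds are all assumptions chosen for illustration only.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between paired predictions and reference values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def training_envelope(training_values, lower_pct=1.0, upper_pct=99.0):
    """Hypothetical 'frequently seen' range of one model input.

    Uses nearest-rank percentiles of the training data; the percentile
    bounds are an assumption, not values taken from the paper.
    """
    ordered = sorted(training_values)

    def nearest_rank(pct):
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        return ordered[rank - 1]

    return nearest_rank(lower_pct), nearest_rank(upper_pct)


def out_of_envelope_fraction(live_values, envelope):
    """Share of live input readings that fall outside the training envelope."""
    low, high = envelope
    outside = sum(1 for value in live_values if value < low or value > high)
    return outside / len(live_values)
```

Under these assumptions, a rising out-of-envelope fraction for an input sensor would be one possible warning sign that the model is being asked to predict under conditions it rarely or never saw during training, which is the situation described in problem (2).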
About the journal:
Environmental Management offers research and opinions on use and conservation of natural resources, protection of habitats and control of hazards, spanning the field of environmental management without regard to traditional disciplinary boundaries. The journal aims to improve communication, making ideas and results from any field available to practitioners from other backgrounds. Contributions are drawn from biology, botany, chemistry, climatology, ecology, ecological economics, environmental engineering, fisheries, environmental law, forest sciences, geosciences, information science, public affairs, public health, toxicology, zoology and more.
As the principal user of nature, humanity is responsible for ensuring that its environmental impacts are benign rather than catastrophic. Environmental Management presents the work of academic researchers and professionals outside universities, including those in business, government, research establishments, and public interest groups, presenting a wide spectrum of viewpoints and approaches.