A Novel Random Forest and its Application on Classification of Air Quality

Hualing Yi, Qingyu Xiong, Qinghong Zou, Rui Xu, Kai Wang, Min Gao
Published in: 2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI)
Publication date: 2019-07-01
DOI: 10.1109/IIAI-AAI.2019.00018 (https://doi.org/10.1109/IIAI-AAI.2019.00018)
Citations: 17

Abstract

Air pollution has a serious impact on daily life. The public needs timely information about air quality so that protective measures can be taken in advance. Machine learning methods such as random forest perform well at classifying air quality grades. We find that the class distribution of air quality data is imbalanced, which degrades the performance of random forest classifiers. To address this problem, we propose a random forest method based on sample-grouped bootstrap. We then design three sets of experiments to evaluate the proposed method. The results indicate that the proposed method improves on random forest when both are applied to balanced datasets. The improvement is especially pronounced on imbalanced datasets, where the new method is much better at classifying minority-class samples.
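The abstract does not spell out how the grouped bootstrap is constructed, so the following is only a minimal sketch of one plausible reading: samples are grouped by class label, and each tree's bootstrap set draws the same number of samples, with replacement, from every group. The function name `grouped_bootstrap`, the `per_group` parameter, the equal-draw rule, and the toy labels are all assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import Counter, defaultdict


def grouped_bootstrap(labels, per_group=None, rng=None):
    """Return bootstrap indices drawn with replacement from each class group.

    Assumption: a "group" is the set of samples sharing one class label,
    and every group contributes the same number of draws.
    """
    rng = rng or random.Random()

    # Collect the indices of the samples belonging to each class.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    # Assumed default: draw as many samples per group as the largest group holds.
    if per_group is None:
        per_group = max(len(members) for members in groups.values())

    # Sample with replacement inside each group, then mix the groups together.
    sample = []
    for members in groups.values():
        sample.extend(rng.choices(members, k=per_group))
    rng.shuffle(sample)
    return sample


# Toy imbalanced label set (hypothetical): 90 majority vs. 10 minority samples.
labels = ["good"] * 90 + ["polluted"] * 10
indices = grouped_bootstrap(labels, rng=random.Random(0))
counts = Counter(labels[i] for i in indices)
print(counts)
```

In a forest built this way, each tree would be trained on its own call to `grouped_bootstrap`, so minority-class samples appear in every tree's training set as often as majority-class samples, whereas a plain bootstrap over the 100 toy samples would contain only about 10 minority draws on average.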