Yuehua Yue, Lianyin Jia, Hongsong Zhai, Ming Kong, Mengjuan Li
CFS-DT: a Combined Feature Selection and Decision Tree based Method for Octane Number Prediction
2020 4th Annual International Conference on Data Science and Business Analytics (ICDSBA), 2020-09-01
DOI: https://doi.org/10.1109/ICDSBA51020.2020.00033
Citations: 0
Abstract
Octane number (ON) is the most important index in vehicle gasoline specifications. Because of the complexity of the refining process and the variety of equipment involved, a large number of features are collected, which makes it difficult to predict the ON of gasoline. In this paper, we propose CFS-DT, a prediction method that combines feature selection with a decision tree. CFS-DT first applies low variance filtering, high correlation filtering, and a random forest to select features from the large set of original features. A decision tree (DT) is then trained on the selected features to predict ON. Experiments on datasets from the 2020 Huawei Cup Mathematical Modeling contest show that the model is effective and achieves a prediction precision of 89%.
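The first two filtering stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the thresholds, the feature names, the toy data, and the greedy order in which correlated features are dropped are all assumptions made for illustration. The random forest ranking and the decision tree training that follow in CFS-DT are omitted here.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def low_variance_filter(columns, min_variance):
    """Keep only features whose population variance exceeds min_variance."""
    return {name: vals for name, vals in columns.items()
            if pvariance(vals) > min_variance}


def high_correlation_filter(columns, max_abs_corr):
    """Greedily keep a feature only if its absolute correlation with every
    already-kept feature is at most max_abs_corr (assumed drop rule)."""
    kept = {}
    for name, vals in columns.items():
        if all(abs(pearson(vals, other)) <= max_abs_corr for other in kept.values()):
            kept[name] = vals
    return kept


# Hypothetical toy features; real refinery data would have hundreds of columns.
features = {
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with temp
    "const": [7.0, 7.0, 7.0, 7.0, 7.0],       # zero variance
    "pressure": [3.0, 1.0, 4.0, 1.0, 5.0],
}

# Assumed thresholds: variance > 0.01, absolute correlation <= 0.95.
selected = high_correlation_filter(low_variance_filter(features, 0.01), 0.95)
print(sorted(selected))
```

In this toy run the constant column is removed by the variance filter and the duplicated column by the correlation filter, leaving a smaller feature set that a later importance ranking and decision tree could work with.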