Yuehua Yue, Lianyin Jia, Hongsong Zhai, Ming Kong, Mengjuan Li
2020 4th Annual International Conference on Data Science and Business Analytics (ICDSBA), published 2020-09-01. DOI: 10.1109/ICDSBA51020.2020.00033 (https://doi.org/10.1109/ICDSBA51020.2020.00033)
CFS-DT : a Combined Feature Selection and Decision Tree based Method for Octane Number Prediction
Octane number (ON) is the most important index in vehicle gasoline specifications. Because the refining process is complex and the equipment is varied, a large number of features are collected, which makes it difficult to predict the ON of gasoline. In this paper, we propose CFS-DT, a combined feature selection and decision tree based prediction method. It first combines low variance filtering, high correlation filtering, and random forest to perform feature selection on the large set of original features. A decision tree (DT) is then trained on the selected features for ON prediction. Experiments on datasets collected from the 2020 Huawei Cup Mathematical Modeling competition show that our model is effective and achieves 89% prediction precision.
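The first two feature selection stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the thresholds (`min_var`, `max_corr`), the greedy keep-first rule for correlated pairs, and the toy feature names are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import sqrt


def variance(col):
    """Population variance of a list of numbers."""
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)


def pearson(a, b):
    """Pearson correlation of two columns; 0.0 if either column is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    if dev_a == 0 or dev_b == 0:
        return 0.0
    return cov / (dev_a * dev_b)


def low_variance_filter(columns, min_var):
    """Keep only features whose variance exceeds min_var (assumed threshold)."""
    return {name: col for name, col in columns.items() if variance(col) > min_var}


def high_correlation_filter(columns, max_corr):
    """Greedy rule (an assumption): keep a feature only if its absolute
    correlation with every feature kept so far is at most max_corr."""
    kept = {}
    for name, col in columns.items():
        if all(abs(pearson(col, other)) <= max_corr for other in kept.values()):
            kept[name] = col
    return kept


# Toy data with hypothetical feature names, not taken from the paper's dataset.
features = {
    "temp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "temp_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with temp
    "valve": [0.5, 0.5, 0.5, 0.5, 0.5],       # constant, zero variance
    "pressure": [3.0, 1.0, 4.0, 1.0, 5.0],
}

after_variance = low_variance_filter(features, min_var=0.01)
selected = high_correlation_filter(after_variance, max_corr=0.95)
print(sorted(selected))  # ['pressure', 'temp']
```

In the toy run, `valve` is dropped by the variance filter and `temp_copy` by the correlation filter. The remaining stages described in the abstract (ranking the surviving features with a random forest, then training a decision tree on the selected ones) would follow. One possible route, again an assumption, is scikit-learn's `RandomForestRegressor` with its `feature_importances_` attribute, followed by `DecisionTreeRegressor`.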