Multi-sliced Sampling-based Deep Forest Regression Algorithm for High-dimension Data
Heng Xia, Jian Tang, J. Qiao, Wen Yu
2020 2nd International Conference on Industrial Artificial Intelligence (IAI), 2020-10-23
DOI: 10.1109/IAI50351.2020.9262209
Deep learning methods such as deep forest regression (DFR) have been applied to the online soft measurement of difficult-to-measure parameters in complex industrial processes. On high-dimension datasets, however, these methods often deliver limited accuracy at a high time cost. This paper therefore proposes a multi-sliced sampling-based DFR (Mss-DFR) model to address both problems. The improved model differs from the original in three respects. First, to balance sub-forest diversity against time cost, the raw feature vector is segmented into three parts through a multi-slicing strategy. Second, an optimized feature set is obtained with a mutual-information feature selection model that follows the principle of minimum redundancy and maximum correlation, and this set is then combined with the layer regression vector. Finally, because variance can reflect the differences among the sub-forests in DFR, it is added to the layer regression vector. Experimental results show that Mss-DFR performs well, in some cases outperforming neural networks and achieving state-of-the-art results.
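The three modifications can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every function name in it is hypothetical. The abstract does not say how the slices are drawn, which mutual information estimator is used, or which variance is taken, so the sketch assumes contiguous near-equal slices, substitutes absolute Pearson correlation for mutual information, and uses the population variance of the sub-forest predictions.

```python
from statistics import mean, pvariance


def slice_features(x, n_slices=3):
    """Split a feature vector into contiguous, near-equal parts (assumed scheme)."""
    base, extra = divmod(len(x), n_slices)
    parts, start = [], 0
    for i in range(n_slices):
        size = base + (1 if i < extra else 0)
        parts.append(x[start:start + size])
        start += size
    return parts


def abs_corr(a, b):
    """Absolute Pearson correlation, used here as a stand-in for mutual information."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant column carries no linear information
    return abs(cov) / (va * vb) ** 0.5


def select_features(columns, y, k):
    """Greedily pick k column indices: high relevance to y, low redundancy with picks."""
    chosen = []
    remaining = list(range(len(columns)))
    while remaining and len(chosen) < k:
        def score(j):
            relevance = abs_corr(columns[j], y)
            redundancy = (
                mean(abs_corr(columns[j], columns[c]) for c in chosen)
                if chosen else 0.0
            )
            return relevance - redundancy
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


def next_layer_input(sub_forest_preds, selected_values):
    """Concatenate sub-forest predictions, their variance, and the selected features."""
    preds = list(sub_forest_preds)
    return preds + [pvariance(preds)] + list(selected_values)


if __name__ == "__main__":
    print(slice_features([1, 2, 3, 4, 5, 6, 7]))
    cols = [[1, 2, 3, 4], [2, 4, 6, 8], [1, -1, 1, -1]]
    print(select_features(cols, [2, 1, 4, 3], k=2))
    print(next_layer_input([1.0, 2.0, 3.0], [9.0]))
```

In this sketch each slice would feed its own sub-forest, and `next_layer_input` builds the vector passed to the next cascade layer. The sub-forests themselves are omitted, since the abstract does not describe their configuration.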