{"title":"Ensemble Learning Enhanced Stepwise Cluster Analysis for River Ice Breakup Date Forecasting","authors":"W. Sun, Q. Shi, Y. Huang, Y. Lv","doi":"10.3808/JEIL.201900005","journal":{"name":"Journal of Environmental Informatics Letters","volume":"59 3-4","pages":"0"},"publicationDate":"2019-04-02","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"4","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.3808/JEIL.201900005"}
Ensemble Learning Enhanced Stepwise Cluster Analysis for River Ice Breakup Date Forecasting
Frequently occurring ice jams are a recurring concern in northern regions. Breakup timing directly affects emergency response preparation, so early and accurate forecasting benefits ice-related flood management. Stepwise cluster analysis (SCA) is a nonparametric regression method that generates a classification tree in a probabilistic sense through cutting and merging operations governed by statistical criteria. To enhance SCA’s predictive performance, an SCA ensemble (SCAE) method is developed and applied to forecasting annual river ice breakup dates (BDs). Specifically, SCA serves as the base model at the lower level, while simple averaging is used as the combining model at the upper level. SCA base models are selected according to different performance selection criteria and then searched for further combination. A site on a representative river prone to ice flooding in Alberta, Canada, is selected to demonstrate the effectiveness of the proposed SCAE. The main results are as follows. First, SCA base models with multiple combinations of inputs and internal parameters predict BDs well (the highest average correlation coefficient for training reaches 0.958). Second, the optimal SCA base model has three inputs, which indicates that the temperatures before breakup and just after freeze-up, together with the maximum water flow in March, are relatively important indicators of BD. Third, the optimal SCAE, which includes base models chosen under different performance selection criteria, has the lowest average root mean squared error and improves upon the optimal SCA base model by 25.3%. This indicates that using different model selection criteria increases diversity and thereby improves the performance of ensemble models. This first application of the SCAE to river ice forecasting highlights the potential of the ensemble learning paradigm to enhance SCA. Applications of the SCAE to other forecasting problems are expected.
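The two-level structure described in the abstract (base models below, simple averaging above, with candidates shortlisted under more than one criterion and then searched for combinations) can be illustrated with a minimal Python sketch. This is not the authors' implementation. SCA itself is not implemented here: the base-model predictions are made-up day-of-year values for illustration only, the names `m1` to `m3` and `best_ensemble` are hypothetical, and using RMSE and Pearson correlation as the two selection criteria with an exhaustive subset search is an assumption rather than something the abstract specifies.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return sqrt(mean((p - o) ** 2 for p, o in zip(pred, obs)))


def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    mp, mo = mean(pred), mean(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / sqrt(var_p * var_o)


def simple_average(members):
    """Upper level: unweighted mean of member predictions, year by year."""
    return [mean(values) for values in zip(*members)]


def best_ensemble(base_preds, obs, top_k=2):
    """Shortlist base models under two criteria, then search their subsets."""
    # Assumed criterion 1: lowest RMSE. Assumed criterion 2: highest correlation.
    by_rmse = sorted(base_preds, key=lambda n: rmse(base_preds[n], obs))[:top_k]
    by_r = sorted(base_preds, key=lambda n: -pearson_r(base_preds[n], obs))[:top_k]
    pool = sorted(set(by_rmse) | set(by_r))

    best = None
    for size in range(2, len(pool) + 1):
        for names in combinations(pool, size):
            combined = simple_average([base_preds[n] for n in names])
            score = rmse(combined, obs)
            if best is None or score < best[1]:
                best = (names, score)
    return best


# Illustrative data only (breakup dates as day of year); not from the paper.
observed = [110, 105, 118, 112, 108]
base_preds = {
    "m1": [112, 107, 120, 114, 110],  # tracks the pattern, biased late
    "m2": [109, 102, 116, 111, 105],  # biased early, worst RMSE of the three
    "m3": [111, 104, 117, 113, 109],  # lowest RMSE of the three
}

members, ensemble_rmse = best_ensemble(base_preds, observed)
best_base_rmse = min(rmse(p, observed) for p in base_preds.values())
improvement = (best_base_rmse - ensemble_rmse) / best_base_rmse
print(members, round(ensemble_rmse, 3), f"{improvement:.1%}")
```

In this toy data the model admitted only by the correlation criterion ends up in the selected ensemble because its errors partly cancel those of another member, which is the kind of diversity effect the abstract attributes to mixing selection criteria. The `improvement` ratio is computed the same way a relative RMSE reduction such as the reported 25.3% would be, though the toy value has no bearing on the paper's result.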