Extreme flash flood susceptibility mapping using a novel PCA-based model stacking approach
Authors: Amirreza Shojaeian, Hossein Shafizadeh-Moghadam, Ahmad Sharafati, Himan Shahabi
Journal: Advances in Space Research (JCR Q2, Astronomy & Astrophysics; impact factor 2.8)
Published: 2024-08-06
DOI: 10.1016/j.asr.2024.08.004
URL: https://www.sciencedirect.com/science/article/pii/S027311772400807X
This study introduces an efficient model stacking methodology that combines six diverse machine learning and statistical models with principal component analysis (PCA). The approach is applied to flash flood susceptibility mapping in the Karkheh Basin, Iran. The base models are random forest (RF), boosted regression trees (BRT), support vector machine (SVM), artificial neural network (ANN), generalized additive model (GAM), and the least absolute shrinkage and selection operator (Lasso); RF also serves as the meta-model for stacking. The predictions of the individual models were significantly correlated, which could impair the meta-model's efficacy. To address this, PCA was applied to the model predictions to generate decorrelated components as inputs to the meta-model, thereby enhancing prediction accuracy and robustness. Evaluated by the area under the receiver operating characteristic curve (AUROC), GAM was the best individual model, with a score of 0.924, while RF and ANN scored lowest, both at 0.872; the performance disparity across the individual models was nonetheless small. Notably, PCA-based stacking (0.936) surpassed both traditional model stacking (0.912) and every individual model. These findings favor PCA-based stacking over conventional stacking, although further research across varied applications is needed to establish how well the result generalizes.
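The decorrelation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the data are synthetic stand-ins for six base-model probability outputs, the function name `pca_decorrelate` is hypothetical, and the RF meta-model that would consume the components is omitted. The sketch only shows that projecting highly correlated predictions onto their principal components yields uncorrelated inputs.

```python
import numpy as np


def pca_decorrelate(preds):
    """Project base-model predictions onto their principal components.

    preds: array of shape (n_samples, n_models), one column per base model.
    Returns the component scores and the explained-variance ratio,
    both ordered from the largest to the smallest component.
    (Hypothetical helper, written for illustration only.)
    """
    centered = preds - preds.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = centered @ eigvecs
    return scores, eigvals / eigvals.sum()


# Synthetic stand-in for six base-model outputs (assumption: not study data).
# Every column is the same underlying signal plus small independent noise,
# so the columns are strongly correlated, as the abstract reports.
rng = np.random.default_rng(0)
n_samples, n_models = 500, 6
signal = rng.uniform(0.0, 1.0, size=n_samples)
noise = rng.normal(0.0, 0.05, size=(n_samples, n_models))
preds = np.clip(signal[:, None] + noise, 0.0, 1.0)

scores, var_ratio = pca_decorrelate(preds)

# Correlation between base models before PCA (off-diagonal entries only).
corr_before = np.corrcoef(preds, rowvar=False)
min_corr_before = corr_before[~np.eye(n_models, dtype=bool)].min()

# Covariance between components after PCA (off-diagonal entries only).
cov_after = np.cov(scores, rowvar=False)
off_diag = cov_after - np.diag(np.diag(cov_after))

print("smallest pairwise correlation before PCA:", round(float(min_corr_before), 3))
print("largest |covariance| between components:", float(np.abs(off_diag).max()))
print("explained variance ratio:", np.round(var_ratio, 3))
```

In a full stacking pipeline, `scores` (or its leading columns) would replace the raw base-model predictions as the meta-model's training features. How many components to retain is a separate design choice that this sketch does not address.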
About the journal:
The COSPAR publication Advances in Space Research (ASR) is an open journal covering all areas of space research including: space studies of the Earth's surface, meteorology, climate, the Earth-Moon system, planets and small bodies of the solar system, upper atmospheres, ionospheres and magnetospheres of the Earth and planets including reference atmospheres, space plasmas in the solar system, astrophysics from space, materials sciences in space, fundamental physics in space, space debris, space weather, Earth observations of space phenomena, etc.
NB: Manuscripts on life sciences as related to space are no longer accepted for submission to Advances in Space Research. Such manuscripts should now be submitted to the new COSPAR journal Life Sciences in Space Research (LSSR).
All submissions are reviewed by two scientists in the field. COSPAR is an interdisciplinary scientific organization concerned with the progress of space research on an international scale. Operating under the rules of ICSU, COSPAR ignores political considerations and considers all questions solely from the scientific viewpoint.