How can SHAP (SHapley Additive exPlanations) interpretations improve deep learning based urban cellular automata model?
Authors: Changlan Yang, Xuefeng Guan, Qingyang Xu, Weiran Xing, Xiaoyu Chen, Jinguo Chen, Peng Jia
Journal: Computers, Environment and Urban Systems, Volume 111, Article 102133 (Journal Article)
Publication date: 2024-05-30
DOI: 10.1016/j.compenvurbsys.2024.102133
URL: https://www.sciencedirect.com/science/article/pii/S0198971524000620
Impact factor: 7.1; JCR: Q1 (Environmental Studies)
Interpretations of urban cellular automata (CA) models aim to ensure that the models' predictive behavior is consistent with real-world processes. Existing urban CA interpretations have revealed the impacts of driving factors on land development suitability, and of neighborhood effects and random perturbation on simulation results. However, three limitations remain unresolved: (1) interpretations of deep learning (DL)-based urban CA are seldom integrated with the prerequisite feature selection step, (2) input features from different urban CA modules are still explained by separate approaches, and (3) interpretation results are rarely derived at the cell level to uncover spatially varying urban land development patterns. This study proposes a SHapley Additive exPlanations (SHAP)-based urban CA interpretation framework to address these limitations and improve urban CA. The framework first uses model-level SHAP importance to identify dominant features from different modules for constructing the final simulation model. It then uses cell-level SHAP importance to uncover spatially varying driving forces of urban expansion. The framework's effectiveness is tested and confirmed using a convolutional neural network CA (CNN-CA) model for Dongguan City. The experimental results demonstrate three points. (1) SHAP-based model interpretation improves feature selection for DL-based urban CA: the figure of merit of the CNN-CA calibrated with SHAP-based important features improves by 3%, outperforming the tested baseline methods. (2) SHAP measures the impacts of features from different CA modules together, within a single analysis. In this case, physical factors are much more important at the model level than proximity and accessibility factors, while the neighborhood effect is the second most crucial factor. (3) Cell-level SHAP interpretations uncover spatially different urban land development patterns. For example, owing to extensive industrial land development in the northern Songshan Lake Zone, proximity to major roads within this region is associated with a positive SHAP-based contribution share to cell-level urban expansion in the CNN-CA model.
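The two levels of SHAP importance described in the abstract can be illustrated with a minimal sketch. It assumes SHAP values have already been computed and stored as a cells-by-features array. The abstract does not give the paper's exact formulas, so the definitions here are assumptions: model-level importance is taken as the mean absolute SHAP value per feature (a common convention), and the cell-level contribution share is taken as each feature's signed SHAP value divided by the cell's total absolute SHAP value. The function names, feature names, and numbers are hypothetical.

```python
import numpy as np


def model_level_importance(shap_values):
    """Mean absolute SHAP value per feature (assumed global importance measure)."""
    return np.abs(shap_values).mean(axis=0)


def select_top_features(shap_values, feature_names, k):
    """Return the names of the k features with the largest model-level importance."""
    importance = model_level_importance(shap_values)
    order = np.argsort(importance)[::-1][:k]  # indices sorted by descending importance
    return [feature_names[i] for i in order]


def cell_level_share(shap_values):
    """Signed share of each feature in a cell's total absolute SHAP value (assumed definition)."""
    total = np.abs(shap_values).sum(axis=1, keepdims=True)
    # Avoid division by zero for cells whose SHAP values are all zero.
    safe_total = np.where(total == 0, 1.0, total)
    return shap_values / safe_total


# Hypothetical SHAP values: 3 cells (rows) x 3 features (columns).
feature_names = ["slope", "dist_to_major_road", "neighborhood"]
shap_values = np.array([
    [0.4, -0.1, 0.0],
    [0.2, 0.3, -0.1],
    [-0.6, 0.1, 0.1],
])

top_features = select_top_features(shap_values, feature_names, k=2)
shares = cell_level_share(shap_values)
print(top_features)
print(shares.round(3))
```

Under these assumed definitions, a positive share for a proximity feature in a given cell means that the feature pushes the model's prediction for that cell toward urban development, which is the kind of pattern the abstract reports for major roads in the Songshan Lake Zone.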
About the journal:
Computers, Environment and Urban Systems is an interdisciplinary journal publishing cutting-edge and innovative computer-based research on environmental and urban systems that privileges the geospatial perspective. The journal welcomes original, high-quality scholarship of a theoretical, applied, or technological nature, and provides a stimulating presentation of perspectives, research developments, overviews of important new technologies, and uses of major computational, information-based, and visualization innovations. Applied and theoretical contributions demonstrate the scope of computer-based analysis in fostering a better understanding of environmental and urban systems, their spatial scope, and their dynamics.