How can SHAP (SHapley Additive exPlanations) interpretations improve deep learning based urban cellular automata model?

IF 7.1 | Zone 1, Earth Sciences | Q1 ENVIRONMENTAL STUDIES

Computers Environment and Urban Systems Pub Date : 2024-05-30 DOI:10.1016/j.compenvurbsys.2024.102133

Changlan Yang, Xuefeng Guan, Qingyang Xu, Weiran Xing, Xiaoyu Chen, Jinguo Chen, Peng Jia


Citations: 0

Abstract



Interpretations of the urban cellular automata (CA) model aim to ensure that its predictive behaviors are consistent with real-world processes. Current urban CA interpretations have revealed the impacts of driving factors on land development suitability, and of neighborhood effects and random perturbation on simulation results. However, three limitations remain unresolved: (1) interpretations of deep learning (DL)-based urban CA are seldom integrated with the prerequisite feature selection, (2) input features from different urban CA modules are still explained by separate approaches, and (3) interpretation results are rarely derived at the cell level to uncover spatially varying urban land development patterns. This study proposes a SHapley Additive exPlanations (SHAP)-based urban CA interpretation framework to address these challenges and improve urban CA. The framework uses model-level SHAP importance to identify dominant features from different modules for constructing the final simulation model. Then, cell-level SHAP importance is used to uncover spatially varying driving forces of urban expansion. The framework's effectiveness is rigorously tested and confirmed using a convolutional neural network CA (CNN-CA) model for Dongguan City. The experimental results demonstrate that (1) SHAP-based model interpretation improves feature selection for DL-based urban CA: the figure of merit for CNN-CA calibrated using SHAP-based important features improves by 3%, outperforming the tested baseline methods. (2) SHAP measures the impacts of each feature from different CA modules as a whole. In this case, physical factors are much more important at the model level than proximity and accessibility factors, while the neighborhood effect is the second most crucial factor. (3) Cell-level SHAP interpretations uncover spatially different urban land development patterns. For example, due to the extensive industrial land development in the northern Songshan Lake Zone, proximity to major roads within this region is associated, in the CNN-CA model, with a positive SHAP-based contribution share on cell-level urban expansion.
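The abstract names three quantities without giving formulas: model-level SHAP importance, a cell-level contribution share, and the figure of merit. The Python sketch below shows one common way each could be computed, and it is not the authors' implementation. It assumes the SHAP values are already available as one row per cell and one column per feature. It also assumes that model-level importance is the mean absolute SHAP value per feature, that the contribution share is a cell's signed SHAP value divided by the sum of its absolute SHAP values, and that the figure of merit takes the usual binary form hits / (hits + misses + false alarms). The feature names and toy numbers are hypothetical.

```python
from typing import List, Sequence


def model_level_importance(shap_values: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute SHAP value per feature over all cells (assumed definition)."""
    n_cells = len(shap_values)
    n_features = len(shap_values[0])
    return [sum(abs(row[j]) for row in shap_values) / n_cells
            for j in range(n_features)]


def select_top_features(importance: Sequence[float],
                        names: Sequence[str], k: int) -> List[str]:
    """Names of the k features with the largest model-level importance."""
    ranked = sorted(zip(names, importance), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def cell_level_share(row: Sequence[float]) -> List[float]:
    """Signed share of each feature in one cell's total absolute attribution."""
    total = sum(abs(v) for v in row)
    if total == 0:
        return [0.0] * len(row)
    return [v / total for v in row]


def figure_of_merit(start: Sequence[int], observed: Sequence[int],
                    simulated: Sequence[int]) -> float:
    """Binary figure of merit; 1 marks an urban cell, 0 a non-urban cell."""
    hits = misses = false_alarms = 0
    for s, o, p in zip(start, observed, simulated):
        if s == 1:
            continue  # already urban at the start, so no change is possible
        if o == 1 and p == 1:
            hits += 1          # observed change simulated as change
        elif o == 1:
            misses += 1        # observed change simulated as persistence
        elif p == 1:
            false_alarms += 1  # observed persistence simulated as change
    denom = hits + misses + false_alarms
    return hits / denom if denom else 0.0


# Hypothetical SHAP values: 3 cells x 3 features.
names = ["slope", "dist_major_road", "neighborhood"]
shap_values = [[0.2, -0.1, 0.0],
               [0.4, 0.1, -0.1],
               [-0.6, 0.1, 0.1]]

importance = model_level_importance(shap_values)
selected = select_top_features(importance, names, k=2)
shares = [cell_level_share(row) for row in shap_values]
fom = figure_of_merit(start=[0, 0, 0, 0, 1],
                      observed=[1, 1, 0, 0, 1],
                      simulated=[1, 0, 1, 0, 1])
```

In this sketch, `selected` would feed the recalibration of the simulation model, while each row of `shares` could be written back to its cell's location to map where a feature pushes toward or away from urban expansion.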


Source journal

Computers Environment and Urban Systems Multiple-

CiteScore: 13.30

Self-citation rate: 7.40%

Articles published: 111

Review time: 32 days

Journal description: Computers, Environment and Urban Systems is an interdisciplinary journal publishing cutting-edge and innovative computer-based research on environmental and urban systems that privileges the geospatial perspective. The journal welcomes original, high-quality scholarship of a theoretical, applied or technological nature, and provides a stimulating presentation of perspectives, research developments, overviews of important new technologies and uses of major computational, information-based, and visualization innovations. Applied and theoretical contributions demonstrate the scope of computer-based analysis, fostering a better understanding of environmental and urban systems, their spatial scope and their dynamics.