Galaxy: Towards Scalable and Interpretable Explanation on High-Dimensional and Spatio-Temporal Correlated Climate Data

Yong Zhuang, D. Small, Xin Shu, Kui Yu, S. Islam, W. Ding
2018 IEEE International Conference on Big Knowledge (ICBK), published 2018-11-01
DOI: 10.1109/ICBK.2018.00027 (https://doi.org/10.1109/ICBK.2018.00027)

Abstract

Interpretability has become a major criterion for designing predictive models in climate science. A highly interpretable model offers a qualitative understanding of how meteorological variables relate to climate phenomena, which helps climate scientists learn the causes of climate events. However, detecting the features that best explain observed events is challenging in spatio-temporal climate data: the strong spatial and temporal correlations among features create redundancy that can hide the key features, especially in high-dimensional settings. Furthermore, climate events that occur in different regions or at different times may have different explanatory patterns, so detecting explanations for the overall climate phenomenon is also difficult. Here we propose Galaxy, a new interpretable predictive model. Galaxy allows us to represent the explanatory patterns of subpopulations within the overall population of the target. Each explanatory pattern is defined by the smallest feature subset on which the conditional distribution of the target actually depends, which we call the minimal target explanation. Building on the detection of such patterns, Galaxy detects the Galaxy space (the set of explanations for the overall target population) by sequentially detecting the target explanation of each individual subpopulation, and then forecasts the target variable using their ensemble predictive power. We validate our approach by comparing Galaxy with several state-of-the-art baselines in a set of comparative experiments, and then evaluate how Galaxy can be used to identify the explanatory space and give a referential explanation report in a real-world scenario for a given location in the United States.
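The idea of one small explanatory feature subset per subpopulation can be illustrated with a toy sketch. This is not the Galaxy algorithm, whose details the abstract does not give. It assumes that subpopulation labels are already known and uses greedy forward selection on least-squares R^2 as a crude stand-in for testing which features the conditional distribution of the target depends on. All function names, the `tol` threshold, and the synthetic data are hypothetical.

```python
import numpy as np

def r2(X, y):
    """R^2 of an ordinary least-squares fit (with intercept) of y on the columns of X."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def greedy_explanation(X, y, tol=0.01):
    """Add features one at a time until the best remaining one improves R^2 by less than tol.

    Assumption: R^2 gain is used here as a simple proxy for conditional dependence.
    """
    d = X.shape[1]
    selected, best = [], 0.0
    while len(selected) < d:
        gains = {j: r2(X[:, selected + [j]], y) for j in range(d) if j not in selected}
        j_best = max(gains, key=gains.get)
        if gains[j_best] - best < tol:
            break  # no remaining feature adds meaningful explanatory power
        selected.append(j_best)
        best = gains[j_best]
    return selected

def explain_subpopulations(X, y, groups):
    """Detect one explanatory feature subset per subpopulation, one subpopulation at a time."""
    return {int(g): greedy_explanation(X[groups == g], y[groups == g])
            for g in np.unique(groups)}

def make_demo_data(seed=0, n=1000):
    """Synthetic data: group 0 depends on feature 0, group 1 on feature 3.

    Feature 1 is a noisy copy of feature 0, mimicking redundant correlated features.
    """
    rng = np.random.default_rng(seed)
    groups = np.repeat([0, 1], n // 2)
    X = rng.normal(size=(n, 4))
    X[:, 1] = X[:, 0] + 0.3 * rng.normal(size=n)
    y = np.where(groups == 0, 2.0 * X[:, 0], -3.0 * X[:, 3]) + 0.1 * rng.normal(size=n)
    return X, y, groups

X, y, groups = make_demo_data()
print(explain_subpopulations(X, y, groups))
```

In this toy setting the redundant copy (feature 1) is left out of the explanation for group 0 because it adds almost nothing once feature 0 is selected. The ensemble forecasting step mentioned in the abstract is not shown.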