Kernel density smoothing of composite spatial data on administrative area level

AStA Wirtschafts- und Sozialstatistisches Archiv. Published: 2021-12-23. DOI: 10.1007/s11943-021-00298-9

Kerstin Erfurth, Marcus Groß, Ulrich Rendtel, Timo Schmid

Volume 16 (1), pages 25-49. Journal article. Full text: https://link.springer.com/article/10.1007/s11943-021-00298-9

Citations: 1

Abstract

Composite spatial data at the administrative area level are often presented as maps. The aim is to detect regional differences in the concentration of subpopulations, such as elderly persons, ethnic minorities, persons with low education, voters of a political party or persons with a certain disease. Thematic collections of such maps are published in various atlases. The standard presentation is a choropleth map, in which each administrative unit is represented by a single value. These maps can be criticized on three grounds: the implicit assumption of a uniform distribution within each area, the instability of the resulting map under a change of the reference areas, and the discontinuities at the borders of the reference areas, which hinder the detection of regional clusters.

To address these problems we use a density approach in the construction of maps. This approach does not impose a locally uniform distribution, it does not depend on a specific choice of area reference system, and the displayed maps have no discontinuities. Kernel density estimation is a standard procedure for estimating densities. However, it requires the geo-coordinates of the individual units, which are not available because we only have access to aggregates for some area system. To overcome this hurdle, we use a statistical simulation concept, which can be interpreted as a Simulated Expectation Maximisation (SEM) algorithm in the sense of Celeux et al (1996). We simulate observations from the current density estimate that are consistent with the aggregation information (S-step). Then we apply the kernel density estimator to the simulated sample, which gives the next density estimate (E-step).
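The iteration between the S-step and the density update can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes rectangular areas on a small evaluation grid, a Gaussian kernel with a fixed bandwidth, and samples that are placed on grid points rather than at continuous coordinates. All function names, the burn-in averaging and the toy counts are hypothetical choices made for the example.

```python
import numpy as np

def kde_on_grid(points, grid, bandwidth):
    """Gaussian kernel density estimate of `points`, evaluated at `grid` (both (n, 2))."""
    diff = grid[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)
    kernel = np.exp(-0.5 * sq_dist / bandwidth**2) / (2 * np.pi * bandwidth**2)
    return kernel.mean(axis=1)

def sem_density(area_of_cell, counts, grid, bandwidth, n_iter=20, burn_in=5, seed=0):
    """Alternate simulation within areas and kernel smoothing (assumed simplified scheme)."""
    rng = np.random.default_rng(seed)
    density = np.ones(len(grid))  # start from a flat density
    kept = []
    for it in range(n_iter):
        samples = []
        # S-step: in each area draw exactly the observed number of units,
        # with probability proportional to the current density estimate.
        for area, n in enumerate(counts):
            cells = np.flatnonzero(area_of_cell == area)
            prob = density[cells] / density[cells].sum()
            drawn = rng.choice(cells, size=n, p=prob)
            samples.append(grid[drawn])
        # Density update: kernel density estimate of the simulated sample.
        density = kde_on_grid(np.vstack(samples), grid, bandwidth)
        if it >= burn_in:
            kept.append(density)
    # Average the estimates after burn-in (an assumption of this sketch).
    return np.mean(kept, axis=0)

# Toy data: the unit square split into four quadrant "areas" with known counts.
xs = np.linspace(0.05, 0.95, 10)
grid = np.array([(x, y) for x in xs for y in xs])
area_of_cell = (grid[:, 0] > 0.5).astype(int) * 2 + (grid[:, 1] > 0.5).astype(int)
counts = np.array([200, 50, 50, 10])
density = sem_density(area_of_cell, counts, grid, bandwidth=0.1)
```

Because every S-step draws exactly the reported count in each area, each simulated sample reproduces the aggregates, while the kernel step smooths across area borders.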

This concept was first applied to grid data with rectangular areas for the display of ethnic minorities, see Groß et al (2017). In a second application we demonstrated the use of this approach for the so-called “change of support” problem (Bradley et al 2016): Groß et al (2020) used the SEM algorithm to recalculate case numbers between non-hierarchical administrative area systems. Recently, Rendtel et al (2021) applied the SEM algorithm to display spatio-temporal clusters of Corona infections in Germany.

Here we present three modifications of the basic SEM algorithm: 1) We introduce a boundary correction which removes the underestimation of kernel density estimates at the borders of the population area. 2) We account for unsettled areas, such as lakes, parks and industrial areas, in the computation of the kernel density. 3) We adapt the SEM algorithm to the computation of local percentages, which are especially important in voting analysis.
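Modifications 2 and 3 can be illustrated with a short sketch. The abstract does not give the exact formulas, so the following rests on assumptions: unsettled cells are handled by setting the density to zero there and rescaling, and a local percentage is taken as the ratio of the count-weighted subgroup density to the count-weighted density of the whole population. The function names and toy numbers are hypothetical.

```python
import numpy as np

def mask_unsettled(density, settled):
    """Set the density to zero on unsettled cells and rescale to preserve total mass."""
    masked = np.where(settled, density, 0.0)
    return masked * density.sum() / masked.sum()

def local_percentage(dens_group, n_group, dens_all, n_all, settled):
    """Local share (in %) of a subgroup; NaN where a cell is unsettled or empty."""
    share = np.full(dens_all.shape, np.nan)
    ok = settled & (dens_all > 0)
    share[ok] = 100.0 * (n_group * dens_group[ok]) / (n_all * dens_all[ok])
    return share

# Toy example on four grid cells; the last cell is a lake.
settled = np.array([True, True, True, False])
dens_all = np.array([0.25, 0.25, 0.25, 0.25])   # density of all voters
dens_group = np.array([0.4, 0.3, 0.2, 0.1])     # density of one party's voters
masked_all = mask_unsettled(dens_all, settled)
share = local_percentage(dens_group, 200, dens_all, 1000, settled)
```

Reporting NaN for unsettled cells keeps a map from showing a percentage where nobody lives.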

We evaluate our approach against several standard maps by means of a local voting register with known addresses. In the empirical part we apply our approach to the display of voting results for the 2016 election of the Berlin parliament. We contrast our results with choropleth maps and show new possibilities for reporting spatial voting results.

