行政区域级复合空间数据的核密度平滑

Kerstin Erfurth, Marcus Groß, Ulrich Rendtel, Timo Schmid
{"title":"行政区域级复合空间数据的核密度平滑","authors":"Kerstin Erfurth,&nbsp;Marcus Groß,&nbsp;Ulrich Rendtel,&nbsp;Timo Schmid","doi":"10.1007/s11943-021-00298-9","DOIUrl":null,"url":null,"abstract":"<div><p>Composite spatial data on administrative area level are often presented by maps. The aim is to detect regional differences in the concentration of subpopulations, like elderly persons, ethnic minorities, low-educated persons, voters of a political party or persons with a certain disease. Thematic collections of such maps are presented in different atlases. The standard presentation is by Choropleth maps where each administrative unit is represented by a single value. These maps can be criticized under three aspects: the implicit assumption of a uniform distribution within the area, the instability of the resulting map with respect to a change of the reference area and the discontinuities of the maps at the borderlines of the reference areas which inhibit the detection of regional clusters.</p><p>In order to address these problems we use a density approach in the construction of maps. This approach does not enforce a local uniform distribution. It does not depend on a specific choice of area reference system and there are no discontinuities in the displayed maps. A standard estimation procedure of densities are Kernel density estimates. However, these estimates need the geo-coordinates of the single units which are not at disposal as we have only access to the aggregates of some area system. To overcome this hurdle, we use a statistical simulation concept. This can be interpreted as a Simulated Expectation Maximisation (SEM) algorithm of Celeux et al (1996). We simulate observations from the current density estimates which are consistent with the aggregation information (S-step). Then we apply the Kernel density estimator to the simulated sample which gives the next density estimate (E-Step).</p><p>This concept has been first applied for grid data with rectangular areas, see Groß et al (2017), for the display of ethnic minorities. In a second application we demonstrated the use of this approach for the so-called “change of support” (Bradley et al 2016) problem. Here Groß et al (2020) used the SEM algorithm to recalculate case numbers between non-hierarchical administrative area systems. Recently Rendtel et al (2021) applied the SEM algorithm to display spatial-temporal clusters of Corona infections in Germany.</p><p>Here we present three modifications of the basic SEM algorithm: 1) We introduce a boundary correction which removes the underestimation of kernel density estimates at the borders of the population area. 2) We recognize unsettled areas, like lakes, parks and industrial areas, in the computation of the kernel density. 3) We adapt the SEM algorithm for the computation of local percentages which are important especially in voting analysis.</p><p>We evaluate our approach against several standard maps by means of the local voting register with known addresses. In the empirical part we apply our approach for the display of voting results for the 2016 election of the Berlin parliament. We contrast our results against Choropleth maps and show new possibilities for reporting spatial voting results.</p></div>","PeriodicalId":100134,"journal":{"name":"AStA Wirtschafts- und Sozialstatistisches Archiv","volume":"16 1","pages":"25 - 49"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s11943-021-00298-9.pdf","citationCount":"1","resultStr":"{\"title\":\"Kernel density smoothing of composite spatial data on administrative area level\",\"authors\":\"Kerstin Erfurth,&nbsp;Marcus Groß,&nbsp;Ulrich Rendtel,&nbsp;Timo Schmid\",\"doi\":\"10.1007/s11943-021-00298-9\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Composite spatial data on administrative area level are often presented by maps. The aim is to detect regional differences in the concentration of subpopulations, like elderly persons, ethnic minorities, low-educated persons, voters of a political party or persons with a certain disease. Thematic collections of such maps are presented in different atlases. The standard presentation is by Choropleth maps where each administrative unit is represented by a single value. These maps can be criticized under three aspects: the implicit assumption of a uniform distribution within the area, the instability of the resulting map with respect to a change of the reference area and the discontinuities of the maps at the borderlines of the reference areas which inhibit the detection of regional clusters.</p><p>In order to address these problems we use a density approach in the construction of maps. This approach does not enforce a local uniform distribution. It does not depend on a specific choice of area reference system and there are no discontinuities in the displayed maps. A standard estimation procedure of densities are Kernel density estimates. However, these estimates need the geo-coordinates of the single units which are not at disposal as we have only access to the aggregates of some area system. To overcome this hurdle, we use a statistical simulation concept. This can be interpreted as a Simulated Expectation Maximisation (SEM) algorithm of Celeux et al (1996). We simulate observations from the current density estimates which are consistent with the aggregation information (S-step). Then we apply the Kernel density estimator to the simulated sample which gives the next density estimate (E-Step).</p><p>This concept has been first applied for grid data with rectangular areas, see Groß et al (2017), for the display of ethnic minorities. In a second application we demonstrated the use of this approach for the so-called “change of support” (Bradley et al 2016) problem. Here Groß et al (2020) used the SEM algorithm to recalculate case numbers between non-hierarchical administrative area systems. Recently Rendtel et al (2021) applied the SEM algorithm to display spatial-temporal clusters of Corona infections in Germany.</p><p>Here we present three modifications of the basic SEM algorithm: 1) We introduce a boundary correction which removes the underestimation of kernel density estimates at the borders of the population area. 2) We recognize unsettled areas, like lakes, parks and industrial areas, in the computation of the kernel density. 3) We adapt the SEM algorithm for the computation of local percentages which are important especially in voting analysis.</p><p>We evaluate our approach against several standard maps by means of the local voting register with known addresses. In the empirical part we apply our approach for the display of voting results for the 2016 election of the Berlin parliament. We contrast our results against Choropleth maps and show new possibilities for reporting spatial voting results.</p></div>\",\"PeriodicalId\":100134,\"journal\":{\"name\":\"AStA Wirtschafts- und Sozialstatistisches Archiv\",\"volume\":\"16 1\",\"pages\":\"25 - 49\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-12-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://link.springer.com/content/pdf/10.1007/s11943-021-00298-9.pdf\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AStA Wirtschafts- und Sozialstatistisches Archiv\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s11943-021-00298-9\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AStA Wirtschafts- und Sozialstatistisches Archiv","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s11943-021-00298-9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

行政区域一级的综合空间数据通常由地图提供。其目的是检测亚群体集中度的区域差异,如老年人、少数民族、低教育程度者、政党选民或患有某种疾病的人。这些地图的专题集载于不同的地图册中。标准表示方式为Choropleth地图,其中每个行政单位由一个值表示。这些地图可以在三个方面受到批评:区域内均匀分布的隐含假设、生成的地图相对于参考区域变化的不稳定性以及地图在参考区域边界线的不连续性,这些不连续性阻碍了区域集群的检测。为了解决这些问题,我们在地图的构建中使用了密度方法。这种方法不强制执行局部均匀分布。它不取决于区域参考系统的具体选择,并且显示的地图中没有间断。密度的标准估计程序是核密度估计。然而,这些估计需要单个单元的地理坐标,这些单元不可处理,因为我们只能访问某些区域系统的集合。为了克服这个障碍,我们使用了统计模拟的概念。这可以解释为Celeux等人(1996)的模拟期望最大化(SEM)算法。我们模拟了与聚集信息一致的电流密度估计的观测结果(S阶)。然后,我们将核密度估计器应用于模拟样本,从而给出下一个密度估计(E-Step)。这一概念首次应用于矩形区域的网格数据,参见Groß等人(2017),用于显示少数民族。在第二个应用程序中,我们演示了这种方法用于所谓的“支持的更改”问题(Bradley等人,2016)。Groß等人(2020)使用SEM算法重新计算非分级行政区域系统之间的病例数。最近,Rendtel等人(2021)应用SEM算法显示了德国冠状病毒感染的时空集群。在这里,我们提出了对基本SEM算法的三种修改:1)我们引入了一种边界校正,它消除了对人口区域边界处核密度估计的低估。2) 在计算核密度时,我们会识别出不稳定的区域,如湖泊、公园和工业区。3) 我们将SEM算法用于计算局部百分比,这在投票分析中尤其重要。我们通过具有已知地址的本地投票寄存器,对照几种标准地图来评估我们的方法。在实证部分,我们将我们的方法应用于2016年柏林议会选举的投票结果显示。我们将我们的结果与Choropleth地图进行了对比,并展示了报告空间投票结果的新可能性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Kernel density smoothing of composite spatial data on administrative area level

Composite spatial data on administrative area level are often presented by maps. The aim is to detect regional differences in the concentration of subpopulations, like elderly persons, ethnic minorities, low-educated persons, voters of a political party or persons with a certain disease. Thematic collections of such maps are presented in different atlases. The standard presentation is by Choropleth maps where each administrative unit is represented by a single value. These maps can be criticized under three aspects: the implicit assumption of a uniform distribution within the area, the instability of the resulting map with respect to a change of the reference area and the discontinuities of the maps at the borderlines of the reference areas which inhibit the detection of regional clusters.

In order to address these problems we use a density approach in the construction of maps. This approach does not enforce a local uniform distribution. It does not depend on a specific choice of area reference system and there are no discontinuities in the displayed maps. A standard estimation procedure of densities are Kernel density estimates. However, these estimates need the geo-coordinates of the single units which are not at disposal as we have only access to the aggregates of some area system. To overcome this hurdle, we use a statistical simulation concept. This can be interpreted as a Simulated Expectation Maximisation (SEM) algorithm of Celeux et al (1996). We simulate observations from the current density estimates which are consistent with the aggregation information (S-step). Then we apply the Kernel density estimator to the simulated sample which gives the next density estimate (E-Step).

This concept has been first applied for grid data with rectangular areas, see Groß et al (2017), for the display of ethnic minorities. In a second application we demonstrated the use of this approach for the so-called “change of support” (Bradley et al 2016) problem. Here Groß et al (2020) used the SEM algorithm to recalculate case numbers between non-hierarchical administrative area systems. Recently Rendtel et al (2021) applied the SEM algorithm to display spatial-temporal clusters of Corona infections in Germany.

Here we present three modifications of the basic SEM algorithm: 1) We introduce a boundary correction which removes the underestimation of kernel density estimates at the borders of the population area. 2) We recognize unsettled areas, like lakes, parks and industrial areas, in the computation of the kernel density. 3) We adapt the SEM algorithm for the computation of local percentages which are important especially in voting analysis.

We evaluate our approach against several standard maps by means of the local voting register with known addresses. In the empirical part we apply our approach for the display of voting results for the 2016 election of the Berlin parliament. We contrast our results against Choropleth maps and show new possibilities for reporting spatial voting results.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信