Bumbea Alessio, Mazzitelli Andrea, Giuffrida Annamaria, Espa Giuseppe
Spatial bootstrapping using deep clustering methods: Spatial machine learning applied to Lombardy high-tech businesses
Regional Science Policy and Practice, 17(12), Article 100242. Published 2025-08-14. Journal Article.
DOI: 10.1016/j.rspp.2025.100242
https://www.sciencedirect.com/science/article/pii/S1757780225000721
Impact factor: 2.1 (JCR Q2, Geography)
Citations: 0
Abstract
Bootstrap and clustering techniques are foundational tools across scientific disciplines and play a particularly important role in spatial analysis. However, traditional bootstrap methods often fail to preserve spatial dependencies and complex attribute relationships during resampling. In this work, we introduce a novel framework in the Spatial Machine Learning domain that leverages deep learning techniques to enhance stratified bootstrap procedures for spatial data. Deep learning has already revolutionized prediction and classification tasks on data with temporal and spatial dependencies; here we extend its scope to bootstrap analysis by using tools such as entity embeddings and autoencoders. By encoding high-cardinality categorical variables as continuous representations, entity embeddings facilitate the discovery of meaningful spatial and attribute-based clusters. These embeddings are then passed to a Deep Embedded Clustering (DEC) algorithm, which uses them to form clusters. DEC handles high-dimensional big data with an autoencoder-based architecture that performs dimensionality reduction and clustering simultaneously to avoid loss of information. Finally, the clusters serve as strata that guide a stratified bootstrap, which preserves spatial autocorrelation and heterogeneity. We demonstrate the utility of our framework by performing a bootstrap analysis of high-tech firm productivity in the Lombardy region. Our approach efficiently analyzes large amounts of high-dimensional data with complex attributes.
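The final step described in the abstract, using cluster labels as strata for a bootstrap, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stratified_bootstrap`, the synthetic data, and the three hard-coded labels are assumptions made for illustration, and the labels merely stand in for the output of a clustering step such as DEC.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_bootstrap(values, strata, n_boot=1000, stat=mean, seed=0):
    """Return n_boot bootstrap replicates of `stat`, resampling with
    replacement inside each stratum so that stratum sizes are preserved."""
    rng = random.Random(seed)
    # Group the observations by stratum label.
    groups = defaultdict(list)
    for value, label in zip(values, strata):
        groups[label].append(value)
    replicates = []
    for _ in range(n_boot):
        sample = []
        for members in groups.values():
            # Draw as many observations as the stratum originally holds.
            sample.extend(rng.choices(members, k=len(members)))
        replicates.append(stat(sample))
    return replicates


# Synthetic stand-in data (assumption): three clusters with different means.
data_rng = random.Random(42)
labels = [0] * 50 + [1] * 30 + [2] * 20
productivity = [data_rng.gauss(2.0 * label, 1.0) for label in labels]

boot = sorted(stratified_bootstrap(productivity, labels, n_boot=500))
# Approximate 95% percentile interval for the mean.
ci_low = boot[int(0.025 * len(boot))]
ci_high = boot[int(0.975 * len(boot)) - 1]
print(f"sample mean: {mean(productivity):.3f}")
print(f"approx. 95% interval: [{ci_low:.3f}, {ci_high:.3f}]")
```

Because every replicate draws exactly as many observations from each stratum as the original sample contains, the composition across clusters is held fixed, which is the sense in which stratification preserves heterogeneity between strata.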
About the journal:
Regional Science Policy & Practice (RSPP) is the official policy- and practitioner-orientated journal of the Regional Science Association International. It is an international journal that publishes high-quality papers in applied regional science that explore policy and practice issues in regional and local development. It welcomes papers from practitioners and from a range of academic disciplines, including planning, public policy, geography, economics, environmental science, and related fields. Papers should address the interface between academic debates and policy development and application. RSPP provides an opportunity for academics and policy makers to develop a dialogue to identify and explore many of the challenges facing local and regional economies.