Measuring unit relevance and stability in hierarchical spatio-temporal clustering
Authors: Roy Cerqueti, Raffaele Mattera
Journal: Spatial Statistics, volume 66, Article 100880
Published: 2025-01-13
DOI: 10.1016/j.spasta.2025.100880
URL: https://www.sciencedirect.com/science/article/pii/S2211675325000028
Abstract
Understanding the significance of individual data points within clustering structures is critical to effective data analysis. Traditional stability methods, while valuable, often overlook the nuanced impact of individual units, particularly in spatial contexts. In this paper, we explore the concept of unit relevance in clustering analysis, emphasizing its importance in capturing the spatio-temporal nature of the clustering problem. We propose a simple measure of unit relevance, the Unit Relevance Index (URI), and define an overall measure of clustering stability based on the aggregation of computed URIs. Considering two experiments on real datasets with geo-referenced time series, we find that the use of spatial constraints in the clustering task yields more stable results. Therefore, the inclusion of the spatial dimension can be seen as a way to stabilize the clustering.
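The abstract does not state how the URI is computed, so the following is only a minimal sketch of one plausible reading, not the authors' definition. It assumes a leave-one-out scheme: cluster all units, recluster with one unit removed, and score that unit by how much the partition of the remaining units changes (here 1 minus the Rand index). The aggregate stability is assumed to be 1 minus the mean score. All function names (`agglomerative`, `rand_index`, `unit_relevance`, `stability`), the average-linkage choice, and the toy data are hypothetical, and no spatial constraint is included.

```python
from itertools import combinations
from math import dist


def agglomerative(points, k):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise Euclidean distance between the two clusters.
            d = sum(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]  # merge the closest pair (a < b)
        del clusters[b]
    labels = [0] * len(points)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def rand_index(u, v):
    """Share of unit pairs on which two labelings agree (together or apart)."""
    pairs = list(combinations(range(len(u)), 2))
    agree = sum((u[i] == u[j]) == (v[i] == v[j]) for i, j in pairs)
    return agree / len(pairs)


def unit_relevance(points, k):
    """ASSUMED leave-one-out score per unit: 1 - Rand index between the full
    partition restricted to the other units and the partition refit without it."""
    full = agglomerative(points, k)
    scores = []
    for i in range(len(points)):
        rest = points[:i] + points[i + 1:]
        baseline = full[:i] + full[i + 1:]
        refit = agglomerative(rest, k)
        scores.append(1.0 - rand_index(baseline, refit))
    return scores


def stability(scores):
    """ASSUMED aggregate: 1 minus the mean unit score (1.0 = fully stable)."""
    return 1.0 - sum(scores) / len(scores)


# Toy example with made-up data: each tuple is one unit's short time series.
series = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
scores = unit_relevance(series, k=2)
print(scores, stability(scores))
```

In this toy case the two groups are well separated, so removing any single unit leaves the partition of the others unchanged: every score is 0 and the aggregate is 1. A spatially constrained variant would restrict merges to neighbouring units, which this sketch does not attempt.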
About the journal:
Spatial Statistics publishes articles on the theory and application of spatial and spatio-temporal statistics. It favours manuscripts that present theory generated by new applications, or in which new theory is applied to an important practical case. A purely theoretical study will only rarely be accepted. Pure case studies without methodological development are not acceptable for publication.
Spatial statistics concerns the quantitative analysis of spatial and spatio-temporal data, including their statistical dependencies, accuracy and uncertainties. Methodology for spatial statistics is typically found in probability theory, stochastic modelling and mathematical statistics as well as in information science. Spatial statistics is used in mapping, assessing spatial data quality, sampling design optimisation, modelling of dependence structures, and drawing of valid inference from a limited set of spatio-temporal data.