{"title":"What is special about spatial data science and Geo-AI?","authors":"S. Shekhar","doi":"10.1145/3468791.3472263","DOIUrl":null,"url":null,"abstract":"The importance of spatial data science and Geo-AI is growing with the rise of spatial and spatiotemporal big data (e.g., trajectories, remote-sensing images, census and geo-social media) [1-2]. Societal use cases include Agriculture (global crop monitoring, precision agriculture), Location-based services (e.g., navigation, ride-sharing), Public Health (e.g., monitoring disease spread), Environment and Climate (change detection, land-cover classification), Smart Cities (e.g., mapping buildings), etc. [1-2] Classical data science and AI (e.g., machine learning) often perform poorly when applied to spatial data sets because of the many reasons [1-5]. First, spatial data is embedded in a continuous space and classical statistics (e.g., correlation) are not robust to the modifiable areal unit problem. Second, spatial data-items have extended footprints (e.g., line strings, polygons) and implicit relationships (e.g., distance, touch). Third, high cost of spurious patterns requires guardrails (e.g., statistical significance tests) to reduce false positives. Furthermore, spatial autocorrelation and variability violate the classical assumption of data samples being generated independently from identical distributions, which risk models that are either inaccurate or inconsistent with the data. Thus, new methods are needed to analyze spatial data [1-5]. This talk surveys common and emerging methods for spatial classification and prediction (e.g., spatial autoregression, spatial decision trees [6], spatial variability aware neural networks [7]), as well as techniques for discovering interesting, useful and non-trivial patterns such as hotspots (e.g., circular, linear, arbitrary shapes [8]), interactions (e.g., co-locations [9], tele-connections), spatial outliers [10], and their spatio-temporal counterparts [3].","PeriodicalId":312773,"journal":{"name":"33rd International Conference on Scientific and Statistical Database Management","volume":"37 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"33rd International Conference on Scientific and Statistical Database Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3468791.3472263","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
The importance of spatial data science and Geo-AI is growing with the rise of spatial and spatiotemporal big data (e.g., trajectories, remote-sensing images, census and geo-social media) [1-2]. Societal use cases include Agriculture (global crop monitoring, precision agriculture), Location-based services (e.g., navigation, ride-sharing), Public Health (e.g., monitoring disease spread), Environment and Climate (change detection, land-cover classification), Smart Cities (e.g., mapping buildings), etc. [1-2] Classical data science and AI (e.g., machine learning) often perform poorly when applied to spatial data sets because of the many reasons [1-5]. First, spatial data is embedded in a continuous space and classical statistics (e.g., correlation) are not robust to the modifiable areal unit problem. Second, spatial data-items have extended footprints (e.g., line strings, polygons) and implicit relationships (e.g., distance, touch). Third, high cost of spurious patterns requires guardrails (e.g., statistical significance tests) to reduce false positives. Furthermore, spatial autocorrelation and variability violate the classical assumption of data samples being generated independently from identical distributions, which risk models that are either inaccurate or inconsistent with the data. Thus, new methods are needed to analyze spatial data [1-5]. This talk surveys common and emerging methods for spatial classification and prediction (e.g., spatial autoregression, spatial decision trees [6], spatial variability aware neural networks [7]), as well as techniques for discovering interesting, useful and non-trivial patterns such as hotspots (e.g., circular, linear, arbitrary shapes [8]), interactions (e.g., co-locations [9], tele-connections), spatial outliers [10], and their spatio-temporal counterparts [3].