International Workshop on Analytics for Big Geospatial Data: Latest Publications

Spatiotemporal data mining in the era of big spatial data: algorithms and applications
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447482
Ranga Raju Vatsavai, A. Ganguly, V. Chandola, A. Stefanidis, S. Klasky, S. Shekhar
Spatial data mining is the process of discovering interesting, previously unknown, but potentially useful patterns from spatial and spatiotemporal data. However, the explosive growth of spatial and spatiotemporal data, together with the emergence of social media and location-sensing technologies, emphasizes the need for new and computationally efficient methods tailored to analyzing big data. In this paper, we review major spatial data mining algorithms by closely examining their computational and I/O requirements, and we point to a few applications dealing with big spatial data.
Citations: 147
Computing the drainage network on huge grid terrains
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447488
Thiago L. Gomes, S. V. G. Magalhães, M. Andrade, W. Randolph Franklin, Guilherme C. Pena
We present a very efficient algorithm, named EMFlow, and its implementation to compute the drainage network, that is, the flow direction and flow accumulation, on huge terrains stored in external memory. It is about 20 times faster than the two most recent and most efficient published methods, TerraFlow and r.watershed.seg. Since processing large datasets can take hours, this improvement is very significant.

EMFlow is based on our previous method RWFlood, which uses a flooding process to compute the drainage network. To reduce the total number of I/O operations, EMFlow groups the terrain cells into blocks that are stored in a special data structure managed as a cache memory. It also adopts a new strategy to subdivide terrains into islands that are processed separately.

Because of the recent increase in the volume of high-resolution terrestrial data, internal-memory algorithms do not run well on most computers, and thus optimizing massive data processing algorithms simultaneously for data movement and computation has been a challenge for GIS.
Citations: 7
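The two quantities EMFlow computes, flow direction and flow accumulation, can be illustrated with the classic D8 single-flow-direction scheme. The sketch below is a minimal in-memory illustration under that assumption; it is not the authors' flooding-based or external-memory implementation, and all names are our own.

```python
import numpy as np

# Eight neighbor offsets (row, col) used by the D8 scheme.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]

def d8_flow_directions(elev):
    """For each cell, the index in OFFSETS of the steepest-descent neighbor,
    or -1 for cells with no lower neighbor (pits and flats)."""
    rows, cols = elev.shape
    direction = np.full((rows, cols), -1, dtype=int)
    for r in range(rows):
        for c in range(cols):
            best_drop, best_k = 0.0, -1
            for k, (dr, dc) in enumerate(OFFSETS):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    drop = elev[r, c] - elev[nr, nc]
                    if drop > best_drop:
                        best_drop, best_k = drop, k
            direction[r, c] = best_k
    return direction

def flow_accumulation(direction):
    """Number of cells draining through each cell (including itself),
    obtained by walking each cell's path downstream. Terminates because
    elevation strictly decreases along D8 descent directions."""
    rows, cols = direction.shape
    acc = np.ones((rows, cols), dtype=int)  # every cell drains itself
    for r in range(rows):
        for c in range(cols):
            cr, cc = r, c
            while direction[cr, cc] != -1:
                dr, dc = OFFSETS[direction[cr, cc]]
                cr, cc = cr + dr, cc + dc
                acc[cr, cc] += 1
    return acc
```

On a tiny terrain sloping toward one corner, all cells drain into that corner cell, which is reported as a pit.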
Accelerating satellite image based large-scale settlement detection with GPU
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447487
D. Patlolla, E. Bright, Jeanette E. Weaver, A. Cheriyadat
Computer vision algorithms for image analysis are often computationally demanding. Applying such algorithms to large image databases, such as high-resolution satellite imagery covering the entire land surface, can easily saturate the computational capabilities of conventional CPUs. There is great demand for vision algorithms running on high-performance computing (HPC) architectures capable of processing petascale image data. We exploit the parallel processing capability of GPUs to present a GPU-friendly algorithm for robust and efficient detection of settlements from large-scale high-resolution satellite imagery. Feature descriptor generation is an expensive but key step in automated scene analysis. To address this challenge, we present GPU implementations of three feature descriptors: multiscale Histogram of Oriented Gradients (HOG), Gray Level Co-occurrence Matrix (GLCM) contrast, and local pixel intensity statistics. We perform extensive experimental evaluations of our implementation using diverse and large image datasets. Our GPU implementation of the feature descriptor algorithms achieves speedups of 220 times compared to the CPU version. We present a highly efficient settlement detection system running on a multi-GPU architecture, capable of extracting human settlement regions from city-scale, sub-meter spatial resolution aerial imagery spanning roughly 1200 square kilometers in just 56 seconds, with detection accuracy close to 90%.

This remarkable speedup, achieved while maintaining high detection accuracy, demonstrates that such computational advancements hold the solution for petascale image analysis challenges.
Citations: 15
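Of the three descriptors, GLCM contrast is the simplest to state concretely: build a co-occurrence matrix of quantized gray-level pairs at a fixed pixel offset, normalize it to joint probabilities, and weight each pair by the squared gray-level difference. The CPU reference sketch below is our own illustration (the paper's contribution is the GPU implementation; function and parameter names here are assumptions, and the offset is taken as non-negative).

```python
import numpy as np

def glcm_contrast(img, levels=8, offset=(0, 1)):
    """GLCM contrast for one (non-negative) pixel offset:
    sum over (i, j) of P(i, j) * (i - j)**2."""
    # Quantize intensities into `levels` gray levels.
    q = (img.astype(float) / (img.max() + 1e-9) * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels), dtype=float)
    dr, dc = offset
    rows, cols = q.shape
    for r in range(rows - dr):
        for c in range(cols - dc):
            glcm[q[r, c], q[r + dr, c + dc]] += 1
    glcm /= glcm.sum()  # joint probabilities
    i, j = np.indices((levels, levels))
    return float(np.sum(glcm * (i - j) ** 2))
```

A flat image has zero contrast; a checkerboard, where every horizontal neighbor pair differs, maximizes it.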
TMC-pattern: holistic trajectory extraction, modeling and mining
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447490
Roland Assam, T. Seidl
Mobility data is big data, and modeling such raw location data is quite challenging in terms of quality and runtime efficiency. Mobility data emanating from smartphones and other pervasive devices consists of a combination of spatio-temporal dimensions as well as additional contextual dimensions that may range from social network activities and diseases to telephone calls. However, most existing trajectory models focus only on the spatio-temporal dimensions of mobility data, and their regions of interest depict only the popularity of a place. In this paper, we propose a novel trajectory model called Time Mobility Context Correlation Pattern (TMC-Pattern), which considers a wide variety of dimensions and utilizes subspace clustering to find contextual regions of interest. In addition, TMC-Pattern rigorously captures and embeds infrastructural, human, social, and behavioral patterns into the trajectory model. We show theoretically and experimentally how TMC-Pattern can be used for frequent location sequence mining and location prediction with real datasets.
Citations: 3
EarthDB: scalable analysis of MODIS data using SciDB
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447483
Gary Planthaber, M. Stonebraker, J. Frew
Earth scientists are increasingly experiencing difficulties analyzing rapidly growing volumes of complex data. Those who must perform analysis directly on low-level National Aeronautics and Space Administration (NASA) Moderate Resolution Imaging Spectroradiometer (MODIS) Level 1B calibrated and geolocated data, for example, encounter an arcane, high-volume data set that is burdensome to use. Instead, Earth scientists typically opt for higher-level "canned" products provided by NASA. However, when these higher-level products fail to meet the requirements of a particular project, a cruel dilemma arises: cope with data products that don't exactly meet the project's needs, or spend an enormous amount of resources extracting what is needed from the unadulterated low-level data. In this paper, we present EarthDB, a system that eliminates this dilemma by offering the following contributions:
1. Enabling painless importing of MODIS Level 1B data into SciDB, a highly scalable science-oriented database platform that abstracts away the complexity of distributed storage and analysis of complex multi-dimensional data;
2. Defining a schema that unifies storage and representation of MODIS Level 1B data, regardless of its source file;
3. Supporting fast filtering and analysis of MODIS data through an intuitive, high-level query language rather than complex procedural programming; and
4. Providing the ability to easily define and reconfigure entire analysis pipelines within the SciDB database, allowing for rapid ad hoc analysis.

To demonstrate this ability, we provide sample benchmarks for the construction of true-color (RGB) and Normalized Difference Vegetation Index (NDVI) images from raw MODIS Level 1B data using relatively simple queries with scalable performance.
Citations: 69
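The NDVI product used in the benchmark above reduces to standard band arithmetic, (NIR - Red) / (NIR + Red), applied element-wise over the image. The NumPy sketch below is purely illustrative of that formula; EarthDB expresses the same computation in SciDB's query language, not Python, and the epsilon guard against zero denominators is our own assumption.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, element-wise over two bands.
    A small floor on the denominator avoids division by zero where both
    bands are zero (a guard we add for the sketch)."""
    nir = nir.astype(float)
    red = red.astype(float)
    return (nir - red) / np.maximum(nir + red, 1e-9)
```

For a pixel with NIR reflectance 0.5 and red reflectance 0.1, NDVI is 0.4 / 0.6 = 2/3, a typical vegetated value.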
Big 3D spatial data processing using cloud computing environment
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447484
R. Sugumaran, Jeff Burnett, Andrew Blinkmann
Lately, acquiring large quantities of three-dimensional (3-D) spatial data, particularly topographic information, has become commonplace with the advent of new technologies and techniques such as laser scanning, or light detection and ranging (LiDAR). Although the pace of massive 3-D spatial data collection is accelerating both in the USA and around the globe, the provision of affordable technology for dealing with issues such as processing, management, archival, dissemination, and analysis of the huge data volumes has lagged behind. Single computers and generic high-end computing are not sufficient to process this massive data, and researchers have started to explore other computing environments. Recently, cloud computing has shown very promising solutions due to its availability and affordability. The main goal of this paper is to develop a web-based LiDAR data processing framework, the Cloud Computing-based LiDAR Processing System (CLiPS), to process massive LiDAR data using a cloud computing environment. The CLiPS framework was implemented using ESRI's ArcGIS Server, Amazon Elastic Compute Cloud (Amazon EC2), and several open-source spatial tools. Applications developed in this project include: 1) preprocessing tools for LiDAR data, 2) generation of large-area digital elevation models (DEMs) in the cloud environment, and 3) user-driven DEM-derived products. We used three different terrain types, LiDAR tile sizes, and EC2 instance types (large, xlarge, and double xlarge) to test for time and cost comparisons.

Undulating terrain data took more time than the other two terrain types used in this study, and the overall cost for the entire project was less than $100.
Citations: 11
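The DEM-generation step mentioned above amounts to gridding scattered LiDAR returns onto a regular raster. The sketch below shows the simplest such scheme, binning by cell and averaging elevations; it is our own illustrative assumption (all names are hypothetical), whereas production pipelines like the one in the paper typically add interpolation and outlier filtering.

```python
import numpy as np

def points_to_dem(x, y, z, cell=1.0):
    """Grid scattered (x, y, z) points into a DEM by averaging the z of all
    points that fall in each square cell; empty cells become NaN."""
    col = ((x - x.min()) / cell).astype(int)
    row = ((y - y.min()) / cell).astype(int)
    dem_sum = np.zeros((row.max() + 1, col.max() + 1))
    count = np.zeros_like(dem_sum)
    np.add.at(dem_sum, (row, col), z)   # unbuffered accumulation per cell
    np.add.at(count, (row, col), 1)
    return np.where(count > 0, dem_sum / np.maximum(count, 1), np.nan)
```

Two points in one cell average together, while cells with no returns stay NaN so downstream products can mask them.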
Sort-based parallel loading of R-trees
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447489
Daniar Achakeev, M. Seidemann, Markus Schmidt, B. Seeger
Due to the increasing amount of spatial data, parallel algorithms for processing big spatial data are becoming more and more important. In particular, the shared-nothing architecture is attractive as it offers low-cost data processing. Moreover, popular MapReduce frameworks such as Hadoop allow developing conceptually simple and scalable algorithms for processing big data on this architecture. In this work, we address the problem of parallel loading of R-trees on a shared-nothing platform. The R-tree is a key element for efficient query processing in large spatial databases, but its creation is expensive. We propose a novel scalable parallel loading algorithm for MapReduce. The core of our parallel loading is the state-of-the-art sequential sort-based query-adaptive R-tree loading algorithm, which builds R-trees optimized according to a commonly used cost model. In contrast to previous methods for loading R-trees with MapReduce, we construct the R-tree level-wise. Our experimental results show an almost linear speedup in the number of machines.

Moreover, the resulting R-trees provide better query performance than R-trees built by other competitive bulk-loading algorithms.
Citations: 19
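The general family this loader belongs to, sort-based level-wise bulk loading, can be sketched in a few lines: sort the entries, pack consecutive runs into fixed-capacity nodes, compute each node's bounding rectangle, and repeat one level up until a single root remains. The toy below sorts by x-center only for brevity; the paper's algorithm is query-adaptive, cost-model-driven, and runs on MapReduce, none of which is reflected here, and all names are our own.

```python
def mbr_union(boxes):
    """Minimum bounding rectangle of a list of (x1, y1, x2, y2) boxes."""
    xs1, ys1, xs2, ys2 = zip(*boxes)
    return (min(xs1), min(ys1), max(xs2), max(ys2))

def pack_level(entries, capacity):
    """Group already-sorted entries into parent nodes of at most
    `capacity` children; each node is (mbr, children)."""
    nodes = []
    for i in range(0, len(entries), capacity):
        children = entries[i:i + capacity]
        nodes.append((mbr_union([e[0] for e in children]), children))
    return nodes

def bulk_load(rects, capacity=4):
    """Bottom-up, level-wise packing of leaf rectangles into an R-tree."""
    # Leaf entries are (mbr, payload); payloads are None in this sketch.
    level = sorted(((r, None) for r in rects),
                   key=lambda e: (e[0][0] + e[0][2]) / 2)  # sort by x-center
    while len(level) > 1:
        level = pack_level(level, capacity)
    return level[0]  # the root node
```

Packing ten unit squares with capacity 4 gives three leaf nodes and then one root whose rectangle covers everything, which is exactly the level-wise construction order the abstract describes.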
Extracting storm-centric characteristics from raw rainfall data for storm analysis and mining
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447492
Kulsawasd Jitkajornwanich, R. Elmasri, J. McEnery, Chengkai Li
Most rainfall data is stored in formats that are not easy to analyze and mine, and in these formats the amount of data is enormous. In this paper, we propose techniques to summarize raw rainfall data into a model that facilitates storm analysis and mining and reduces the data size. The result is the conversion of raw rainfall data into meaningful storm-centric data, which is then stored in a relational database for easy analysis and mining. The size of the storm data is less than 1% of the size of the raw data. We can determine the spatio-temporal characteristics of a storm, such as how big it is, how many sites are covered, and what its overall depth (precipitation) and duration are. We present formal definitions for the storm-related concepts needed in our data conversion, then describe storm identification algorithms based on these concepts. Our storm identification algorithms analyze precipitation values of adjacent sites within the period of time that covers the whole storm and combine them to identify the overall storm characteristics.
Citations: 12
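To make the summarization idea concrete, here is a minimal single-site illustration: segment an hourly precipitation series into storms, where consecutive wet hours are merged across dry gaps shorter than a threshold, and report start, end, and total depth per storm. This dry-gap rule and all names are our own assumptions; the paper's formal definitions and multi-site combination logic are more elaborate.

```python
def identify_storms(precip, min_gap=2):
    """Segment an hourly precipitation series into storms.
    A storm ends once `min_gap` consecutive dry hours are seen.
    Returns a list of (start_hour, end_hour, total_depth) tuples."""
    storms = []
    start, total, end, dry = None, 0.0, None, 0
    for t, p in enumerate(precip):
        if p > 0:
            if start is None:          # a new storm begins
                start, total = t, 0.0
            total += p
            end = t
            dry = 0
        elif start is not None:        # dry hour inside a candidate storm
            dry += 1
            if dry >= min_gap:         # gap too long: close the storm
                storms.append((start, end, total))
                start = None
    if start is not None:              # series ended mid-storm
        storms.append((start, end, total))
    return storms
```

A one-hour dry gap inside a storm is bridged, while a longer gap splits the record into two storms.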
Speeding up large-scale point-in-polygon test based spatial join on GPUs
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447485
Jianting Zhang, Simin You
The point-in-polygon (PIP) test is fundamental to spatial databases and GIS. Motivated by the slow response times of joining large-scale point locations with polygons using traditional spatial databases and GIS, we have designed and developed an end-to-end system, running completely on graphics processing units (GPUs), that associates points with the polygons they fall within by utilizing the massively data-parallel computing power of GPUs. The system includes an efficient module to generate point quadrants containing at most K points from large-scale unordered points; a simple grid-file-based spatial filtering approach to associate point quadrants with polygons; and a PIP test module to assign polygons to points in a GPU computing block using both block- and thread-level parallelism. Experiments on joining 170 million points with more than 40 thousand polygons resulted in a runtime of 11.165 seconds on an Nvidia Quadro 6000 GPU device. In contrast, a baseline serial CPU implementation using state-of-the-art open-source GIS packages required more than 15 hours to complete.

We further discuss several factors and parameters that may affect system performance.
Citations: 70
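The scalar primitive at the heart of this join is the classic even-odd ray-casting PIP test: cast a horizontal ray from the point and count how many polygon edges it crosses. The sketch below shows that kernel for a single point and polygon; the paper's contribution is the GPU filtering and batching around it, not this scalar test, and the function name is our own.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray-casting test. `poly` is a list of (x, y) vertices in
    order; a ray is cast in the +x direction and crossings are counted."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]       # wrap around to close the polygon
        if (y1 > y) != (y2 > y):         # edge spans the ray's y-coordinate
            # x-coordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:              # crossing lies to the point's right
                inside = not inside
    return inside
```

The half-open comparison `(y1 > y) != (y2 > y)` is the standard trick for handling vertices that fall exactly on the ray without double counting.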
Towards scalable ad-hoc climate anomalies search
International Workshop on Analytics for Big Geospatial Data Pub Date : 2012-11-06 DOI: 10.1145/2447481.2447493
P. Baumann, D. Misev
Meteorological data contribute significantly to "big data"; however, not only is their volume, ranging into petabyte sizes for single objects, a challenge, but so is the number of dimensions: such general 4-D spatio-temporal data cannot be handled through traditional GIS methods and tools. Climate data actually tend to transcend these dimensions, adding an extra time dimension for the simulation run time and ending up with 5-D data cubes.

Traditional databases, known for their flexibility and scalability, have proven inadequate due to their lack of support for multi-dimensional rasters. Consequently, file-based implementations, rather than databases, are being used to serve such data to the community. This has recently been overcome by array databases, which provide storage and query support for multi-dimensional rasters, thereby unleashing the scalability and flexibility advantages of databases for climate data management.

In this contribution, we present a case study in which non-trivial analytics functionality on n-D climate data cubes has been established. Storage optimization techniques novel to standard databases allow tuning the system for interactive response in many cases. We briefly introduce the rasdaman database system used, present the database schema and practically important query use cases, and report preliminary performance observations.

To the best of our knowledge, this is the first non-academic, real-life deployment of an array database for up to 5-D data sets.
Citations: 5
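A climate "anomaly" in this setting is commonly a deviation from a per-calendar-month climatology, i.e. each value minus the long-term mean of its calendar month. The NumPy sketch below illustrates that computation on a 4-D (year, month, lat, lon) cube under our own layout assumption; in rasdaman this would be expressed as an array query rather than Python.

```python
import numpy as np

def monthly_anomalies(cube):
    """cube: array of shape (year, month, lat, lon).
    Anomaly = value minus the climatological mean of that calendar month,
    taken over all years at the same grid location."""
    climatology = cube.mean(axis=0, keepdims=True)  # shape (1, month, lat, lon)
    return cube - climatology                       # broadcasts over years
```

With two years of two months each, the earlier year sits below the climatology and the later year above it by the same amount.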