32nd International Conference on Scientific and Statistical Database Management: Latest Publications

The Vantage Index: Executing Distance Queries at Scale

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400933

Giannis Evagorou, M. Lavalle, T. Heinis

Citations: 0

Combining Two Worlds: MonetDB with Multi-Dimensional Index Structure Support to Efficiently Query Scientific Data

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3401691

Paul Blockhaus, David Broneske, Martin Schäler, V. Köppen, G. Saake

Citations: 1

A Versatile Hypergraph Model for Document Collections

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400919

Andreas Spitz, Dennis Aumiller, Bálint Soproni, Michael Gertz

Abstract: Efficiently and effectively representing large collections of text is of central importance to information retrieval tasks such as summarization and search. Since models for these tasks frequently rely on an implicit graph structure of the documents or their contents, graph-based document representations are naturally appealing. For tasks that consider the joint occurrence of words or entities, however, existing document representations often fall short in capturing cooccurrences of higher order, higher multiplicity, or at varying proximity levels. Furthermore, while numerous applications benefit from structured knowledge sources, external data sources are rarely considered as integral parts of existing document models. To address these shortcomings, we introduce heterogeneous hypergraphs as a versatile model for representing annotated document collections. We integrate external metadata, document content, entity and term annotations, and document segmentation at different granularity levels in a joint model that bridges the gap between structured and unstructured data. We discuss selection and transformation operations on the set of hyperedges, which can be chained to support a wide range of query scenarios. To ensure compatibility with established information retrieval methods, we discuss projection operations that transform hyperedges to traditional dyadic cooccurrence graph representations. Using PostgreSQL and Neo4j, we investigate the suitability of existing database systems for implementing the hypergraph document model, and explore the impact of utilizing implicit and materialized hyperedge representations on storage space requirements and query performance.
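The projection of hyperedges onto a dyadic cooccurrence graph mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible projection (a clique expansion in which every pair of nodes sharing a hyperedge gains one unit of weight); the paper's actual operators are not reproduced here, and the function name `project_to_cooccurrence` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def project_to_cooccurrence(hyperedges):
    """Clique-expand hyperedges into weighted dyadic cooccurrence edges.

    Assumption: each hyperedge is an iterable of hashable, sortable node labels.
    """
    weights = Counter()
    for edge in hyperedges:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for u, v in combinations(sorted(set(edge)), 2):
            weights[(u, v)] += 1
    return weights


# Hypothetical hyperedges: the entities and terms annotated in three sentences.
sentences = [
    {"Heidelberg", "university", "Germany"},
    {"Heidelberg", "Germany"},
    {"university", "research"},
]
graph = project_to_cooccurrence(sentences)
```

The resulting `Counter` maps each unordered node pair to the number of hyperedges containing both nodes, which is the kind of weighted graph that conventional cooccurrence-based retrieval methods expect as input.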

Citations: 2

Efficient Calculation of Empirical P-values for Association Testing of Binary Classifications

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400923

Konstantinos Zagganas, Thanasis Vergoulis, Spiros Skiadopoulos, Theodore Dalamagas

Citations: 1

WALLeSMART: Cloud Platform for Smart Farming

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3401690

Amine Roukh, Fabrice Nolack Fote, S. Mahmoudi, S. Mahmoudi

Citations: 6

Node Classification and Link Prediction in Social Graphs using RLVECN

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400928

Bonaventure C. Molokwu, Shaon Bhatta Shuvo, N. Kar, Ziad Kobti

Citations: 8

Hurricane in Bipartite Graphs: The Lethal Nodes of Butterflies

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400916

Qiuyu Zhu, Jiahong Zheng, Han Yang, Chen Chen, Xiaoyang Wang, Ying Zhang

Citations: 12

Deluceva: Delta-Based Neural Network Inference for Fast Video Analytics

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400930

Jingjing Wang, M. Balazinska

Citations: 2

Determining the provenance of land parcel polygons via machine learning

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400924

Vassilis Kaffes, G. Giannopoulos, Nontas Tsakonas, Spiros Skiadopoulos

Abstract: An important task in land registration processes is to determine the prevalent data provenance for a finalized polygon that represents a cadastral parcel, since the finalized polygon is derived by the examination of a set of initial polygons, drawn from several individual registers (databases). These registers might contain different, partially similar or conflicting information regarding the ownership, usage and polygon geometry of a cadastral parcel. In such cases, the cadastration expert either selects one of the initial geometries or, when none of the initial geometries accurately represents the finalized land parcel, creates a new geometry. Maintaining this provenance information is of high importance for further cadastration and validation/quality assessment processes; however, due to the gradual and long-lasting nature of cadastration procedures, this information is absent from large parts of cadastral databases. In this paper, we present an approach for effectively classifying such land parcel polygons with respect to their provenance information. We propose a method that can produce highly accurate provenance recommendations based only on attributes derived from the geometry of a land parcel. In particular, we implement a set of spatial training features, capturing polygon properties and relations. These features are fed into several classification algorithms and are evaluated on a proprietary dataset of a cadastration company.
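As a rough illustration of what geometry-derived training features can look like, the sketch below computes three standard polygon properties: area (shoelace formula), perimeter, and an isoperimetric compactness score. These are assumed examples only; the abstract does not list the paper's actual feature set, and the function name `polygon_features` is hypothetical.

```python
import math


def polygon_features(vertices):
    """Return simple shape features for a polygon given as (x, y) vertices.

    Assumption: vertices describe a simple (non-self-intersecting) ring,
    listed in order, without repeating the first vertex at the end.
    """
    n = len(vertices)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1  # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    # Isoperimetric quotient: 1.0 for a circle, smaller for elongated shapes.
    compactness = 4.0 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "compactness": compactness}


# Hypothetical parcel: a 2 x 2 square.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
features = polygon_features(square)
```

Feature dictionaries of this kind, computed for the finalized polygon and for each candidate initial polygon, could then be assembled into vectors for a classifier.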

Citations: 1

Improving geocoding quality via learning to integrate multiple geocoders

32nd International Conference on Scientific and Statistical Database Management Pub Date : 2020-07-07 DOI: 10.1145/3400903.3400918

Konstantinos Alexis, Vassilis Kaffes, Ilias Varkas, A. Syngros, Nontas Tsakonas, G. Giannopoulos

Abstract: In this paper, we introduce an approach for improving the quality of the geocoding process. Geocoding refers to the procedure of mapping an address in textual form to a pair of accurate spatial coordinates. While there is a variety of available geocoders, both open source and commercial, that curate this mapping in either a semi-automated or fully automated way, there is no one-size-fits-all system. Depending on the underlying algorithm of each geocoder, its output may be very accurate for some addresses, districts or countries, while failing to properly locate others. Given that, our setup can be thought of as a meta-geocoding pipeline, built on top of the available geocoders. We propose a machine learning approach which, given an address and a sequence of coordinate pairs suggested by standalone geocoders, is able to identify the most accurate one. In order to achieve this, we formulate the task as a multi-class classification problem and introduce a series of domain-specific training features, capturing essential information about each coordinate pair suggestion, as well as computing comparative metrics among different suggestions. These features are fed into several classification algorithms and are evaluated on a proprietary address dataset of a geo-marketing company. Furthermore, we present LGM-GC, a QGIS plugin, which provides the functionality of our approach through a user-friendly interface.
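One way to picture a "comparative metric among different suggestions" is the mean great-circle distance from each geocoder's suggestion to all the others, so that an outlying suggestion receives a large value. The sketch below is an assumption made for illustration, not the feature set used in the paper; the geocoder names, coordinates, and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def mean_distance_to_others(suggestions):
    """Map each geocoder name to its mean distance from the other suggestions."""
    feats = {}
    for name, point in suggestions.items():
        others = [haversine_km(point, other)
                  for other_name, other in suggestions.items()
                  if other_name != name]
        feats[name] = sum(others) / len(others) if others else 0.0
    return feats


# Hypothetical suggestions for one address: two geocoders agree, one does not.
suggestions = {
    "geocoder_a": (37.9838, 23.7275),
    "geocoder_b": (37.9838, 23.7275),
    "geocoder_c": (40.6401, 22.9444),
}
feats = mean_distance_to_others(suggestions)
```

A multi-class classifier could take such values, one per geocoder, alongside per-suggestion features and predict which geocoder's output to trust for the given address.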

Citations: 1