STULL: Unbiased Online Sampling for Visual Exploration of Large Spatiotemporal Data

2020 IEEE Conference on Visual Analytics Science and Technology (VAST). Published: 2020-08-29. DOI: 10.1109/VAST50239.2020.00012

Guizhen Wang, Jingjing Guo, Mingjie Tang, J. Q. Neto, Calvin Yau, Anas Daghistani, M. Karimzadeh, Walid G. Aref, D. Ebert


Citations: 6

Abstract

Online sampling-supported visual analytics is increasingly important, as it allows users to explore large datasets with acceptable approximate answers at interactive rates. However, existing online spatiotemporal sampling techniques are often biased, as most researchers have focused primarily on reducing computational latency. Biased sampling approaches select data with unequal probabilities and produce results that do not match the exact data distribution, leading end users to incorrect interpretations. In this paper, we propose a novel approach to unbiased online sampling of large spatiotemporal data. The proposed approach gives the same probability of selection to every point that satisfies the specifications of a user’s multidimensional query. To achieve unbiased sampling for accurate, representative interactive visualizations, we design a novel data index and an associated sample retrieval plan. Our proposed sampling approach is suitable for a wide variety of visual analytics tasks, e.g., tasks that run aggregate queries over spatiotemporal data. Extensive experiments confirm the superiority of our approach over a state-of-the-art spatial online sampling technique. Within the same computational time, data samples generated by our approach are at least 50% more accurate in representing the actual spatial distribution of the data, and they enable approximate visualizations that look closer to the exact ones.
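The unbiasedness property stated in the abstract (every point that satisfies the query is selected with the same probability) can be illustrated with a minimal sketch. This is not the STULL index or its sample retrieval plan, which the abstract does not describe. It is plain reservoir sampling (Algorithm R) over a linear scan, and the names `Point`, `Query`, and `uniform_sample` are hypothetical, introduced only for this illustration.

```python
import random
from typing import Iterable, List, NamedTuple, Optional


class Point(NamedTuple):
    """A spatiotemporal point: two spatial coordinates and a timestamp."""
    x: float
    y: float
    t: float


class Query(NamedTuple):
    """A hypothetical multidimensional range query (inclusive bounds)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float

    def matches(self, p: Point) -> bool:
        return (self.x_min <= p.x <= self.x_max
                and self.y_min <= p.y <= self.y_max
                and self.t_min <= p.t <= self.t_max)


def uniform_sample(points: Iterable[Point], query: Query, k: int,
                   rng: Optional[random.Random] = None) -> List[Point]:
    """Return up to k qualifying points, each chosen with equal probability.

    Reservoir sampling (Algorithm R): after n qualifying points have been
    seen, each one is in the reservoir with probability k / n.
    """
    rng = rng or random.Random()
    reservoir: List[Point] = []
    seen = 0  # number of qualifying points seen so far
    for p in points:
        if not query.matches(p):
            continue
        seen += 1
        if len(reservoir) < k:
            reservoir.append(p)
        else:
            # Keep the new point with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = p
    return reservoir
```

A linear scan like this is unbiased but touches every point, so it would not reach interactive rates on large data. The abstract's contribution is an index and retrieval plan that keep the equal-probability guarantee without a full scan.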

