Practical aspects of $\ell_0$-sampling algorithms

Anais do Encontro de Teoria da Computação (ETC). Pub Date: 2018-07-26. DOI: 10.5753/ETC.2018.3145

Juan P. A. Lopes, Fabiano de S. Oliveira, V. Barbosa



Abstract


The $\ell_0$-sampling problem plays an important role in streaming graph algorithms. In this paper, we revisit a near-optimal $\ell_0$-sampling algorithm, proposing a variant that allows proving a tighter upper bound for the probability of failure. We compare experimental results of both variants, providing empirical evidence of their applicability in real-case scenarios.

The $\ell_0$-sampling problem consists in sampling a nonzero coordinate from a dynamic vector $a = (a_1, \ldots, a_n)$ with uniform probability. This vector is defined in a turnstile model, which consists of a stream of updates $S = \langle s_1, s_2, \ldots, s_t \rangle$ on $a$ (initially 0), where $s_i = (u_i, \Delta_i) \in \{1, \ldots, n\} \times \mathbb{R}$ for all $1 \le i \le t$, meaning an increment of $\Delta_i$ units to $a_{u_i}$. It is desirable that such a sample be produced in a single pass through the stream with sublinear space complexity. The challenge arises from the fact that, since $\Delta_i$ can be negative and hence some updates in the stream may cancel others, directly sampling the stream may lead to incorrect results.
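The cancellation issue can be made concrete with a small Python sketch. It is an illustration only, not the algorithm studied in the paper: the function names `apply_turnstile` and `naive_stream_sample` and the example stream are hypothetical. The sketch materializes the vector in linear space as a reference, then shows that reservoir-sampling an update directly from the stream can return a coordinate whose final value is zero.

```python
import random
from collections import defaultdict


def apply_turnstile(updates):
    """Reference only (linear space): apply each update (u, delta) to the
    vector and return its nonzero coordinates as a dict."""
    a = defaultdict(float)
    for u, delta in updates:
        a[u] += delta
    return {u: v for u, v in a.items() if v != 0}


def naive_stream_sample(updates, rng):
    """Reservoir-sample one update uniformly from the stream and report
    its coordinate, ignoring any later cancellations."""
    chosen = None
    for k, (u, _delta) in enumerate(updates, start=1):
        # Keep the k-th update with probability 1/k.
        if rng.randrange(k) == 0:
            chosen = u
    return chosen


# Hypothetical stream: coordinates 1 and 3 are incremented, then cancelled.
stream = [(1, 5.0), (2, 3.0), (1, -5.0), (3, 2.0), (3, -2.0)]
support = apply_turnstile(stream)  # only coordinate 2 stays nonzero

rng = random.Random(0)
samples = [naive_stream_sample(stream, rng) for _ in range(1000)]
# Count samples that landed on a coordinate whose final value is zero.
bad = sum(1 for u in samples if u not in support)
```

Here four of the five updates touch coordinates that end at zero, so the naive sampler frequently reports a coordinate outside the support. A correct sampler would always return coordinate 2 for this stream.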
