Practical aspects of $\ell_0$-sampling algorithms
Juan P. A. Lopes, Fabiano de S. Oliveira, V. Barbosa
Anais do Encontro de Teoria da Computação (ETC), 2018-07-26. DOI: 10.5753/ETC.2018.3145
The $\ell_0$-sampling problem plays an important role in streaming graph algorithms. In this paper, we revisit a near-optimal $\ell_0$-sampling algorithm, proposing a variant that allows proving a tighter upper bound for the probability of failure. We compare experimental results of both variants, providing empirical evidence of their applicability in real-case scenarios.

The $\ell_0$-sampling problem consists of sampling a nonzero coordinate from a dynamic vector $a = (a_1, \ldots, a_n)$ with uniform probability. This vector is defined in a turnstile model, which consists of a stream of updates $S = \langle s_1, s_2, \ldots, s_t \rangle$ on $a$ (initially $0$), where $s_i = (u_i, \Delta_i) \in \{1, \ldots, n\} \times \mathbb{R}$ for all $1 \le i \le t$, meaning an increment of $\Delta_i$ units to $a_{u_i}$. It is desirable that such a sample be produced in a single pass through the stream with sublinear space complexity. The challenge arises from the fact that, since $\Delta_i$ can be negative and hence some updates in the stream may cancel others, directly sampling the stream may lead to incorrect results.
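To make the cancellation issue concrete, the following is a minimal sketch of the turnstile model, not the sampling algorithm studied in the paper. It materializes the whole vector, so it uses linear space and serves only as a reference baseline. The function names (`apply_turnstile`, `support`, `exact_l0_sample`, `naive_stream_sample`) and the example stream are assumptions introduced here for illustration.

```python
import random
from collections import defaultdict


def apply_turnstile(stream):
    """Build the vector a from updates (u, delta). Linear space: illustration only."""
    a = defaultdict(float)
    for u, delta in stream:
        a[u] += delta  # an increment of delta units to coordinate u
    return a


def support(a):
    """Indices of the nonzero coordinates of a, in sorted order."""
    return sorted(u for u, value in a.items() if value != 0)


def exact_l0_sample(stream, rng):
    """Reference sampler: uniform over the nonzero coordinates, or None if a = 0."""
    nonzero = support(apply_turnstile(stream))
    return rng.choice(nonzero) if nonzero else None


def naive_stream_sample(stream, rng):
    """Incorrect approach: pick the index of a uniformly chosen update."""
    u, _ = rng.choice(stream)
    return u


# Hypothetical stream: the two updates to coordinate 1 cancel each other.
stream = [(1, 5.0), (2, 3.0), (1, -5.0), (3, 1.0)]

if __name__ == "__main__":
    rng = random.Random(0)
    print("support:", support(apply_turnstile(stream)))
    print("exact sample:", exact_l0_sample(stream, rng))
    print("naive sample:", naive_stream_sample(stream, rng))
```

In this stream, coordinate 1 appears in half of the updates yet ends at zero, so the naive sampler can return an index outside the support. The exact sampler avoids that only by storing every touched coordinate, which is the cost a sublinear-space algorithm has to avoid.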