Technical Perspective: Online Model Management via Temporally Biased Sampling

SIGMOD Rec. | Pub Date: 2019-11-05 | DOI: 10.1145/3371316.3371332

K. Yi


Citations: 1

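Abstract

Random sampling from data streams is a problem with a long history of study, starting from the famous reservoir sampling algorithm, which is at least 50 years old [2]. The reservoir sampling algorithm maintains a random sample over all data items that have ever been received from the stream. This is not suitable for many of today's applications on evolving data streams, where recent data is more important than older data.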
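For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of classic uniform reservoir sampling (often called Algorithm R). It is not the temporally biased scheme discussed in the paper; the function name `reservoir_sample` and its parameters are illustrative assumptions, not taken from the source. It shows why the sample covers the whole stream: every item seen so far has the same chance of being in the reservoir, regardless of how old it is.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return a uniform random sample of up to k items from a stream.

    Hypothetical helper for illustration only (classic Algorithm R).
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the i-th item with probability k / i by drawing a
            # uniform slot in [0, i) and replacing only if it falls in [0, k).
            j = rng.randrange(i)
            if j < k:
                reservoir[j] = item
    return reservoir
```

Because the inclusion probability is k / n for every one of the n items seen so far, old and new items are treated alike, which is the limitation the abstract points out for evolving streams.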


