Technical Perspective: Online Model Management via Temporally Biased Sampling
K. Yi. SIGMOD Rec., p. 68. Published 2019-11-05. DOI: 10.1145/3371316.3371332 (https://doi.org/10.1145/3371316.3371332)
Random sampling from data streams is a problem with a long history of study, starting with the famous reservoir sampling algorithm, which is at least 50 years old [2]. The reservoir sampling algorithm maintains a uniform random sample over all data items that have ever been received from the stream. This makes it unsuitable for many of today's applications on evolving data streams, where recent data is more important than older data.
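For readers unfamiliar with the classic algorithm, the following is a minimal sketch of textbook reservoir sampling (often called Algorithm R), not code taken from the article. The function name `reservoir_sample` and its parameters `stream`, `k`, and `rng` are illustrative choices made here.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Hypothetical helper for illustration: classic reservoir sampling,
    which needs one pass over the stream and O(k) memory.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # The first k items fill the reservoir directly.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k/n by drawing a
            # uniform slot in [0, n) and replacing only if it is < k.
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1_000_000), 5))
```

After n items have arrived, every item seen so far is in the sample with the same probability k/n, regardless of how old it is. That uniformity is exactly the property that works against applications in which recent data should count for more.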