Threshold sampling

IF 0.9 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, THEORY & METHODS

Theoretical Computer Science, Volume 1019, Article 114847 | Pub Date: 2024-09-06 | DOI: 10.1016/j.tcs.2024.114847

Stefan Rass, Max-Julian Jakobitsch, Stefan Haan, Moritz Hiebler


Citations: 0

Abstract


Threshold sampling

We consider the problem of sampling elements with some desired property from a large set, without testing the property of interest, but with the (probabilistic) assurance of having at least one match in the random sample. As in ranked set sampling (RSS), we consider an infinite population under study, whose properties of interest are too expensive and/or time-consuming to measure. Unlike RSS, we have no ranking mechanism, so our sampling is done entirely blind. We show that it is nonetheless possible to ensure, with controllably high probability, either that a random sample contains at least one of the interesting elements or, conversely, that it contains none of them. Our technique uses density bounds for distributions and threshold functions from random graph theory.
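To make the two guarantees in the abstract concrete, here is a minimal sketch of the elementary special case. It is not the paper's technique, which relies on density bounds and threshold functions. It assumes that each blind draw from the infinite population independently has the property with a known probability `p`; the function names and the parameter `delta` (the tolerated failure probability) are hypothetical and introduced only for illustration. Under that assumption a sample of size n contains at least one match with probability 1 - (1 - p)^n and no match with probability (1 - p)^n.

```python
import math


def _check(p: float, delta: float) -> None:
    # Both quantities are probabilities strictly inside (0, 1).
    if not (0.0 < p < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("p and delta must lie strictly between 0 and 1")


def hit_probability(p: float, n: int) -> float:
    """Probability that n independent draws contain at least one match."""
    return 1.0 - (1.0 - p) ** n


def min_size_for_hit(p: float, delta: float) -> int:
    """Smallest n with P(at least one match) >= 1 - delta.

    Solves (1 - p)**n <= delta for n (assumed i.i.d. draws, known p).
    """
    _check(p, delta)
    return max(1, math.ceil(math.log(delta) / math.log1p(-p)))


def max_size_for_miss(p: float, delta: float) -> int:
    """Largest n with P(no match at all) >= 1 - delta.

    Solves (1 - p)**n >= 1 - delta for n; may be 0 if even one draw is too risky.
    """
    _check(p, delta)
    return math.floor(math.log1p(-delta) / math.log1p(-p))


if __name__ == "__main__":
    p, delta = 0.01, 0.05  # illustrative values only, not taken from the paper
    n_hit = min_size_for_hit(p, delta)
    n_miss = max_size_for_miss(p, delta)
    print(n_hit, hit_probability(p, n_hit))
    print(n_miss, 1.0 - hit_probability(p, n_miss))
```

The gap between the two sample sizes is the point of the illustration: small samples almost surely avoid the interesting elements, large samples almost surely contain one, and sizes in between give neither assurance.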


Source journal

Theoretical Computer Science (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore

2.60

Self-citation rate

18.20%

Articles published

471

Review time

12.6 months

About the journal: Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.