How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions
Tal Herman, Guy Rothblum
arXiv - CS - Cryptography and Security, 2024-09-10
https://doi.org/arxiv-2409.06594
As statistical analyses become more central to science, industry and society,
there is a growing need to ensure correctness of their results. Approximate
correctness can be verified by replicating the entire analysis, but can we
verify without replication? Building on a recent line of work, we study
proof-systems that allow a probabilistic verifier to ascertain that the results
of an analysis are approximately correct, while drawing fewer samples and using
fewer computational resources than would be needed to replicate the analysis. We
focus on distribution testing problems: verifying that an unknown distribution
is close to having a claimed property.

Our main contribution is an interactive protocol between a verifier and an
untrusted prover, which can be used to verify any distribution property that
can be decided in polynomial time given a full and explicit description of the
distribution. If the distribution is at statistical distance $\varepsilon$ from
having the property, then the verifier rejects with high probability. This
soundness property holds against any polynomial-time strategy that a cheating
prover might follow, assuming the existence of collision-resistant hash
functions (a standard assumption in cryptography). For distributions over a
domain of size $N$, the protocol consists of $4$ messages and the communication
complexity and verifier runtime are roughly $\widetilde{O}\left(\sqrt{N} /
\varepsilon^2 \right)$. The verifier's sample complexity is
$\widetilde{O}\left(\sqrt{N} / \varepsilon^2 \right)$, and this is optimal up
to $\mathrm{polylog}(N)$ factors (for any protocol, regardless of its communication
complexity). Even for simple properties, approximately deciding whether an
unknown distribution has the property can require quasi-linear sample
complexity and running time. For any such property, our protocol provides a
quadratic speedup over replicating the analysis.
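
To make the quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' protocol. It computes the statistical (total variation) distance between two explicitly given distributions, then compares the scale $\sqrt{N} / \varepsilon^2$ with a linear-in-$N$ replication cost. The function names are hypothetical. Dropping all constants and $\mathrm{polylog}(N)$ factors, and modelling "quasi-linear" as exactly $N$, are simplifying assumptions made only for illustration.

```python
import math


def statistical_distance(p, q):
    """Total variation distance between two distributions.

    Each distribution is a dict mapping outcomes to probabilities;
    outcomes missing from a dict are treated as probability 0.
    """
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def verifier_sample_scale(n, eps):
    """sqrt(N) / eps^2, with constants and polylog(N) factors dropped
    (an illustrative simplification, not a bound from the paper)."""
    return math.sqrt(n) / eps ** 2


def replication_sample_scale(n):
    """Assumed stand-in for a quasi-linear replication cost: exactly N."""
    return float(n)


# A uniform distribution over 4 outcomes versus a skewed one on 3 outcomes.
uniform = {x: 0.25 for x in range(4)}
skewed = {0: 0.5, 1: 0.25, 2: 0.25}
print("statistical distance:", statistical_distance(uniform, skewed))

# Compare the two scales for growing domain sizes at eps = 0.1.
eps = 0.1
for n in (10**6, 10**8, 10**10):
    v = verifier_sample_scale(n, eps)
    r = replication_sample_scale(n)
    print(f"N={n:.0e}  verifier~{v:.3g}  replication~{r:.3g}  ratio~{r / v:.3g}")
```

Under these simplifications, $\sqrt{N} / \varepsilon^2 < N$ holds exactly when $N > 1 / \varepsilon^4$, so the gap only opens up once the domain is large relative to the accuracy parameter.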