Inference with approximate local false discovery rates.
Rajesh Karmakar, Ruth Heller, Saharon Rosset
Biometrics, vol. 81 (2). Published 2025-04-02. DOI: https://doi.org/10.1093/biomtc/ujaf035
Efron's 2-group model is widely used in large-scale multiple testing. This model assumes that test statistics are drawn independently from a mixture of a null and a non-null distribution. The marginal local false discovery rate (locFDR) is the probability that the hypothesis is null given its test statistic. The procedure that rejects null hypotheses with marginal locFDRs below a fixed threshold maximizes power (the expected number of non-nulls rejected) while controlling the marginal false discovery rate in this model. However, in realistic settings the test statistics are dependent, and taking the dependence into account can boost power. Unfortunately, the resulting calculations are typically exponential in the number of hypotheses, which is impractical. Instead, we propose using $\mathrm{locFDR}_N$, which is the probability that the hypothesis is null given the test statistics in its $N$-neighborhood. We prove that rejecting for small $\mathrm{locFDR}_N$ is optimal in the restricted class where the decision for each hypothesis is only guided by its $N$-neighborhood, and that power increases with $N$. The computational complexity of computing the $\mathrm{locFDR}_N$ values increases with $N$, so the analyst should choose the largest $N$-neighborhood that is still computationally feasible. We show through extensive simulations that our proposed procedure can be substantially more powerful than alternative practical approaches, even with small $N$-neighborhoods. We demonstrate the utility of our method in a genome-wide association study of height.
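To make the quantities in the abstract concrete, here is a minimal sketch under assumptions that are not taken from the paper: the null distribution is N(0, 1), the non-null distribution is N(alt_mean, 1), and the null proportion pi0 is known. The function names (`marginal_locfdr`, `reject_below`, `pair_locfdr`) and all default values are hypothetical. The last function conditions on a single correlated neighbor, as a toy stand-in for a small neighborhood; it is not the authors' definition of the $N$-neighborhood or their algorithm.

```python
import math
from itertools import product


def normal_pdf(z, mean=0.0):
    """Density of N(mean, 1) at z."""
    u = z - mean
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def marginal_locfdr(z, pi0=0.9, alt_mean=3.0):
    """P(null | z) in a two-group mixture (assumed Gaussian components)."""
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean)
    return null_part / (null_part + alt_part)


def reject_below(z_values, threshold=0.2, pi0=0.9, alt_mean=3.0):
    """Reject each hypothesis whose marginal locFDR is below the threshold."""
    return [marginal_locfdr(z, pi0, alt_mean) < threshold for z in z_values]


def bivariate_pdf(x, y, rho):
    """Density of a standard bivariate normal with correlation rho at (x, y)."""
    one_minus = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / one_minus
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(one_minus))


def pair_locfdr(z1, z2, rho, pi0=0.9, alt_mean=3.0):
    """P(hypothesis 1 is null | z1, z2) for two correlated statistics.

    Assumed toy model: hypothesis states are independent, and given the
    states the pair is bivariate normal with unit variances, correlation
    rho, and mean alt_mean for each non-null coordinate.
    """
    numerator = 0.0
    denominator = 0.0
    # Enumerate the 4 joint null (0) / non-null (1) configurations.
    for h1, h2 in product((0, 1), repeat=2):
        prior = (pi0 if h1 == 0 else 1.0 - pi0) * (pi0 if h2 == 0 else 1.0 - pi0)
        weight = prior * bivariate_pdf(z1 - alt_mean * h1, z2 - alt_mean * h2, rho)
        denominator += weight
        if h1 == 0:
            numerator += weight
    return numerator / denominator
```

With `rho=0.0` the pair version reduces to the marginal locFDR, which is a useful sanity check. The enumeration over joint states is also why the cost grows quickly with neighborhood size: in this toy model, conditioning on k statistics would require summing over 2^k configurations.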
About the journal:
The International Biometric Society is an international society promoting the development and application of statistical and mathematical theory and methods in the biosciences, including agriculture, biomedical science and public health, ecology, environmental sciences, forestry, and allied disciplines. The Society welcomes as members statisticians, mathematicians, biological scientists, and others devoted to interdisciplinary efforts in advancing the collection and interpretation of information in the biosciences. The Society sponsors the biennial International Biometric Conference, held at sites throughout the world; through its National Groups and Regions, it also sponsors regional and local meetings.