Incorporating the sampling variation of the disease prevalence when calculating the sample size in a study to determine the diagnostic accuracy of a test
Citations: 10
Abstract
During the design stage of a study to assess the population sensitivity (PS) (or specificity) of a diagnostic test, the number of subjects (N) who will be administered both a gold standard test and a new test needs to be calculated. A common approach is first to calculate the number of cases (n) with a specific disease or condition as diagnosed by the gold standard test, and then to determine N from the prevalence or incidence rate of the disease (PP) in the population, as N = n/PP. Due to sampling variation, given the sample size N, the number of cases identified by the gold standard test could be less than N×PP. In that case, the study would be underpowered and may fail to produce an unbiased and precise estimate. In this study, we investigated this possibility for the situation where the required sample size is calculated using the confidence interval approach. When the sampling variation is taken into account, the variance of the sample sensitivity is slightly inflated, but the width of its confidence interval becomes much more widely dispersed. To reach the originally designed precision, an adjustment to the sample size N is needed; such an adjustment is suggested in this paper.
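The shortfall the abstract describes can be illustrated with a short calculation. Under the naive design, the observed number of gold-standard-positive cases is a Binomial(N, PP) random variable with mean N×PP, so it falls below the target n a substantial fraction of the time. The sketch below uses hypothetical design values (n = 100 required cases, PP = 0.20), which are not taken from the paper, and computes the exact probability of observing fewer than n cases:

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical design values (chosen for illustration, not from the paper):
# the study needs n = 100 confirmed cases, and the disease prevalence is 0.20.
n_cases = 100
prev = 0.20

# Naive total sample size: N = n / PP
N = int(n_cases / prev)  # 500 subjects

# Probability that the observed case count falls short of n. Because the
# count is Binomial(N, prev) rather than fixed at N * prev, this is not
# negligible -- it is close to one half, since n sits at the mean.
p_short = binom_cdf(n_cases - 1, N, prev)
```

With these numbers the study recruits too few cases nearly half the time, which is the motivation for inflating N beyond n/PP as the paper proposes.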