Nadine Stephenson, Lars Beckmann, Jenny Chang-Claude
{"title":"Carcinogen metabolism, cigarette smoking, and breast cancer risk: a Bayes model averaging approach.","authors":"Nadine Stephenson, Lars Beckmann, Jenny Chang-Claude","doi":"10.1186/1742-5573-7-10","DOIUrl":"https://doi.org/10.1186/1742-5573-7-10","url":null,"abstract":"<p><strong>Background: </strong>Standard logistic regression with or without stepwise selection has the disadvantage of not incorporating model uncertainty and the dependency of estimates on the underlying model into the final inference. We explore the use of a Bayes Model Averaging approach as an alternative to analyze the influence of genetic variants, environmental effects and their interactions on disease.</p><p><strong>Methods: </strong>Logistic regression with and without stepwise selection and Bayes Model Averaging were applied to a population-based case-control study exploring the association of genetic variants in tobacco smoke-related carcinogen pathways with breast cancer.</p><p><strong>Results: </strong>Both regression and Bayes Model Averaging highlighted a significant effect of NAT1*10 on breast cancer, while regression analysis also suggested a significant effect for packyears and for the interaction of packyears and NAT2.</p><p><strong>Conclusions: </strong>Bayes Model Averaging allows incorporation of model uncertainty, helps reduce dimensionality and avoids the problem of multiple comparisons. It can be used to incorporate biological information, such as pathway data, into the analysis. 
As with all Bayesian analysis methods, careful consideration must be given to prior specification.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"10"},"PeriodicalIF":0.0,"publicationDate":"2010-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-10","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29471852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elizabeth L Turner, Joanna E Dobson, Stuart J Pocock
{"title":"Categorisation of continuous risk factors in epidemiological publications: a survey of current practice.","authors":"Elizabeth L Turner, Joanna E Dobson, Stuart J Pocock","doi":"10.1186/1742-5573-7-9","DOIUrl":"https://doi.org/10.1186/1742-5573-7-9","url":null,"abstract":"<p><strong>Background: </strong>Reports of observational epidemiological studies often categorise (group) continuous risk factor (exposure) variables. However, there has been little systematic assessment of how categorisation is practiced or reported in the literature and no extended guidelines for the practice have been identified. Thus, we assessed the nature of such practice in the epidemiological literature. Two months (December 2007 and January 2008) of five epidemiological and five general medical journals were reviewed. All articles that examined the relationship between continuous risk factors and health outcomes were surveyed using a standard proforma, with the focus on the primary risk factor. Using the survey results we provide illustrative examples and, combined with ideas from the broader literature and from experience, we offer guidelines for good practice.</p><p><strong>Results: </strong>Of the 254 articles reviewed, 58 were included in our survey. Categorisation occurred in 50 (86%) of them. Of those, 42% also analysed the variable continuously and 24% considered alternative groupings. Most (78%) used 3 to 5 groups. No articles relied solely on dichotomisation, although it did feature prominently in 3 articles. The choice of group boundaries varied: 34% used quantiles, 18% equally spaced categories, 12% external criteria, 34% other approaches and 2% did not describe the approach used. Categorical risk estimates were most commonly (66%) presented as pairwise comparisons to a reference group, usually the highest or lowest (79%). 
Reporting of categorical analysis was mostly in tables; only 20% in figures.</p><p><strong>Conclusions: </strong>Categorical analyses of continuous risk factors are common. Accordingly, we provide recommendations for good practice. Key issues include pre-defining appropriate choice of groupings and analysis strategies, clear presentation of grouped findings in tables and figures, and drawing valid conclusions from categorical analyses, avoiding injudicious use of multiple alternative analyses.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"9"},"PeriodicalIF":0.0,"publicationDate":"2010-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29353851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Population attributable fraction: comparison of two mathematical procedures to estimate the annual attributable number of deaths.","authors":"Bernard Ck Choi","doi":"10.1186/1742-5573-7-8","DOIUrl":"https://doi.org/10.1186/1742-5573-7-8","url":null,"abstract":"<p><strong>Objective: </strong>The purpose of this paper was to compare two mathematical procedures to estimate the annual attributable number of deaths (the Allison et al procedure and the Mokdad et al procedure), and derive a new procedure that combines the best aspects of both procedures. The new procedure calculates attributable number of deaths along a continuum (i.e. for each unit of exposure), and allows for one or more neutral (neither exposed nor nonexposed) exposure categories.</p><p><strong>Methods: </strong>Mathematical derivations and real datasets were used to demonstrate the theoretical relationship and practical differences between the two procedures. Results of the comparison were used to develop a new procedure that combines the best features of both.</p><p><strong>Findings: </strong>The Allison procedure is complex because it directly estimates the number of attributable deaths. This necessitates calculation of probabilities of death. The Mokdad procedure is simpler because it estimates the number of attributable deaths indirectly through population attributable fractions. The probabilities of death cancel out in the numerator and denominator of the fractions. 
However, the Mokdad procedure is not applicable when a neutral exposure category exists.</p><p><strong>Conclusion: </strong>By combining the innovation of the Allison procedure (allowing for a neutral category) and the simplicity of the Mokdad procedure (using population attributable fractions), this paper proposes a new procedure to calculate attributable numbers of death.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"8"},"PeriodicalIF":0.0,"publicationDate":"2010-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29278019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author's response to Poole, C. Commentary: How Many Are Affected? A Real Limit of Epidemiology.","authors":"Nicolle M Gatto, Ulka B Campbell, Sharon Schwartz","doi":"10.1186/1742-5573-7-7","DOIUrl":"https://doi.org/10.1186/1742-5573-7-7","url":null,"abstract":"","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"7"},"PeriodicalIF":0.0,"publicationDate":"2010-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2939526/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29271529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How many are affected? A real limit of epidemiology.","authors":"Charles Poole","doi":"10.1186/1742-5573-7-6","DOIUrl":"https://doi.org/10.1186/1742-5573-7-6","url":null,"abstract":"<p><p> A person can experience an effect on the occurrence of an outcome in a defined follow-up period without experiencing an effect on the risk of that outcome over the same period. Sufficient causes are sometimes used to deepen potential-outcome explanations of this phenomenon. In doing so, care should be taken to avoid tipping the balance between simplification and realism too far toward simplification. Death and other competing risks should not be assumed away. The time scale should be explicit, with specific times for the occurrence of specified component causes and for the completion of each sufficient cause. Component causes that affect risk should occur no later than the start of the risk period. Sufficient causes should be allowed to have component causes in common. When individuals experience all components of two or more sufficient causes, the outcome must be recurrent. In addition to effects on rates and risks, effects on incidence time itself should be considered.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"6"},"PeriodicalIF":0.0,"publicationDate":"2010-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29212813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Redundant causation from a sufficient cause perspective.","authors":"Nicolle M Gatto, Ulka B Campbell","doi":"10.1186/1742-5573-7-5","DOIUrl":"https://doi.org/10.1186/1742-5573-7-5","url":null,"abstract":"<p><p> Sufficient causes of disease are redundant when an individual acquires the components of two or more sufficient causes. In this circumstance, the individual still would have become diseased even if one of the sufficient causes had not been acquired. In the context of a study, when any individuals acquire components of more than one sufficient cause over the observation period, the etiologic effect of the exposure (defined as the absolute or relative difference between the proportion of the exposed who develop the disease by the end of the study period and the proportion of those individuals who would have developed the disease at the moment they did even in the absence of the exposure) may be underestimated. Even in the absence of confounding and bias, the observed effect estimate represents only a subset of the etiologic effect. This underestimation occurs regardless of the measure of effect used. To some extent, redundancy of sufficient causes is always present, and under some circumstances, it may make a true cause of disease appear to be not causal. This problem is particularly relevant when the researcher's goal is to characterize the universe of sufficient causes of the disease, identify risk factors for targeted interventions, or construct causal diagrams. 
In this paper, we use the sufficient component cause model and the disease response type framework to show how redundant causation arises and the factors that determine the extent of its impact on epidemiologic effect measures.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"5"},"PeriodicalIF":0.0,"publicationDate":"2010-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29160710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fitting additive Poisson models.","authors":"Hendriek C Boshuizen, Edith Jm Feskens","doi":"10.1186/1742-5573-7-4","DOIUrl":"https://doi.org/10.1186/1742-5573-7-4","url":null,"abstract":"<p><p> This paper describes how to fit an additive Poisson model using standard software. The approach is illustrated with SAS code, but can similarly be applied in other software packages.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"4"},"PeriodicalIF":0.0,"publicationDate":"2010-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-4","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29138477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sylvia Ek Sudat, Elizabeth J Carlton, Edmund Yw Seto, Robert C Spear, Alan E Hubbard
{"title":"Using variable importance measures from causal inference to rank risk factors of schistosomiasis infection in a rural setting in China.","authors":"Sylvia Ek Sudat, Elizabeth J Carlton, Edmund Yw Seto, Robert C Spear, Alan E Hubbard","doi":"10.1186/1742-5573-7-3","DOIUrl":"10.1186/1742-5573-7-3","url":null,"abstract":"<p><strong>Background: </strong>Schistosomiasis infection, contracted through contact with contaminated water, is a global public health concern. In this paper we analyze data from a retrospective study reporting water contact and schistosomiasis infection status among 1011 individuals in rural China. We present semi-parametric methods for identifying risk factors through a comparison of three analysis approaches: a prediction-focused machine learning algorithm, a simple main-effects multivariable regression, and a semi-parametric variable importance (VI) estimate inspired by a causal population intervention parameter.</p><p><strong>Results: </strong>The multivariable regression found only tool washing to be associated with the outcome, with a relative risk of 1.03 and a 95% confidence interval (CI) of 1.01-1.05. Three types of water contact were found to be associated with the outcome in the semi-parametric VI analysis: July water contact (VI estimate 0.16, 95% CI 0.11-0.22), water contact from tool washing (VI estimate 0.88, 95% CI 0.80-0.97), and water contact from rice planting (VI estimate 0.71, 95% CI 0.53-0.96). The July VI result, in particular, indicated a strong association with infection status - its causal interpretation implies that eliminating water contact in July would reduce the prevalence of schistosomiasis in our study population by 84%, or from 0.3 to 0.05 (95% CI 78%-89%).</p><p><strong>Conclusions: </strong>The July VI estimate suggests possible within-season variability in schistosomiasis infection risk, an association not detected by the regression analysis. 
Though there are many limitations to this study that temper the potential for causal interpretations, if a high-risk time period could be detected in something close to real time, new prevention options would be opened. Most importantly, we emphasize that traditional regression approaches are usually based on arbitrary pre-specified models, making their parameters difficult to interpret in the context of real-world applications. Our results support the practical application of analysis approaches that, in contrast, do not require arbitrary model pre-specification, estimate parameters that have simple public health interpretations, and apply inference that considers model selection as a source of variation.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"3"},"PeriodicalIF":0.0,"publicationDate":"2010-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2913928/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"29119947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laurence S Freedman, Victor Kipnis, Arthur Schatzkin, Natasa Tasevska, Nancy Potischman
{"title":"Can we use biomarkers in combination with self-reports to strengthen the analysis of nutritional epidemiologic studies?","authors":"Laurence S Freedman, Victor Kipnis, Arthur Schatzkin, Natasa Tasevska, Nancy Potischman","doi":"10.1186/1742-5573-7-2","DOIUrl":"https://doi.org/10.1186/1742-5573-7-2","url":null,"abstract":"<p><p>Identifying diet-disease relationships in nutritional cohort studies is plagued by the measurement error in self-reported intakes. The authors propose using biomarkers known to be correlated with dietary intake, so as to strengthen analyses of diet-disease hypotheses. The authors consider combining self-reported intakes and biomarker levels using principal components, Howe's method, or a joint statistical test of effects in a bivariate model. They compared the statistical power of these methods with that of conventional univariate analyses of self-reported intake or of biomarker level. They used computer simulation of different disease risk models, with input parameters based on data from the literature on the relationship between lutein intake and age-related macular degeneration. The results showed that if the dietary effect on disease was fully mediated through the biomarker level, then the univariate analysis of the biomarker was the most powerful approach. However, combination methods, particularly principal components and Howe's method, were not greatly inferior in this situation, and were as good as, or better than, univariate biomarker analysis if mediation was only partial or non-existent. In some circumstances sample size requirements were reduced to 20-50% of those required for conventional analyses of self-reported intake. 
The authors conclude that (i) including biomarker data in addition to the usual dietary data in a cohort could greatly strengthen the investigation of diet-disease relationships, and (ii) when the extent of mediation through the biomarker is unknown, use of principal components or Howe's method appears a good strategy.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 1","pages":"2"},"PeriodicalIF":0.0,"publicationDate":"2010-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28734975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Grace L Smith, Ya-Chen T Shih, Sharon H Giordano, Benjamin D Smith, Thomas A Buchholz
{"title":"A method to predict breast cancer stage using Medicare claims.","authors":"Grace L Smith, Ya-Chen T Shih, Sharon H Giordano, Benjamin D Smith, Thomas A Buchholz","doi":"10.1186/1742-5573-7-1","DOIUrl":"https://doi.org/10.1186/1742-5573-7-1","url":null,"abstract":"<p><strong>Background: </strong>In epidemiologic studies, cancer stage is an important predictor of outcomes. However, cancer stage is typically unavailable in medical insurance claims datasets, thus limiting the usefulness of such data for epidemiologic studies. Therefore, we sought to develop an algorithm to predict cancer stage based on covariates available from claims-based data.</p><p><strong>Methods: </strong>We identified a cohort of 77,306 women age ≥ 66 years with stage I-IV breast cancer, using the Surveillance, Epidemiology, and End Results (SEER)-Medicare database. We formulated an algorithm to predict cancer stage using covariates (demographic, tumor, and treatment characteristics) obtained from claims. Logistic regression models derived prediction equations in a training set, and equations' test characteristics (sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were calculated in a validation set.</p><p><strong>Results: </strong>Of the entire sample of women diagnosed with invasive breast cancer, 51% had stage I; 26% stage II; 11% stage III; and 4% stage IV disease. The equation predicting stage IV disease achieved sensitivity of 81%, specificity 89%, positive predictive value (PPV) 24%, and negative predictive value (NPV) 99%, while the equation distinguishing stage I/II from stage III disease achieved sensitivity 83%, specificity 78%, PPV 98%, and NPV 31%. 
Combined, the equations most accurately identified early stage disease and ascertained a sample in which 98% of patients were stage I or II.</p><p><strong>Conclusions: </strong>A claims-based algorithm was utilized to predict breast cancer stage, and was particularly successful when used to identify early stage disease. These prediction equations may be applied in future studies of breast cancer patients, substantially improving the utility of claims-based studies in this group. This method may similarly be employed to develop algorithms permitting claims-based epidemiologic studies of patients with other cancers.</p>","PeriodicalId":87082,"journal":{"name":"Epidemiologic perspectives & innovations : EP+I","volume":"7 ","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2010-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1742-5573-7-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28705499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}