Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/oso/9780198841326.003.0009
Bendix Carstensen
{"title":"Survival analysis","authors":"Bendix Carstensen","doi":"10.1093/oso/9780198841326.003.0009","DOIUrl":"https://doi.org/10.1093/oso/9780198841326.003.0009","url":null,"abstract":"This chapter describes survival analysis. Survival analysis concerns data where the outcome is a length of time, namely the time from inclusion in the study (such as diagnosis of some disease) till death or some other event — hence the term 'time to event analysis', which is also used. There are two primary targets normally addressed in survival analysis: survival probabilities and event rates. The chapter then looks at the life table estimator of survival function and the Kaplan–Meier estimator of survival. It also considers the Cox model and its relationship with Poisson models, as well as the Fine–Gray approach to competing risks.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116730819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
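The Kaplan–Meier estimator named in the abstract above can be illustrated with a minimal sketch. This is not code from the chapter (the book works in R); the function name `kaplan_meier` and the toy data are hypothetical, and the usual convention is assumed that a person censored at time t still counts as at risk at t.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit estimate of survival at each distinct event time.

    times  : follow-up time for each person
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, estimated survival probability) pairs.
    """
    # Number of observed events at each distinct event time
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    surv = 1.0
    curve = []
    for t in sorted(deaths):
        # Persons still under follow-up just before time t
        at_risk = sum(1 for u in times if u >= t)
        # Multiply by the conditional probability of surviving past t
        surv *= 1.0 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy data: six persons, two of them censored
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
km = kaplan_meier(times, events)
for t, s in km:
    print(f"t={t}: S(t)={s:.3f}")
```

Each step multiplies the running survival probability by the fraction of those at risk who survive the event time, which is why censored persons contribute to the denominators only up to their censoring time.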
Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/oso/9780198841326.003.0005
Bendix Carstensen
{"title":"Regression models","authors":"Bendix Carstensen","doi":"10.1093/oso/9780198841326.003.0005","DOIUrl":"https://doi.org/10.1093/oso/9780198841326.003.0005","url":null,"abstract":"This chapter evaluates regression models, focusing on the normal linear regression model. The normal linear regression model establishes a relationship between a quantitative response (also called outcome or dependent) variable, assumed to be normally distributed, and one or more explanatory (also called regression, predictor, or independent) variables about which no distributional assumptions are made. The model is usually referred to as 'the general linear model'. The chapter then differentiates between simple linear regression and multiple regression. The term 'simple linear regression' covers the regression model where there is one response variable and one explanatory variable, assuming a linear relationship between the two. The chapter also discusses the model formulae in R; generalized linear models; collinearity and aliasing; and logarithmic transformations.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125804574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
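As a rough illustration of the simple linear regression described in the abstract above (one response, one explanatory variable, linear relationship), here is a least-squares sketch. It is not taken from the chapter, which uses R model formulae; the function name and the data are hypothetical.

```python
def simple_linear_regression(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squares of x and cross-product sum of x and y, about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = simple_linear_regression(x, y)
print(intercept, slope)
```

Multiple regression generalizes this to several explanatory variables, where the estimates are normally obtained by matrix methods rather than these closed-form sums.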
Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/oso/9780198841326.003.0002
Bendix Carstensen
{"title":"Using R","authors":"Bendix Carstensen","doi":"10.1093/oso/9780198841326.003.0002","DOIUrl":"https://doi.org/10.1093/oso/9780198841326.003.0002","url":null,"abstract":"This chapter argues that the best way to learn R is to use it. One should start by using it as a simple calculator, and keep on exploring what one gets back by inspecting the size, shape, and content of what one creates. R is available from CRAN, the Comprehensive R Archive Network. A nice interface to R is RStudio, a commercial product that also has a free open source license, which gives one a very good and handy interface to R for free, including the possibility of writing reports using Rmarkdown, Sweave, or knitr. The chapter then looks at the two main graphics systems used in R: base graphics, which is an integral part of any R distribution, and ggplot2 (gg referring to grammar of graphics). Data from large epidemiological studies are often summarized in the form of frequency data, which record the frequency of all possible combinations of values of the variables in the study.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"27 8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126035847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/OSO/9780198841326.003.0003
Bendix Carstensen
{"title":"Measures of disease occurrence","authors":"Bendix Carstensen","doi":"10.1093/OSO/9780198841326.003.0003","DOIUrl":"https://doi.org/10.1093/OSO/9780198841326.003.0003","url":null,"abstract":"This chapter provides a brief introduction to some of the most common measures of disease occurrence used in epidemiology, both the empirical and theoretical versions of the measures. It begins with the prevalence of a disease in a population, which is the fraction of the population that has the disease at a given date. The chapter then considers mortality rate, incidence rate, standardized mortality ratio (SMR), and survival. Mortality is typically reported as a number of people that have died in a population of a certain size. Incidence rates are defined exactly as mortality rates, where one just counts incident cases, that is, newly diagnosed cases of a particular disease. Meanwhile, the SMR is a measure of the mortality in a group of persons as compared to the general population. Finally, the survival after diagnosis of a disease is defined as the fraction of diagnosed individuals alive at a given time after diagnosis.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133435962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
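The empirical measures listed in the abstract above reduce to simple ratios, sketched below. The function names and all numbers are hypothetical, and two common conventions are assumed rather than taken from the chapter: rates are expressed per person-year of follow-up, and the SMR is computed as observed deaths divided by the deaths expected from reference (general population) rates.

```python
def prevalence(cases, population):
    """Fraction of the population that has the disease at a given date."""
    return cases / population

def rate(events, person_years):
    """Events (deaths or incident cases) per person-year of follow-up."""
    return events / person_years

def smr(observed, person_years, reference_rates):
    """Observed deaths divided by deaths expected under reference rates.

    person_years and reference_rates are given per stratum (e.g. age band).
    """
    expected = sum(py * r for py, r in zip(person_years, reference_rates))
    return observed / expected

# Hypothetical numbers for illustration only
prev = prevalence(cases=50, population=1000)
mort = rate(events=30, person_years=15000)
ratio = smr(observed=75, person_years=[1000, 2000], reference_rates=[0.01, 0.02])
print(prev, mort * 1000, ratio)
```

An incidence rate uses the same `rate` calculation as a mortality rate, with newly diagnosed cases counted in place of deaths.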
Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/OSO/9780198841326.003.0010
Bendix Carstensen
{"title":"Do not group quantitative variables","authors":"Bendix Carstensen","doi":"10.1093/OSO/9780198841326.003.0010","DOIUrl":"https://doi.org/10.1093/OSO/9780198841326.003.0010","url":null,"abstract":"This chapter explores the problems caused by categorizing quantitative variables (here termed continuous variables). Optimum decisions are made by applying a utility function to a predicted value. At the decision point, one can solve for the personalized cutpoint for predicted risk that optimizes the decision. Dichotomization on independent variables is completely at odds with making optimal decisions. To make an optimal decision, the cutpoint for a predictor would necessarily be a function of the continuous values of all the other predictors. Moreover, categorization assumes that the relationship between the predictor and the response is flat within intervals; this assumption is far less reasonable than a linearity assumption in most cases. Categorization of continuous variables using percentiles is particularly hazardous. For a continuous predictor to be modelled more accurately when categorization is used, multiple intervals are required.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"83 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124620702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Epidemiology with R | Pub Date: 2020-12-31 | DOI: 10.1093/OSO/9780198841326.003.0008
B. Carstensen
{"title":"Case-control and case-cohort studies","authors":"B. Carstensen","doi":"10.1093/OSO/9780198841326.003.0008","DOIUrl":"https://doi.org/10.1093/OSO/9780198841326.003.0008","url":null,"abstract":"This chapter addresses case-control and case-cohort studies. In a case-control study, one samples persons based on their disease outcome, so the fraction of diseased persons in a case-control study is usually known (at least approximately) before data collection. In a cohort (follow-up) study, the relationship between some exposure and disease incidence is investigated by following the entire cohort and measuring the rate of occurrence of new cases in the different exposure groups. The follow-up records all persons who develop the disease during the study period. Implicit in this is that the relevant exposure information is available at all times for all persons under follow-up. The chapter then looks at the statistical model for the odds ratio, before differentiating between odds ratio and rate ratio. It also considers confounding and stratified sampling; individually matched studies; and nested case-control studies.","PeriodicalId":177736,"journal":{"name":"Epidemiology with R","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122657122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
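The odds ratio mentioned in the abstract above can be sketched for an unmatched case-control study summarized as a 2x2 table. This is an illustration under stated assumptions, not the chapter's own model: the function name and counts are hypothetical, and the interval uses the standard large-sample approximation on the log scale.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio with an approximate 95% confidence interval.

    The interval is computed on the log scale, with standard error
    sqrt(1/a + 1/b + 1/c + 1/d) over the four cell counts.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper

# Hypothetical counts: 100 cases and 100 controls
or_est, or_lo, or_hi = odds_ratio(20, 80, 10, 90)
print(f"OR = {or_est:.2f} (95% CI {or_lo:.2f} to {or_hi:.2f})")
```

The approximation requires all four cells to be non-zero, and individually matched or nested designs call for conditional methods instead.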