Statistical Thinking from Scratch: latest literature

Parametric estimation and inference
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0011
M. Edge
Abstract: If it is reasonable to assume that the data are generated by a fully parametric model, then maximum-likelihood approaches to estimation and inference have many appealing properties. Maximum-likelihood estimators are obtained by identifying parameters that maximize the likelihood function, which can be done using calculus or using numerical approaches. Such estimators are consistent, and if the costs of errors in estimation are described by a squared-error loss function, then they are also efficient compared with their consistent competitors. The sampling variance of a maximum-likelihood estimate can be estimated in various ways. As always, one possibility is the bootstrap. In many models, the variance of the maximum-likelihood estimator can be derived directly once its form is known. A third approach is to rely on general properties of maximum-likelihood estimators and use the Fisher information. Similarly, there are many ways to test hypotheses about parameters estimated by maximum likelihood. This chapter discusses the Wald test, which relies on the fact that the sampling distribution of maximum-likelihood estimators is normal in large samples, and the likelihood-ratio test, which is a general approach for testing hypotheses relating nested pairs of models.
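As a rough illustration of the ideas in this abstract, the sketch below fits an exponential model by maximum likelihood and then applies a Wald test and a likelihood-ratio test to a hypothesized rate. The exponential model, the function names, and the simulated data are assumptions chosen for illustration; they are not taken from the chapter, which works in R.

```python
import math
import random


def exp_log_lik(rate, data):
    """Log-likelihood of i.i.d. exponential data with the given rate."""
    return len(data) * math.log(rate) - rate * sum(data)


def exp_mle(data):
    """Closed-form maximum-likelihood estimate of the rate: 1 / sample mean."""
    return len(data) / sum(data)


def chi2_1_sf(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def wald_test(data, rate0):
    """Wald z statistic and two-sided p value for H0: rate == rate0."""
    rate_hat = exp_mle(data)
    # Fisher information for n observations is n / rate^2,
    # so the approximate standard error is rate_hat / sqrt(n).
    se = rate_hat / math.sqrt(len(data))
    z = (rate_hat - rate0) / se
    return z, chi2_1_sf(z * z)


def likelihood_ratio_test(data, rate0):
    """Likelihood-ratio statistic and p value for H0: rate == rate0."""
    rate_hat = exp_mle(data)
    stat = 2.0 * (exp_log_lik(rate_hat, data) - exp_log_lik(rate0, data))
    return stat, chi2_1_sf(stat)


# Illustrative data: 200 simulated draws with true rate 2.
random.seed(1)
sample = [random.expovariate(2.0) for _ in range(200)]
print("MLE of rate:", exp_mle(sample))
print("Wald test of rate = 1:", wald_test(sample, 1.0))
print("LR test of rate = 1:", likelihood_ratio_test(sample, 1.0))
```

In this one-parameter case the null model (rate fixed at `rate0`) is nested in the full model, so the likelihood-ratio statistic is compared with a chi-squared distribution with one degree of freedom.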
Citations: 0
Semiparametric estimation and inference
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0010
M. Edge
Abstract: Nonparametric and semiparametric statistical methods assume models whose properties cannot be described by a finite number of parameters. For example, a linear regression model that assumes that the disturbances are independent draws from an unknown distribution is semiparametric: it includes the intercept and slope as regression parameters but has a nonparametric part, the unknown distribution of the disturbances. Nonparametric and semiparametric methods focus on the empirical distribution function, which, assuming that the data are really independent observations from the same distribution, is a consistent estimator of the true cumulative distribution function. In this chapter, with plug-in estimation and the method of moments, functionals or parameters are estimated by treating the empirical distribution function as if it were the true cumulative distribution function. Such estimators are consistent. To understand the variation of point estimates, bootstrapping is used to resample from the empirical distribution function. For hypothesis testing, one can either use a bootstrap-based confidence interval or conduct a permutation test, which can be designed to test null hypotheses of independence or exchangeability. Resampling methods, including bootstrapping and permutation testing, are flexible and easy to implement with a little programming expertise.
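The two resampling ideas named in this abstract can be sketched in a few lines. The example below is a minimal illustration, not the chapter's code: the function names, the difference-in-means test statistic, and the toy data are all assumptions.

```python
import random
import statistics


def bootstrap_se(data, estimator, n_boot=2000, rng=None):
    """Bootstrap standard error: resample with replacement from the data
    (i.e., from the empirical distribution) and take the standard
    deviation of the re-computed estimates."""
    if rng is None:
        rng = random.Random(0)
    n = len(data)
    estimates = [
        estimator([rng.choice(data) for _ in range(n)])
        for _ in range(n_boot)
    ]
    return statistics.stdev(estimates)


def permutation_test(x, y, n_perm=5000, rng=None):
    """Two-sided permutation p value for a difference in group means,
    under the null hypothesis that group labels are exchangeable."""
    if rng is None:
        rng = random.Random(0)
    observed = abs(statistics.fmean(x) - statistics.fmean(y))
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:]))
        if diff >= observed:
            hits += 1
    # Adding 1 to numerator and denominator counts the observed labeling itself.
    return (hits + 1) / (n_perm + 1)


# Toy data, invented for illustration.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 6.4, 5.9]
group_b = [4.2, 4.8, 5.0, 4.4, 4.9, 4.1, 5.2, 4.6]
print("bootstrap SE of median (group A):", bootstrap_se(group_a, statistics.median))
print("permutation p value:", permutation_test(group_a, group_b))
```

Any estimator that maps a list of numbers to a number can be passed to `bootstrap_se`, which is part of what makes resampling flexible.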
Citations: 2
Encountering data
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0002
M. Edge
Abstract: Statistics is concerned with using data to learn about the world. In this book, concepts for reasoning from data are developed using a combination of math and simulation. Using a running example, we will consider probability theory, statistical estimation, and statistical inference. Estimation and inference will be considered from three different perspectives.
Citations: 0
R and exploratory data analysis
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0003
M. Edge
Abstract: R is a powerful, free software package for performing statistical tasks. It will be used to simulate data, analyze data, and make data displays. More details about R are given in Appendix B.
Citations: 0
Postlude: models and data
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0013
M. Edge
Abstract: Becoming a well-rounded data analyst requires more than the skills covered in this book. This postlude sketches some ways in which the types of thinking covered here can be extended to real problems in data analysis. Different ways of evaluating the assumptions of linear regression are considered, including plotting, hypothesis tests, and out-of-sample prediction. If the assumptions are not met, simple linear regression can be extended in various ways, including multiple regression, generalized linear models, and mixed models (among many other possibilities). This postlude concludes with a short discussion of the themes of the book: probabilistic models, methodological pluralism, and the value of elementary statistical thinking.
Citations: 0
Prelude
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0001
M. Edge
Abstract: There are two traditional ways to learn statistics. One way is to pass over the mathematical underpinnings and focus on developing relatively shallow knowledge about a wide variety of statistical procedures. Another is to spend years learning the mathematics necessary for traditional mathematical approaches to statistics. For many people who need to analyze data, neither of these paths is sufficient. The shallow-but-wide approach fails to provide students with the foundation that allows for confidence and creativity in analyzing modern datasets, and many researchers, though possibly motivated to learn math, do not have the background to start immediately on a traditional mathematical approach. This book exists to help researchers jump between tracks, providing motivated students whose knowledge of mathematics may be incomplete or rusty with a serious introduction to statistics that allows further study from more mathematical sources. This is done by focusing on a single statistical technique that is fundamental to statistical practice, simple linear regression, and supplementing the exposition with ample simulations conducted in the statistical programming language R. The first half of the book focuses on preliminaries, including the use of R and probability theory, whereas the second half covers statistical estimation and inference from semiparametric, parametric, and Bayesian perspectives.
Citations: 0
Properties of random variables
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0006
M. Edge
Abstract: In this chapter, the behavior of random variables is summarized using the concepts of expectation, variance, and covariance. The expectation is a measurement of the location of a random variable’s distribution. The variance and its square root, the standard deviation, are measurements of the spread of a random variable’s distribution. Covariance and correlation are measurements of the extent of linear relationship between two random variables. The chapter also describes two important theorems about the distribution of means of samples from a distribution. As the sample size becomes larger, the distribution of the sample mean becomes bunched more tightly around the expectation (this is the law of large numbers), and the distribution of the sample mean approaches the shape of a normal distribution (this is the central limit theorem). Finally, a model describing a linear relationship between two random variables is considered, and the properties of those two random variables are analyzed.
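The two theorems mentioned in this abstract can be checked by simulation. The sketch below is an assumption-laden illustration (exponential draws with expectation 1 and standard deviation 1, hypothetical function names), not code from the chapter: it draws many samples of size n, records each sample mean, and reports how the spread of those means shrinks roughly like 1/sqrt(n).

```python
import random
import statistics


def sample_means(n, n_reps=2000, seed=0):
    """Means of n_reps independent samples of size n from an
    exponential distribution with expectation 1 (and variance 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_reps)
    ]


for n in (10, 100, 400):
    means = sample_means(n)
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    # Law of large numbers: the means bunch more tightly around 1 as n grows.
    # Central limit theorem: their spread is close to 1 / sqrt(n) and their
    # histogram looks increasingly normal.
    print(f"n={n:4d}  mean of means={center:.3f}  "
          f"sd of means={spread:.4f}  1/sqrt(n)={n ** -0.5:.4f}")
```

Plotting a histogram of `means` for each n would show the shape moving from right-skewed toward a bell curve.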
Citations: 0
Properties of point estimators
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0008
M. Edge
Abstract: Point estimation is the attempt to identify a value associated with some underlying process or population using data. The unknown number that is the target of estimation is called an estimand. An estimator is a function that takes in data and produces an estimate. In this chapter, estimators are evaluated according to a number of criteria. An unbiased estimator is one whose expected value is equal to the estimand; in lay terms, it is accurate. Low-variance estimators, which are precise, are also evaluated. Consistent estimators converge to the estimand as the number of data collected approaches infinity. Mean squared error is the expected squared difference between the estimator and the estimand. Efficient estimators are those that converge to the estimand relatively quickly; i.e., fewer data are necessary to get close to the right answer. An optional section discusses statistical decision theory, which is a general framework for evaluating estimators. Finally, some ideas of robustness are discussed. A robust estimator is one that can still provide useful information even if the model is not quite right or the data are contaminated.
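Bias, variance, and mean squared error can be estimated by simulation. The sketch below is an illustration under stated assumptions rather than the chapter's own example: it compares two estimators of the variance of a standard normal distribution (true value 1) from samples of size 5, one dividing by n and one dividing by n - 1. The function name `evaluate` and the simulation settings are hypothetical.

```python
import random
import statistics


def evaluate(estimator, true_value, n=5, n_reps=10000, seed=0):
    """Simulated bias, variance, and mean squared error of an estimator
    applied to samples of size n from a standard normal distribution."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(n_reps)
    ]
    bias = statistics.fmean(estimates) - true_value
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    return bias, variance, mse


# statistics.pvariance divides by n; statistics.variance divides by n - 1.
plug_in = evaluate(statistics.pvariance, true_value=1.0)
unbiased = evaluate(statistics.variance, true_value=1.0)

for name, (bias, variance, mse) in (("divide by n", plug_in),
                                    ("divide by n-1", unbiased)):
    print(f"{name:14s} bias={bias:+.3f}  variance={variance:.3f}  mse={mse:.3f}")
```

In this setting the divide-by-n estimator is biased downward yet has the smaller mean squared error, which shows why unbiasedness alone does not settle which estimator is preferable.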
Citations: 0
Interval estimation and inference
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0009
M. Edge
Abstract: Interval estimation is the attempt to define intervals that quantify the degree of uncertainty in an estimate. The standard deviation of an estimate is called a standard error. Confidence intervals are designed to cover the true value of an estimand with a specified probability. Hypothesis testing is the attempt to assess the degree of evidence for or against a specific hypothesis. One tool for frequentist hypothesis testing is the p value, or the probability that if the null hypothesis is in fact true, the data would depart as extremely or more extremely from expectations under the null hypothesis than they were observed to do. In Neyman–Pearson hypothesis testing, the null hypothesis is rejected if p is less than a pre-specified value, often chosen to be 0.05. A test’s power function gives the probability that the null hypothesis is rejected given the significance level γ, a sample size n, and a specified alternative hypothesis. This chapter discusses some limitations of hypothesis testing as commonly practiced in the research literature.
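The claim that a confidence interval covers the true value with a specified probability can be checked by simulation. The sketch below is a minimal illustration with assumed settings (normal data, a large-sample interval of the form estimate ± z × standard error, hypothetical function names); it is not the chapter's code.

```python
import random
import statistics


def normal_ci(data, level=0.95):
    """Large-sample confidence interval for a mean:
    sample mean +/- z * (sample sd / sqrt(n))."""
    n = len(data)
    mean = statistics.fmean(data)
    se = statistics.stdev(data) / n ** 0.5  # estimated standard error
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean - z * se, mean + z * se


def coverage(true_mean=3.0, sd=2.0, n=50, n_sims=2000, seed=0):
    """Fraction of simulated datasets whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        data = [rng.gauss(true_mean, sd) for _ in range(n)]
        lower, upper = normal_ci(data)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_sims


print("estimated coverage of nominal 95% interval:", coverage())
```

Because the normal quantile is used with an estimated standard deviation, coverage at moderate n is expected to fall slightly below the nominal level; a t-based interval would be the usual refinement.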
Citations: 0
The line of best fit
Statistical Thinking from Scratch Pub Date : 2019-06-07 DOI: 10.1093/oso/9780198827627.003.0004
M. Edge
Abstract: One way to visualize a set of data on two variables is to plot them on a pair of axes. A line that “best fits” the data can then be drawn as a summary. This chapter considers how to define a line of “best” fit; there is no sole best choice. The most commonly chosen line to summarize the data is the “least-squares” line, the line that minimizes the sum of the squared vertical distances between the points and the line. One reason for the least-squares line’s popularity is convenience, but, as will be seen later, it is also related to some key ideas in statistical estimation. The derivations of expressions for the intercept and slope of the least-squares line are discussed.
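The least-squares line described in this abstract has a standard closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations in x, and the intercept makes the line pass through the point of means. The sketch below implements those textbook expressions; the function names and the toy data are assumptions for illustration.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the line minimizing the sum of
    squared vertical distances between the points and the line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar  # line passes through (x_bar, y_bar)
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared vertical distances from the points to a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Toy data, invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = least_squares_line(xs, ys)
print("intercept:", a, "slope:", b)
print("sum of squared residuals:", sum_squared_residuals(xs, ys, a, b))
```

Nudging either coefficient away from the fitted values and recomputing `sum_squared_residuals` gives a larger total, which is a quick numerical check of the minimizing property.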
Citations: 0