Probability and the Central Limit Theorem

Statistics for the Social Sciences. Pub Date: 2017-12-01. DOI: 10.1017/9781316442715.007

Russell T Warne



Abstract

Everything in the preceding chapters of this book has been about using models to describe or represent data that have been collected from a sample – which we learned is a branch of statistics called descriptive statistics. Describing data is an important task. (It would be difficult to learn about data without describing it!) But it is often of limited usefulness because almost all datasets are collected from samples – and most researchers and practitioners in the social sciences are interested in the population as a whole.

After all, if a psychologist says, “I have discovered that 35 out of 50 people in my sample got better after therapy,” that isn't very interesting to anyone who isn't a friend or family member of the people in the sample. The vast majority of social science researchers are interested in how their data from their sample applies to a population. But drawing conclusions about an entire population (which may consist of millions of people) based on a sample that consists of a tiny fraction of the population is a difficult logical leap to make. Yet, that leap is not impossible. In fact, the process of how to draw conclusions about a population from sample data was worked out in the early twentieth century, and it is now common in the social sciences to draw these conclusions about populations.

This chapter provides the necessary theory of this process. The rest of the chapters in this textbook will discuss the nuts and bolts of actually performing the calculations needed to learn valuable information about a population with just sample data.

Learning Goals

• Calculate the probability that a particular outcome will occur in a set of events.
• Construct a probability distribution based on theoretical probabilities or empirical probabilities and describe why the differences between the two distribution types occur.
• Explain the process of generalizing a conclusion based on sample data to the entire population.
• Differentiate between a sample histogram, a sampling distribution, and a probability distribution.
• Explain the Central Limit Theorem (CLT) and why it permits estimation of the population mean and standard deviation.
• Estimate a population mean and standard deviation by taking multiple samples from the population.

Basic Probability

Statistics is based entirely on a branch of mathematics called probability, which is concerned with the likelihood of outcomes for an event.
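
The learning goals on the Central Limit Theorem and on estimating a population mean and standard deviation from multiple samples can be illustrated with a small simulation. The sketch below is not taken from the chapter: the skewed population, the sample size of 50, the 2,000 repetitions, and the helper name `simulate_sampling_distribution` are all assumptions chosen for illustration. It relies on the standard result that the standard deviation of the sample means is approximately the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics


def simulate_sampling_distribution(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples (with replacement) and return their means.

    Hypothetical helper for illustration only; not from the chapter.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# Assumed population: deliberately skewed (exponential, true mean about 10),
# to show that the approach does not require a normal population.
population_rng = random.Random(42)
population = [population_rng.expovariate(1 / 10) for _ in range(100_000)]

pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

sample_size = 50
sample_means = simulate_sampling_distribution(
    population, sample_size, n_samples=2_000
)

# The mean of the sample means estimates the population mean.
est_mean = statistics.fmean(sample_means)

# The SD of the sample means (the standard error) is about sigma / sqrt(n),
# so multiplying it by sqrt(n) gives an estimate of the population SD.
est_sd = statistics.stdev(sample_means) * sample_size ** 0.5

print(f"population mean {pop_mean:.2f}, estimated {est_mean:.2f}")
print(f"population SD   {pop_sd:.2f}, estimated {est_sd:.2f}")
```

A histogram of `sample_means` would show the sampling distribution, which should look roughly bell-shaped even though the population itself is strongly skewed. A histogram of any single sample would instead resemble the skewed population.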

