Probability and the Central Limit Theorem
Russell T Warne
Statistics for the Social Sciences, published 2017-12-01
DOI: 10.1017/9781316442715.007 (https://doi.org/10.1017/9781316442715.007)
Citations: 0
Abstract
Everything in the preceding chapters of this book has been about using models to describe or represent data that have been collected from a sample – which we learned is a branch of statistics called descriptive statistics. Describing data is an important task. (It would be difficult to learn about data without describing it!) But it is often of limited usefulness because almost all datasets are collected from samples – and most researchers and practitioners in the social sciences are interested in the population as a whole. After all, if a psychologist says, “I have discovered that 35 out of 50 people in my sample got better after therapy,” that isn't very interesting to anyone who isn't a friend or family member of the people in the sample.

The vast majority of social science researchers are interested in how their data from their sample applies to a population. But drawing conclusions about an entire population (which may consist of millions of people) based on a sample that consists of a tiny fraction of the population is a difficult logical leap to make. Yet, that leap is not impossible. In fact, the process of how to draw conclusions about a population from sample data was worked out in the early twentieth century, and it is now common in the social sciences to draw these conclusions about populations. This chapter provides the necessary theory of this process. The rest of the chapters in this textbook will discuss the nuts and bolts of actually performing the calculations needed to learn valuable information about a population with just sample data.

Learning Goals
• Calculate the probability that a particular outcome will occur in a set of events.
• Construct a probability distribution based on theoretical probabilities or empirical probabilities and describe why the differences between the two distribution types occur.
• Explain the process of generalizing a conclusion based on sample data to the entire population.
• Differentiate between a sample histogram, a sampling distribution, and a probability distribution.
• Explain the Central Limit Theorem (CLT) and why it permits estimation of the population mean and standard deviation.
• Estimate a population mean and standard deviation by taking multiple samples from the population.

Basic Probability
Statistics is based entirely on a branch of mathematics called probability, which is concerned with the likelihood of outcomes for an event.
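As a rough illustration of the learning goals above, the following Python sketch compares a theoretical probability distribution with an empirical one, then uses repeated samples to estimate a population mean and standard deviation in the way the CLT permits. The fair six-sided die, the skewed exponential population, the sample size of 30, and all function names are assumptions chosen for illustration; none of them are taken from the chapter.

```python
import math
import random
import statistics
from collections import Counter
from fractions import Fraction


def theoretical_distribution(sides=6):
    """Theoretical probability of each face of a fair die (1/sides each)."""
    return {face: Fraction(1, sides) for face in range(1, sides + 1)}


def empirical_distribution(n_rolls, sides=6, rng=None):
    """Observed proportion of each face over n_rolls simulated rolls."""
    rng = rng if rng is not None else random.Random()
    counts = Counter(rng.randint(1, sides) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, sides + 1)}


def sample_means(population, sample_size, n_samples, rng=None):
    """Means of n_samples random samples (drawn with replacement)."""
    rng = rng if rng is not None else random.Random()
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


def estimate_population(means, sample_size):
    """Estimate the population mean and SD from a sampling distribution.

    Under the CLT, the mean of the sample means approximates the population
    mean, and their SD (the standard error) approximates sigma / sqrt(n).
    """
    est_mean = statistics.fmean(means)
    est_sd = statistics.stdev(means) * math.sqrt(sample_size)
    return est_mean, est_sd


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable

    # Theoretical vs. empirical probabilities for a (hypothetical) fair die.
    print(theoretical_distribution())
    print(empirical_distribution(10_000, rng=rng))

    # Hypothetical, strongly skewed population (not normally distributed).
    population = [rng.expovariate(1.0) for _ in range(20_000)]
    means = sample_means(population, sample_size=30, n_samples=2_000, rng=rng)

    # Estimates from the sampling distribution, then the true values.
    print(estimate_population(means, sample_size=30))
    print(statistics.fmean(population), statistics.pstdev(population))
```

The empirical proportions differ slightly from the theoretical 1/6 because of sampling error, and the gap tends to shrink as the number of rolls grows. The second half of the sketch draws many samples from a population that is far from normal. Even so, the mean of the sample means should land close to the population mean, and the standard deviation of the sample means multiplied by the square root of the sample size should land close to the population standard deviation.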