Analysis of household samples: the 1901 census of Canada
Author: M Ornstein
Journal: Historical Methods (journal article, published 2000-01-01)
DOI: 10.1080/01615440009598960 (https://doi.org/10.1080/01615440009598960)
Citations: 16
Abstract
The sample of the nominal census for 1901 prepared by the Canadian Families Project is a sample of households or dwellings, and the sampling point is the count of dwellings entered by the enumerator in column 1 of Schedule 1. Five percent of all dwellings on each microfilm reel were selected randomly; thus, the sample is stratified by microfilm reel. All individuals in each sampled dwelling were entered into the data set.
Household samples for which information is gathered for every household member actually involve two levels of sampling and analysis. Usually, a simple random sample or stratified sample of households is selected. The resulting sample of individuals, however, is a cluster sample; it is a stratified cluster sample if the household sample is stratified. The selection probabilities are the same for individuals and for households. Thus, if the household sample is self-weighting, or epsem (equal probability of selection method), which means that no weights are required to obtain unbiased estimates of population characteristics, then so is the individual sample.
The analysis of household characteristics is straightforward. For example, regional comparisons of household size require only the household sample. However, a cluster sample generally provides less information than a simple random sample of the same size. In this case, that is because members of the same household differ from one another less than do the individuals in a simple random sample, most of whom would come from different households. In other words, the characteristics of one member of a cluster usually go some way toward predicting the characteristics of the other members. With household samples, that is often true for religion, for example. Usually, the religion of one household member is a good (but not perfect) predictor of the religion of all the other household members. The degree of within-household similarity is different for each variable.
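One common way to quantify how much within-household similarity a given variable shows is the intraclass correlation. The following is a minimal sketch, not taken from the article: it uses the one-way ANOVA estimator, and the function name `anova_icc` and the two small data sets (a religion indicator and a sex indicator for six invented households) are assumptions made purely for illustration.

```python
from statistics import mean


def anova_icc(clusters):
    """One-way ANOVA estimate of the intraclass correlation.

    `clusters` is a list of lists; each inner list holds the values of one
    variable for the members of one household.
    """
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    grand_mean = sum(sum(c) for c in clusters) / n_total
    cluster_means = [mean(c) for c in clusters]

    # Between-household and within-household sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2
                     for n, m in zip(sizes, cluster_means))
    ss_within = sum((y - m) ** 2
                    for c, m in zip(clusters, cluster_means) for y in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # "Average" household size, adjusted for unequal household sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical data: one inner list per household (NOT from the 1901 census).
# religion: 1 = member reports some given religion, 0 = otherwise.
religion = [[1, 1, 1, 1], [0, 0, 0], [1, 1, 1, 1, 1], [0, 0], [1, 1, 0], [0, 0, 0, 0]]
# sex: 1 = female, 0 = male.
sex = [[1, 0, 1, 0], [0, 1, 1], [1, 0, 0, 1, 0], [1, 0], [0, 1, 1], [1, 0, 1, 0]]

print("ICC, religion:", round(anova_icc(religion), 2))
print("ICC, sex:     ", round(anova_icc(sex), 2))
```

On these invented data the religion indicator yields a coefficient fairly close to 1, while the sex indicator yields a negative one, which illustrates why homogeneity has to be assessed variable by variable.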
Because parents and children, and women and men, live together, households are not particularly homogeneous in age or sex composition. The consequence of within-cluster similarity is that estimates of statistical parameters generally have less precision than the estimates that would be obtained from a simple random sample of the same size. When cluster samples are analyzed with programs such as SPSS and SAS that cannot, or are not “instructed” to, take account of clustering and therefore assume a simple random sample, the standard errors, confidence intervals, and significance tests they compute are erroneous. Almost always, standard errors are underestimated, the confidence intervals are too narrow, and statistical significance is overestimated. One can compute the degree of misestimation exactly by measuring the within-cluster homogeneity, but the degree of misestimation cannot be predicted beforehand and is different for every variable. For that reason, the commonsense “fix” of decreasing the weight for each observation by some multiplier to take account of the loss in precision results in overestimates of some confidence intervals and underestimates of others. Some software, such as the Stata package used in examples cited later, provides correct confidence intervals and significance levels, but Stata is not in general use by sociologists and historians.
Clustering is always a problem in principle, but whether the considerable effort required to deal with the statistical issues is justified depends on the particular analysis. The key issue is whether one is working with a “small” or “large” sample. With a large sample, effects that just reach statistical significance are usually too small to be of substantive interest, so incorrect significance tests are not a worry. Still, it is hard to argue in favor of using statistical procedures and programs that are known to give wrong answers.
With small samples, the problem is much more severe: not accounting for clustering may dramatically increase the risk of obtaining “findings” that are not actual-
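To make the misestimation concrete, the sketch below compares a naive standard error of a proportion (all individuals treated as a simple random sample) with a cluster-aware one that treats households as the sampled units. It is an illustration under stated assumptions, not the article's procedure: the estimator is a standard textbook linearization for a ratio mean, stratification by microfilm reel and any finite population correction are ignored, and the function names and the invented household data are hypothetical.

```python
from math import sqrt


def naive_se(clusters):
    """Standard error of the mean, treating individuals as a simple random sample."""
    values = [y for c in clusters for y in c]
    n = len(values)
    ybar = sum(values) / n
    s2 = sum((y - ybar) ** 2 for y in values) / (n - 1)
    return sqrt(s2 / n)


def cluster_se(clusters):
    """Linearized standard error of the mean, with households as sampled units."""
    k = len(clusters)                      # number of sampled households
    n = sum(len(c) for c in clusters)      # number of individuals
    ybar = sum(sum(c) for c in clusters) / n
    # Household-level residual totals: how far each household's total lies
    # from what the overall mean predicts for a household of that size.
    resid = [sum(c) - ybar * len(c) for c in clusters]
    var = (k / (k - 1)) * sum(e * e for e in resid) / n ** 2
    return sqrt(var)


def design_effect(clusters):
    """Ratio of the cluster-aware variance to the naive variance."""
    return (cluster_se(clusters) / naive_se(clusters)) ** 2


# Hypothetical data: one inner list per household (NOT from the 1901 census).
religion = [[1, 1, 1, 1], [0, 0, 0], [1, 1, 1, 1, 1], [0, 0], [1, 1, 0], [0, 0, 0, 0]]
sex = [[1, 0, 1, 0], [0, 1, 1], [1, 0, 0, 1, 0], [1, 0], [0, 1, 1], [1, 0, 1, 0]]

for name, data in [("religion", religion), ("sex", sex)]:
    print(name,
          "naive SE:", round(naive_se(data), 3),
          "cluster SE:", round(cluster_se(data), 3),
          "design effect:", round(design_effect(data), 2))
```

On these invented data, ignoring clustering understates the standard error for the religion indicator and overstates it for the sex indicator, so no single multiplier applied to every variable could correct both.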
About the journal:
Historical Methods reaches an international audience of social scientists concerned with historical problems. It explores interdisciplinary approaches to new data sources, new approaches to older questions and material, and practical discussions of computer and statistical methodology, data collection, and sampling procedures. The journal includes the following features: “Evidence Matters” emphasizes how to find, decipher, and analyze evidence whether or not that evidence is meant to be quantified. “Database Developments” announces major new public databases or large alterations in older ones, discusses innovative ways to organize them, and explains new ways of categorizing information.