Comparing Propensity Score Methods in Balancing Covariates and Recovering Impact in Small Sample Educational Program Evaluations
Clement A. Stone, Yun Tang
Practical Assessment, Research and Evaluation, 2013-11-01
DOI: https://doi.org/10.7275/QKQA-9K50
Citations: 46
Abstract
Propensity score applications are often used to evaluate educational program impact. However, various options are available both to estimate propensity scores and to construct comparison groups. This study used a student achievement dataset with commonly available covariates to compare different propensity score estimation methods (logistic regression, boosted regression, and Bayesian logistic regression) in combination with different methods for constructing comparison groups (nearest-neighbor matching, optimal matching, and weighting) with respect to balancing pre-existing differences and recovering a simulated treatment effect in small samples. Results indicated that applied researchers evaluating program impact should first consider either standard logistic regression with nearest-neighbor or optimal matching, or boosted regression combined with propensity score weighting. Advantages and disadvantages of the methods are discussed.
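
As an illustration only, one of the combinations named in the abstract (logistic regression propensity scores with nearest-neighbor matching) can be sketched in plain Python. This is not the authors' code or data: the single covariate, the simulated selection model, the sample size, and the helper names (`fit_logistic`, `nearest_neighbor_match`, `smd`) are all assumptions made for the example. Matching here is greedy 1:1 without replacement, and balance is summarized with the standardized mean difference.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(x, t, lr=0.5, steps=2000):
    # Fit P(t = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    # on the mean log-likelihood (one covariate, for brevity).
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, ti in zip(x, t):
            resid = ti - sigmoid(b0 + b1 * xi)
            g0 += resid
            g1 += resid * xi
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def nearest_neighbor_match(ps, t):
    # Greedy 1:1 matching without replacement on the propensity score.
    treated = [i for i, ti in enumerate(t) if ti == 1]
    controls = [i for i, ti in enumerate(t) if ti == 0]
    pairs = []
    for i in treated:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        pairs.append((i, j))
        controls.remove(j)
    return pairs


def smd(a, b):
    # Standardized mean difference with the average of the two variances.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / math.sqrt((va + vb) / 2)


# Hypothetical small sample: x stands in for a prior achievement score,
# and students with higher x are more likely to enter the program.
random.seed(0)
x = [random.gauss(0, 1) for _ in range(200)]
t = [1 if random.random() < sigmoid(0.8 * xi - 1.0) else 0 for xi in x]

b0, b1 = fit_logistic(x, t)
ps = [sigmoid(b0 + b1 * xi) for xi in x]
pairs = nearest_neighbor_match(ps, t)

smd_before = smd([xi for xi, ti in zip(x, t) if ti == 1],
                 [xi for xi, ti in zip(x, t) if ti == 0])
smd_after = smd([x[i] for i, _ in pairs], [x[j] for _, j in pairs])

print("matched pairs:", len(pairs))
print("SMD before matching:", round(smd_before, 3))
print("SMD after matching:", round(smd_after, 3))
```

A smaller absolute standardized mean difference after matching indicates better covariate balance. Swapping in a weighting step or a different score model (for example, boosted regression) would change only the matching or fitting function in this sketch.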