Plasmode simulation for the evaluation of causal inference methods in homophilous social networks
Vanessa McNealis, Erica E. M. Moodie, Nema Dean
arXiv - STAT - Computation, 2024-09-02
arXiv:2409.01316
Abstract
Typical simulation approaches for evaluating the performance of statistical
methods on populations embedded in social networks may fail to capture
important features of real-world networks. It can therefore be unclear whether
inference methods for causal effects due to interference that have been shown
to perform well in such synthetic networks are applicable to social networks
that arise in the real world. Plasmode simulation studies use a real dataset
created from natural processes, but with part of the data-generation mechanism
known. However, given the sensitivity of relational data, many network datasets are
protected from unauthorized access or disclosure. In such cases, plasmode
simulations cannot use the released versions of real datasets, which often omit the
network links, and can instead rely only on parameters estimated from them. A
statistical framework for creating replicated simulation datasets from private
social network data is developed and validated. The approach consists of
simulating from a parametric exponential-family random graph model (ERGM) fitted to
the network data and resampling from the observed exposure and covariate
distributions to preserve the associations among these variables.
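
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy records, and the coefficient values are hypothetical, the exponential-family random graph model is reduced to an edge term plus a single homophily (nodal-match) term, and the network is drawn with a basic Metropolis toggle sampler rather than with dedicated software. Rows are resampled whole so that each exposure stays paired with its covariates.

```python
import math
import random


def resample_rows(rows, n, rng):
    """Draw n records with replacement; each record keeps its exposure and
    covariates together, so their observed associations are preserved."""
    return [rows[rng.randrange(len(rows))] for _ in range(n)]


def simulate_ergm(group, theta_edges, theta_match, n_steps, rng):
    """Metropolis sampler for an undirected graph with
    P(G) proportional to exp(theta_edges * edges + theta_match * same_group_edges).
    Returns a symmetric 0/1 adjacency matrix as a list of lists."""
    n = len(group)
    adj = [[0] * n for _ in range(n)]
    for _ in range(n_steps):
        i, j = rng.sample(range(n), 2)  # propose toggling the dyad (i, j)
        # Change in the log-weight if the edge were added
        delta = theta_edges + theta_match * (group[i] == group[j])
        if adj[i][j]:
            delta = -delta  # removing an existing edge reverses the change
        if delta >= 0 or rng.random() < math.exp(delta):
            adj[i][j] = adj[j][i] = 1 - adj[i][j]
    return adj


def plasmode_replicate(rows, n, theta_edges, theta_match, seed):
    """One replicated dataset: resampled node records plus a simulated network.
    theta_edges and theta_match stand in for coefficients that would have been
    estimated from the private network (hypothetical values here)."""
    rng = random.Random(seed)
    sample = resample_rows(rows, n, rng)
    group = [r["group"] for r in sample]
    adj = simulate_ergm(group, theta_edges, theta_match, 20 * n * n, rng)
    return sample, adj


# Toy node-level records (hypothetical): a binary exposure and a binary group label.
toy_rows = [{"exposure": e, "group": g} for g in (0, 1) for e in (0, 1, 1)]
toy_sample, toy_adj = plasmode_replicate(toy_rows, 30, -3.0, 2.0, seed=1)
print(sum(map(sum, toy_adj)) // 2, "edges simulated among", len(toy_sample), "nodes")
```

Only the resampled records and the fitted coefficients enter the simulation, so the original network links never need to be released. An outcome model with a known causal effect would then be applied to each replicate; that step is omitted here.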