Distributed sequential estimation procedures
Zhuojian Chen, Zhanfeng Wang, Yuan-chin Ivan Chang
DOI: 10.1002/cjs.11762
Published: 2023-02-11
https://onlinelibrary.wiley.com/doi/10.1002/cjs.11762
Citations: 0
Abstract
Data collected from distributed sources or sites commonly have different distributions or contain contaminated observations. Active learning procedures allow us to assess data as they are recruited into model building. Combining several active learning procedures is therefore a promising idea, even when the collected data set is contaminated. Here, we study how to conduct and integrate several adaptive sequential procedures simultaneously to produce a valid result using several machines or a parallel-computing framework. To avoid distraction by complicated modelling processes, we use confidence set estimation for linear models to illustrate the proposed method and discuss the approach's statistical properties. We then evaluate its performance using both synthetic and real data. We have implemented our method in Python and made it available on GitHub at https://github.com/zhuojianc/dsep.
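To make the idea of running several sequential procedures and integrating their results concrete, here is a minimal sketch. It is not the authors' method and does not use the API of the dsep package. It assumes a one-regressor linear model without intercept, the classical fixed-width stopping rule (stop once the accumulated information is large enough for an interval of half-width `d`), and an inverse-variance weighted average to pool the sites. All function names, parameters, and the simulated data are hypothetical.

```python
import random
from statistics import NormalDist


def simulate_site(beta, sigma, seed):
    """Hypothetical data source: yields (x, y) with y = beta * x + noise."""
    rng = random.Random(seed)
    while True:
        x = rng.uniform(-1.0, 1.0)
        yield x, beta * x + rng.gauss(0.0, sigma)


def sequential_slope(stream, d, alpha=0.05, n0=10):
    """Sample until beta_hat +/- d has approximate coverage 1 - alpha.

    Assumed stopping rule: stop at the first n >= n0 with
    sum(x^2) >= z^2 * sigma2_hat / d^2.
    Returns (beta_hat, estimated information, sample size used).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    sxx = sxy = syy = 0.0
    for n, (x, y) in enumerate(stream, start=1):
        sxx += x * x
        sxy += x * y
        syy += y * y
        if n < n0:
            continue  # pilot phase: too few points for a variance estimate
        beta_hat = sxy / sxx
        # Residual sum of squares for a no-intercept fit: syy - beta_hat * sxy
        sigma2_hat = max(syy - beta_hat * sxy, 0.0) / (n - 1)
        if sigma2_hat > 0 and sxx >= z * z * sigma2_hat / (d * d):
            return beta_hat, sxx / sigma2_hat, n
    raise RuntimeError("stream ended before the stopping rule was met")


def combine(results):
    """Pool site estimates with inverse-variance (information) weights."""
    total_info = sum(info for _, info, _ in results)
    beta = sum(b * info for b, info, _ in results) / total_info
    return beta, total_info


# Three simulated sites sharing the same slope but with different noise levels.
sites = [(0.5, 1), (1.0, 2), (1.5, 3)]  # (noise sd, random seed)
results = [
    sequential_slope(simulate_site(2.0, sigma, seed), d=0.2)
    for sigma, seed in sites
]
combined, total_info = combine(results)
for (b, info, n), (sigma, _) in zip(results, sites):
    print(f"noise sd {sigma}: stopped at n={n}, beta_hat={b:.3f}")
print(f"combined beta_hat={combined:.3f}")
```

Each site stops on its own schedule, so noisier sites draw more observations before contributing. Because the per-site runs do not depend on one another, they could be dispatched to separate machines or processes and only the summary triples sent back for pooling.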