Author: Joseph B. Lang
Journal: Annals of the Institute of Statistical Mathematics, volume 77, issue 4, pages 597-625
Publication date: 2025-05-02
DOI: 10.1007/s10463-025-00925-0
URL: https://link.springer.com/article/10.1007/s10463-025-00925-0
Selection-bias-adjusted inference for the bivariate normal distribution under soft-threshold sampling
The problem of estimating parameters and predicting outcomes of a bivariate Normal distribution is more challenging when, owing to data-dependent selection (or missingness or dropout), the available data are not a representative sample of bivariate realizations. This problem is addressed using an observation model that is induced by a combination of a multivariate Normal “science” model and a realistic “soft-threshold selection” model with unknown truncation point. This observation model, which is expressed using an intuitive selection subset notation, is a generalization of existing “hard-threshold” models. It affords simple-to-compute selection-bias-adjusted estimates of both the regression (conditional mean) parameters and the bivariate correlation. In addition, a simple bootstrap approach for computing both confidence and prediction intervals in the soft-threshold selection setting is described. Simulation results are promising. To motivate this research, two illustrative examples describe a setting where selection bias is an issue of concern.
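To make the selection-bias problem concrete, the following is a minimal simulation sketch and not the paper's method. It assumes, purely for illustration, that each pair (x, y) is retained with a logistic probability in y centred at a truncation point `tau`; the abstract does not state the functional form of the soft-threshold selection model, and the names `simulate_soft_threshold`, `tau`, and `softness` are hypothetical. The sketch only shows that the correlation computed from the selected pairs differs from the correlation of the full sample; it does not implement the selection-bias-adjusted estimates or the bootstrap intervals described above.

```python
import numpy as np


def simulate_soft_threshold(n, rho, tau, softness, rng):
    """Draw n standard bivariate Normal pairs with correlation rho and flag
    which pairs are selected.

    ASSUMPTION (not from the paper): a pair is kept with probability
    1 / (1 + exp(-(y - tau) / softness)), i.e. a logistic curve in y.
    Small values of `softness` approach a hard threshold at y > tau.
    """
    cov = np.array([[1.0, rho], [rho, 1.0]])
    xy = rng.multivariate_normal(np.zeros(2), cov, size=n)
    x, y = xy[:, 0], xy[:, 1]
    keep_prob = 1.0 / (1.0 + np.exp(-(y - tau) / softness))
    keep = rng.random(n) < keep_prob  # Bernoulli selection indicator per pair
    return x, y, keep


rng = np.random.default_rng(0)
x, y, keep = simulate_soft_threshold(20000, rho=0.6, tau=0.5, softness=0.25, rng=rng)

# Correlation from all realizations versus from the selected pairs only.
full_corr = np.corrcoef(x, y)[0, 1]
naive_corr = np.corrcoef(x[keep], y[keep])[0, 1]

print(f"fraction selected:          {keep.mean():.3f}")
print(f"correlation, full sample:   {full_corr:.3f}")
print(f"correlation, selected only: {naive_corr:.3f}")
```

Under this assumed selection rule, the selected-only correlation is attenuated relative to the full-sample value because selection restricts the range of y. This is the kind of bias that an adjusted estimator would need to correct.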
About the journal:
Annals of the Institute of Statistical Mathematics (AISM) aims to provide a forum for open communication among statisticians, and to contribute to the advancement of statistics as a science to enable humans to handle information in order to cope with uncertainties. It publishes high-quality papers that shed new light on the theoretical, computational and/or methodological aspects of statistical science. Emphasis is placed on (a) development of new methodologies motivated by real data, (b) development of unifying theories, and (c) analysis and improvement of existing methodologies and theories.