Manuel Febrero-Bande, Pedro Galeano, Eduardo García-Portugués, Wenceslao González-Manteiga
{"title":"Testing for linearity in scalar-on-function regression with responses missing at random","authors":"Manuel Febrero-Bande, Pedro Galeano, Eduardo García-Portugués, Wenceslao González-Manteiga","doi":"10.1007/s00180-023-01445-2","DOIUrl":null,"url":null,"abstract":"<p>A goodness-of-fit test for the Functional Linear Model with Scalar Response (FLMSR) with responses Missing at Random (MAR) is proposed in this paper. The test statistic relies on a marked empirical process indexed by the projected functional covariate and its distribution under the null hypothesis is calibrated using a wild bootstrap procedure. The computation and performance of the test rely on having an accurate estimator of the functional slope of the FLMSR when the sample has MAR responses. Three estimation methods based on the Functional Principal Components (FPCs) of the covariate are considered. First, the <i>simplified</i> method estimates the functional slope by simply discarding observations with missing responses. Second, the <i>imputed</i> method estimates the functional slope by imputing the missing responses using the simplified estimator. Third, the <i>inverse probability weighted</i> method incorporates the missing response generation mechanism when imputing. Furthermore, both cross-validation and LASSO regression are used to select the FPCs used by each estimator. Several Monte Carlo experiments are conducted to analyze the behavior of the testing procedure in combination with the functional slope estimators. Results indicate that estimators performing missing-response imputation achieve the highest power. The testing procedure is applied to check for linear dependence between the average number of sunny days per year and the mean curve of daily temperatures at weather stations in Spain.</p>","PeriodicalId":55223,"journal":{"name":"Computational Statistics","volume":"8 1","pages":""},"PeriodicalIF":1.0000,"publicationDate":"2024-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Statistics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1007/s00180-023-01445-2","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 0
Abstract
A goodness-of-fit test for the Functional Linear Model with Scalar Response (FLMSR) with responses Missing at Random (MAR) is proposed in this paper. The test statistic relies on a marked empirical process indexed by the projected functional covariate and its distribution under the null hypothesis is calibrated using a wild bootstrap procedure. The computation and performance of the test rely on having an accurate estimator of the functional slope of the FLMSR when the sample has MAR responses. Three estimation methods based on the Functional Principal Components (FPCs) of the covariate are considered. First, the simplified method estimates the functional slope by simply discarding observations with missing responses. Second, the imputed method estimates the functional slope by imputing the missing responses using the simplified estimator. Third, the inverse probability weighted method incorporates the missing response generation mechanism when imputing. Furthermore, both cross-validation and LASSO regression are used to select the FPCs used by each estimator. Several Monte Carlo experiments are conducted to analyze the behavior of the testing procedure in combination with the functional slope estimators. Results indicate that estimators performing missing-response imputation achieve the highest power. The testing procedure is applied to check for linear dependence between the average number of sunny days per year and the mean curve of daily temperatures at weather stations in Spain.
期刊介绍:
Computational Statistics (CompStat) is an international journal which promotes the publication of applications and methodological research in the field of Computational Statistics. The focus of papers in CompStat is on the contribution to and influence of computing on statistics and vice versa. The journal provides a forum for computer scientists, mathematicians, and statisticians in a variety of fields of statistics such as biometrics, econometrics, data analysis, graphics, simulation, algorithms, knowledge based systems, and Bayesian computing. CompStat publishes hardware, software plus package reports.