Title: Geo-additive mixed model with variable selection using the adaptive elastic net to handle nonresponse in official rice productivity survey
Authors: Muhlis Ardiansyah, Hari Wijayanto, Anang Kurnia, Anik Djuraidah
Journal: Spatial Statistics, volume 56, Article 100761
Publication date: 2023-08-01
DOI: 10.1016/j.spasta.2023.100761
URL: https://www.sciencedirect.com/science/article/pii/S2211675323000362
Journal impact factor: 2.1 (JCR Q3, Geosciences, Multidisciplinary)
Citations: 0
Abstract
This study is motivated by the nonresponse problem in the official rice productivity survey conducted by Statistics Indonesia. Handling nonresponse is essential to supporting the agency's vision of being a provider of quality statistical data for an advanced Indonesia. The study aimed to improve the quality of official rice productivity data by imputing nonresponse values using a geo-additive mixed model with variable selection. We then simulated three nonresponse scenarios to determine whether imputation performs better than listwise deletion. The results showed that the proposed model was the best imputation model for estimating rice productivity, compared with linear regression, support vector machine (SVM), and geo-additive mixed models without variable selection. The proposed model outperforms the other models when the data exhibit spatial autocorrelation and multicollinearity. It has two advantages. First, variable selection using the adaptive elastic net can overcome multicollinearity problems. Second, adding the geo-additive mixed function leaves the model's residuals free of spatial autocorrelation. A simulation using empirical data showed that the proposed imputation method reduces bias when nonresponse is not random. Our methodology presents a valuable alternative for improving the quality of official statistics.
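The claim that imputation reduces bias under non-random nonresponse can be illustrated with a toy simulation. The sketch below is not the authors' model or data: it swaps the geo-additive mixed model for a one-covariate least-squares line, and every name and number in it (the covariate `x`, the coefficients, the response probabilities, the function names) is a hypothetical choice made for illustration only. It compares the mean estimated after listwise deletion with the mean estimated after regression imputation when units with a high covariate value respond less often.

```python
import random
import statistics


def simulate_survey(n=5000, seed=0):
    """Generate a synthetic survey (hypothetical values, not the paper's data)."""
    rng = random.Random(seed)
    # x: an illustrative standardized covariate; y: synthetic "productivity".
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [5.0 + 0.8 * xi + rng.gauss(0.0, 0.5) for xi in x]
    # Non-random nonresponse: units with x > 0 respond less often.
    responded = [rng.random() < (0.4 if xi > 0 else 0.9) for xi in x]
    return x, y, responded


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def compare_estimators(n=5000, seed=0):
    """Return (true mean, listwise-deletion mean, regression-imputed mean)."""
    x, y, responded = simulate_survey(n, seed)
    true_mean = statistics.fmean(y)

    # Listwise deletion: drop nonrespondents and average what is left.
    xs_obs = [xi for xi, r in zip(x, responded) if r]
    ys_obs = [yi for yi, r in zip(y, responded) if r]
    listwise_mean = statistics.fmean(ys_obs)

    # Regression imputation: fit on respondents, predict for nonrespondents.
    intercept, slope = fit_line(xs_obs, ys_obs)
    completed = [
        yi if r else intercept + slope * xi
        for xi, yi, r in zip(x, y, responded)
    ]
    imputed_mean = statistics.fmean(completed)
    return true_mean, listwise_mean, imputed_mean


if __name__ == "__main__":
    true_mean, listwise_mean, imputed_mean = compare_estimators()
    print(f"true mean:     {true_mean:.3f}")
    print(f"listwise mean: {listwise_mean:.3f}")
    print(f"imputed mean:  {imputed_mean:.3f}")
```

In this toy setting nonresponse depends only on an observed covariate, so a model that uses that covariate can recover the mean while listwise deletion cannot. The paper's setting is richer (spatial autocorrelation, multicollinearity, three nonresponse scenarios), and this sketch makes no attempt to reproduce it.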
About the journal:
Spatial Statistics publishes articles on the theory and application of spatial and spatio-temporal statistics. It favours manuscripts that present theory generated by new applications, or in which new theory is applied to an important practical case. A purely theoretical study will only rarely be accepted. Pure case studies without methodological development are not acceptable for publication.
Spatial statistics concerns the quantitative analysis of spatial and spatio-temporal data, including their statistical dependencies, accuracy and uncertainties. Methodology for spatial statistics is typically found in probability theory, stochastic modelling and mathematical statistics as well as in information science. Spatial statistics is used in mapping, assessing spatial data quality, sampling design optimisation, modelling of dependence structures, and drawing of valid inference from a limited set of spatio-temporal data.