{"title":"Approximate tolerance intervals for nonparametric regression models","authors":"Yafan Guo, Derek S. Young","doi":"10.1080/10485252.2023.2277260","DOIUrl":"https://doi.org/10.1080/10485252.2023.2277260","url":null,"abstract":"AbstractTolerance intervals in regression allow the user to quantify, with a specified degree of confidence, bounds for a specified proportion of the sampled population when conditioned on a set of covariate values. While methods are available for tolerance intervals in fully-parametric regression settings, the construction of tolerance intervals for nonparametric regression models has been treated in a limited capacity. This paper fills this gap and develops likelihood-based approaches for the construction of pointwise one-sided and two-sided tolerance intervals for nonparametric regression models. A numerical approach is also presented for constructing simultaneous tolerance intervals. An appealing facet of this work is that the resulting methodology is consistent with what is done for fully-parametric regression tolerance intervals. Extensive coverage studies are presented, which demonstrate very good performance of the proposed methods. The proposed tolerance intervals are calculated and interpreted for analyses involving a fertility dataset and a triceps measurement dataset.Keywords: Bootstrapboundary effectscoverage probabilitiesk-factorsmoothing splineAMS Subject Classifications: 62G0862G15 AcknowledgmentsWe would thank the University of Kentucky Center for Computational Sciences and Information Technology Services Research Computing for their support and use of the Lipscomb Compute Cluster and associated research computing resources. The authors are also thankful to the Associate Editor and two reviewers who provided numerous insightful comments that improved the overall quality of this work.Disclosure statementNo potential conflict of interest was reported by the author(s).Data availability statementThe fertility data are available at the HFC's website bluehttps://www.fertilitydata.org/cgi-bin/data.php. The triceps data are available in the R package MultiKink (Wan and Zhong Citation2020), and can be accessed by typing data(triceps).","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"33 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135933082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Penalised estimation of partially linear additive zero-inflated Bernoulli regression models","authors":"Minggen Lu, Chin-Shang Li, Karla D. Wagner","doi":"10.1080/10485252.2023.2275056","DOIUrl":"https://doi.org/10.1080/10485252.2023.2275056","url":null,"abstract":"AbstractWe develop a practical and computationally efficient penalised estimation approach for partially linear additive models to zero-inflated binary outcome data. To facilitate estimation, B-splines are employed to approximate unknown nonparametric components. A two-stage iterative expectation-maximisation (EM) algorithm is proposed to calculate penalised spline estimates. The large-sample properties such as the uniform convergence and the optimal rate of convergence for functional estimators, and the asymptotic normality and efficiency for regression coefficient estimators are established. Further, two variance-covariance estimation approaches are proposed to provide reliable Wald-type inference for regression coefficients. We conducted an extensive Monte Carlo study to evaluate the numerical properties of the proposed penalised methodology and compare it to the competing spline method [Li and Lu. ‘Semiparametric Zero-Inflated Bernoulli Regression with Applications’, Journal of Applied Statistics, 49, 2845–2869]. The methodology is further illustrated by an egocentric network study.Keywords: Additive Bernoulli regressionB-splineEM algorithmpenalised estimationzero-inflatedAMS SUBJECT CLASSIFICATIONS: 62G0562G2062G08 AcknowledgmentsThe authors are grateful to the Editor, the Associate Editor, and two reviewers for their useful comments and constructive suggestions which led to significant improvement in the revised manuscript.Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationFundingThis research was partially supported by the National Institute on Drug Abuse (NIDA) of the National Institutes of Health under Award Number R01DA038185.","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"74 11","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136235012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A modified Nadaraya–Watson procedure for variable selection and nonparametric prediction with missing data","authors":"Kin Yap Cheung, Stephen M. S. Lee","doi":"10.1080/10485252.2023.2270079","DOIUrl":"https://doi.org/10.1080/10485252.2023.2270079","url":null,"abstract":"AbstractWe propose a new method for variable selection and prediction under a nonparametric regression setting, where a covariate may be missing either because its value is hidden from the observer or because it is inapplicable to the particular subject being observed. Despite its practical relevance, the problem has received little attention in the literature and its solutions are largely non-existent. Our proposal hinges on the construction of a modified Nadaraya–Watson estimator of the conditional mean regression function, with its bandwidths regularised to select variables and its weights adapted to accommodate different types of missingness. The method allows for information sharing across different missing data patterns without affecting consistency of the estimator. Unlike other conventional methods such as those based on imputations or likelihoods, our method requires only mild assumptions on the model and the missingness mechanism. For prediction we focus on finding relevant variables for predicting mean responses, conditional on covariate vectors subject to a given type of missingness. Our theoretical and numerical results show that the new method is consistent in variable selection and yields better prediction accuracy compared to existing methods.KEYWORDS: Nadaraya–Watson estimatormissing datanonparametric regressionvariable selection Disclosure statementNo potential conflict of interest was reported by the author(s).","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135883797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Threshold selection for extremal index estimation","authors":"Natalia M. Markovich, Igor V. Rodionov","doi":"10.1080/10485252.2023.2266050","DOIUrl":"https://doi.org/10.1080/10485252.2023.2266050","url":null,"abstract":"ABSTRACTWe propose a new threshold selection method for nonparametric estimation of the extremal index of stochastic processes. The discrepancy method was proposed as a data-driven smoothing tool for estimation of a probability density function. Now it is modified to select a threshold parameter of an extremal index estimator. A modification of the discrepancy statistic based on the Cramér–von Mises–Smirnov statistic ω2 is calculated by k largest order statistics instead of an entire sample. Its asymptotic distribution as k→∞ is proved to coincide with the ω2-distribution. Its quantiles are used as discrepancy values. The convergence rate of an extremal index estimate coupled with the discrepancy method is derived. The discrepancy method is used as an automatic threshold selection for the intervals and K-gaps estimators. It may be applied to other estimators of the extremal index. The performance of our method is evaluated by simulated and real data examples.KEYWORDS: Cramér–von Mises–Smirnov statisticdiscrepancy methodextremal indexnonparametric estimationthreshold selectionAMS SUBJECT CLASSIFICATION:: 62G32 Disclosure statementNo potential conflict of interest was reported by the author(s).Notes1 The connection between (Equation1(1) ωn2=n∫−∞∞(Fn(x)−F(x))2dF(x)(1) ) and (Equation2(2) ω^n2(h)=∑i=1n(F^h(Xi,n)−i−0.5n)2+112n(2) ) can be found in Markovich (Citation2007, p. 81).2 Theoretically, events {Ti=1} are allowed. In practice, such cases related to single inter-arrival times between consecutive exceedances are meaningless.3 The modification (ω^n2−0.4/n+0.6/n2)(1+1/n) of classical statistic (Equation2(2) ω^n2(h)=∑i=1n(F^h(Xi,n)−i−0.5n)2+112n(2) ) eliminates the dependence of the percentage points of the C–M–S statistic on the sample size (Stephens Citation1974). For n>40 it changes the statistic on less than one percent. One can use the modification with regard to ω~L2(θ^) for finite L due to the closeness of its distribution to the limit distribution of the C–M–S statistic by Theorem 3.2.Additional informationFundingThe work of N.M. Markovich in Sections 1, 2, 4 and 5 was supported by the Russian Science Foundation [grant number 22-21-00177]. The work of I. V. Rodionov in Section 3 and proofs in Markovich and Rodionov (Citation2022) was performed at the Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences with the support of the Russian Science Foundation (grant No. 21-71-00035).","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135805169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-free prediction of time series: a nonparametric approach","authors":"Mohammad Mohammadi, Meng Li","doi":"10.1080/10485252.2023.2266740","DOIUrl":"https://doi.org/10.1080/10485252.2023.2266740","url":null,"abstract":"AbstractWe propose a novel approach for model-free time series forecasting. Unlike most existing methods, the proposed method does not rely on parametric error distributions nor assume parametric forms of the mean function, leading to broad applicability. We achieve such generality by establishing a simple but powerful representation of a time series {Xt;t∈Z} with suptE|Xt|<∞, that is, Xt has a solution which is a linear combination of infinite past values. Then using the obtained solution a prediction algorithm is presented, with large sample theoretical guarantees. Simulation studies show favourable performance of the proposed method compared with popular parametric and neural networks methods, and suggest its superiority when the sample size is small. An application to practical time series is discussed.Keywords: Predictionnonparametric methodsneural networksα-stable distributionMSC2010 subject classifications:: Primary: 60G25Secondary: 62M20 Disclosure statementNo potential conflict of interest was reported by the author(s).Notes1 See https://www.sciencedirect.com/topics/engineering/left-inverse.","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"107 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136097527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On estimation of covariance function for functional data with detection limits","authors":"Haiyan Liu, Jeanine Houwing-Duistermaat","doi":"10.1080/10485252.2023.2258999","DOIUrl":"https://doi.org/10.1080/10485252.2023.2258999","url":null,"abstract":"In many studies on disease progression, biomarkers are restricted by detection limits, hence informatively missing. Current approaches ignore the problem by just filling in the value of the detection limit for the missing observations for the estimation of the mean and covariance function, which yield inaccurate estimation. Inspired by our recent work [Liu and Houwing-Duistermaat (2022), ‘Fast Estimators for the Mean Function for Functional Data with Detection Limits’, Stat, e467.] in which novel estimators for mean function for data subject to detection limit are proposed, in this paper, we will propose a novel estimator for the covariance function for sparse and dense data subject to a detection limit. We will derive the asymptotic properties of the estimator. We will compare our method to the standard method, which ignores the detection limit, via simulations. We will illustrate the new approach by analysing biomarker data subject to a detection limit. In contrast to the standard method, our method appeared to provide more accurate estimates of the covariance. Moreover its computation time is small.","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135106403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fighting selection bias in statistical learning: application to visual recognition from biased image databases","authors":"Stephan Clémençon, Pierre Laforgue, Robin Vogel","doi":"10.1080/10485252.2023.2259011","DOIUrl":"https://doi.org/10.1080/10485252.2023.2259011","url":null,"abstract":"AbstractIn practice, and especially when training deep neural networks, visual recognition rules are often learned based on various sources of information. On the other hand, the recent deployment of facial recognition systems with uneven performances on different population segments has highlighted the representativeness issues induced by a naive aggregation of the datasets. In this paper, we show how biasing models can remedy these problems. Based on the (approximate) knowledge of the biasing mechanisms at work, our approach consists in reweighting the observations, so as to form a nearly debiased estimator of the target distribution. One key condition is that the supports of the biased distributions must partly overlap, and cover the support of the target distribution. In order to meet this requirement in practice, we propose to use a low dimensional image representation, shared across the image databases. Finally, we provide numerical experiments highlighting the relevance of our approach.Keywords: Sampling biasselection effectvisual recognitionreliable statistical learning Disclosure statementNo potential conflict of interest was reported by the author(s).Additional informationFundingThis work was partially supported by the research chair ‘Good In Tech : Rethinking innovation and technology as drivers of a better world for and by humans’, under the auspices of the ‘Fondation du Risque’ and in partnership with the Institut Mines-Télécom, Sciences Po, Afnor, Ag2r La Mondiale, CGI France, Danone and Sycomore.","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135106810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient nonparametric estimation of generalised autocovariances","authors":"Alessandra Luati, Francesca Papagni, Tommaso Proietti","doi":"10.1080/10485252.2023.2252527","DOIUrl":"https://doi.org/10.1080/10485252.2023.2252527","url":null,"abstract":"This paper provides a necessary and sufficient condition for asymptotic efficiency of a nonparametric estimator of the generalised autocovariance function of a stationary random process. The generalised autocovariance function is the inverse Fourier transform of a power transformation of the spectral density and encompasses the traditional and inverse autocovariance functions as particular cases. A nonparametric estimator is based on the inverse discrete Fourier transform of the power transformation of the pooled periodogram. We consider two cases: the fixed bandwidth design and the adaptive bandwidth design. The general result on the asymptotic efficiency, established for linear processes, is then applied to the class of stationary ARMA processes and its implications are discussed. Finally, we illustrate that for a class of contrast functionals and spectral densities, the minimum contrast estimator of the spectral density satisfies a Yule–Walker system of equations in the generalised autocovariance estimator.","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"63 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134950151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonparametric relative error estimation of the regression function for left truncated and right censored time series data","authors":"N. Bayarassou, F. Hamrani, E. Ould Saïd","doi":"10.1080/10485252.2023.2241572","DOIUrl":"https://doi.org/10.1080/10485252.2023.2241572","url":null,"abstract":"","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"30 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76728038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boundary-adaptive kernel density estimation: the case of (near) uniform density","authors":"J. Racine, Qi Li, Qiaoyu Wang","doi":"10.1080/10485252.2023.2250011","DOIUrl":"https://doi.org/10.1080/10485252.2023.2250011","url":null,"abstract":"","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"110 1","pages":""},"PeriodicalIF":1.2,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73284507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}