{"title":"Empirical likelihood in a partially linear single-index model with censored response data","authors":"Liugen Xue","doi":"10.1016/j.csda.2023.107912","DOIUrl":"10.1016/j.csda.2023.107912","url":null,"abstract":"<div><p><span><span>An empirical likelihood (EL) approach for a partially linear single-index model with censored response data is studied. A bias-corrected EL ratio is proposed, and the asymptotic chi-squared distribution of this ratio is obtained. The result can be directly used to construct the confidence regions of the regression parameters. The estimators of regression parameters and link function are constructed, and their </span>asymptotic distributions are obtained. Also, a confidence band of the link function is constructed. The proposed method has two main features: The first feature is that the EL ratio is calibrated directly from within, instead of multiplying an adjustment factor by an EL ratio, which reflects the nature of EL. The second feature is avoiding undersmoothing of nonparametric functions, thus ensuring the </span><span><math><msqrt><mrow><mi>n</mi></mrow></msqrt></math></span>-consistency of the parameter estimator. As a byproduct, the EL and estimation of a single-index model with censored response data are studied. The performance of the bias-corrected EL is evaluated by simulation studies. 
The proposed method is illustrated with an example of a real data analysis.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139456535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized latent space model for one-mode networks with awareness of two-mode networks","authors":"Xinyan Fan , Kuangnan Fang , Dan Pu , Ruixuan Qin","doi":"10.1016/j.csda.2023.107915","DOIUrl":"10.1016/j.csda.2023.107915","url":null,"abstract":"<div><p>Latent space models have been widely studied for one-mode networks, in which the same type of nodes connect with each other. In many applications, one-mode networks are often observed along with two-mode networks, which reflect connections between different types of nodes and provide important information for understanding the one-mode network structure. However, the classical one-mode latent space models have several limitations in incorporating two-mode networks. To address this gap, a generalized latent space model is proposed to capture common structures and heterogeneous connecting patterns across one-mode and two-mode networks. Specifically, each node is embedded with a latent vector and network-specific degree parameters that determine the connection probabilities<span> between nodes. A projected gradient descent algorithm is developed to estimate the latent vectors and degree parameters. Moreover, the theoretical properties of the estimators are established and it has been proven that the estimation accuracy of the shared latent vectors can be improved through incorporating two-mode networks. 
Finally, simulation studies and applications on two real-world datasets demonstrate the usefulness of the proposed model.</span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139455082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Oracle-efficient estimation and trend inference in non-stationary time series with trend and heteroscedastic ARMA error","authors":"Chen Zhong","doi":"10.1016/j.csda.2024.107917","DOIUrl":"https://doi.org/10.1016/j.csda.2024.107917","url":null,"abstract":"<div><p><span>The non-stationary time series often contain an unknown trend and unobserved error terms. The error terms in the proposed model consist of a smooth variance function and the latent stationary ARMA series, which allows heteroscedasticity at different </span>time points<span>. The theoretically justified two-step B-spline estimation method is proposed for the trend and variance function in the model, and then residuals are obtained by removing the trend and variance function estimators from the data. The maximum likelihood estimator<span><span><span> (MLE) for the latent ARMA error coefficients based on the residuals is shown to be oracally efficient in the sense that it has the same </span>asymptotic distribution<span> as the infeasible MLE if the trend and variance function were known. In addition to the oracle efficiency, a kernel estimator is obtained for the trend function and shown to converge to the </span></span>Gumbel distribution. It yields an asymptotically correct simultaneous confidence band (SCB) for the trend function, which can be used to test the specific form of trend. 
A simulation-based procedure is proposed to implement the SCB, and simulation and real data analysis illustrate the finite sample performance.</span></span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139433912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Change point detection via feedforward neural networks with theoretical guarantees","authors":"Houlin Zhou, Hanbing Zhu, Xuejun Wang","doi":"10.1016/j.csda.2023.107913","DOIUrl":"https://doi.org/10.1016/j.csda.2023.107913","url":null,"abstract":"<div><p><span>This article mainly studies change point detection for the mean shift<span> change point model. An estimation method is proposed to estimate the change point via feedforward neural networks. The complete </span></span><em>f</em><span>-moment consistency of the proposed estimator is obtained. Numerical simulation results show that the proposed estimator performs better than the cumulative sum type estimator, which is widely used in change point detection, especially when the mean shift signal size is small. Finally, we demonstrate the proposed method by empirically analyzing a stock data set.</span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139434483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Group variable selection via group sparse neural network","authors":"Xin Zhang , Junlong Zhao","doi":"10.1016/j.csda.2023.107911","DOIUrl":"10.1016/j.csda.2023.107911","url":null,"abstract":"<div><p><span>Group variable selection is an important issue in high-dimensional data modeling, and most existing methods consider only the linear model. Therefore, a new method based on the deep neural network<span><span><span> (DNN), an increasingly popular nonlinear method in both statistics and </span>deep learning communities, is proposed. The method is applicable to general </span>nonlinear models, including the linear model as a special case. Specifically, a </span></span><span><em>group sparse </em><em>neural network</em></span> (GSNN) is designed, where the definition of <em>nonlinear group high-level features</em> (NGHFs) is generalized to the network structure. A <em>two-stage group sparse</em><span><span> (TGS) algorithm is employed to induce group variable selection by performing group structure selection on the network. GSNN is promising for complex nonlinear systems with interactions and </span>correlated predictors, overcoming the shortcomings of linear or marginal variable selection methods. Theoretical results on convergence and group-level selection consistency are also given. 
Simulation results and real data analysis demonstrate the superiority of our method.</span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139062656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HiQR: An efficient algorithm for high-dimensional quadratic regression with penalties","authors":"Cheng Wang , Haozhe Chen , Binyan Jiang","doi":"10.1016/j.csda.2023.107904","DOIUrl":"10.1016/j.csda.2023.107904","url":null,"abstract":"<div><p><span><span>This paper investigates the efficient solution of penalized quadratic regressions in high-dimensional settings. A novel and efficient algorithm for ridge-penalized quadratic regression is proposed, leveraging the matrix structures of the regression with interactions. Additionally, an </span>alternating direction method of multipliers (ADMM) framework is developed for penalized quadratic regression with general penalties, including both single and hybrid penalty functions. The approach simplifies the calculations to basic matrix-based operations, making it appealing in terms of both memory storage and </span>computational complexity for solving penalized quadratic regressions in high-dimensional settings.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139063073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subgroup detection based on partially linear additive individualized model with missing data in response","authors":"Tingting Cai , Jianbo Li , Qin Zhou , Songlou Yin , Riquan Zhang","doi":"10.1016/j.csda.2023.107910","DOIUrl":"https://doi.org/10.1016/j.csda.2023.107910","url":null,"abstract":"<div><p><span>Based on a partially linear additive individualized model, a fusion-penalized inverse probability<span> weighted least squares method<span> is proposed to detect the subgroup for missing data in response. Firstly, the B-spline technique is used to approximate the unknown additive individualized functions and then an inverse probability weighted quadratic loss function<span> is established with a fusion penalty on the difference of subject-wise B-spline coefficients. Secondly, minimization of this quadratic loss function leads to the estimation of linear regression parameters<span> and individualized B-spline coefficients. With a proper tuning parameter, some differences in the penalty term are shrunk to zero and thus the corresponding subjects will be clustered into the same subgroup. Thirdly, a </span></span></span></span></span>clustering method<span> is developed to automatically determine the subgroup membership for the subjects with missing data. Fourthly, large sample properties of the resulting estimates are given under some regularity conditions. 
Finally, numerical studies are presented to illustrate the performance of the proposed subgroup detection method.</span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138838653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrepancy between structured matrices in the power analysis of a separability test","authors":"Katarzyna Filipiak , Daniel Klein , Monika Mokrzycka","doi":"10.1016/j.csda.2023.107907","DOIUrl":"10.1016/j.csda.2023.107907","url":null,"abstract":"<div><p><span><span>An important task in the analysis of multivariate data is testing of the </span>covariance matrix<span><span> structure. In particular, for assessing separability, various tests have been proposed. However, the development of a method of measuring discrepancy between two covariance matrix structures, in relation to the study of the power of the test, remains an open problem. Therefore, a </span>discrepancy measure is proposed such that for two arbitrary alternative hypotheses with the same value of discrepancy, the power of tests remains stable, while for increasing discrepancy the power increases. The basic hypothesis is related to the separable structure of the </span></span>observation matrix<span><span><span><span> under a doubly multivariate normal model, as assessed by the likelihood ratio and Rao score tests. 
It is shown that the particular one-parameter method and the </span>Frobenius norm fail in the power analysis of tests, while the entropy and </span>quadratic loss functions<span> can be efficiently used to measure the discrepancy between separable and non-separable covariance structures for a </span></span>multivariate normal distribution.</span></p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138680557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Laplace-based model with flexible tail behavior","authors":"Cristina Tortora , Brian C. Franczak , Luca Bagnato , Antonio Punzo","doi":"10.1016/j.csda.2023.107909","DOIUrl":"10.1016/j.csda.2023.107909","url":null,"abstract":"<div><p>The proposed multiple scaled contaminated asymmetric Laplace (MSCAL) distribution is an extension of the multivariate asymmetric Laplace distribution to allow for a different excess kurtosis on each dimension and for more flexible shapes of the hyper-contours. These peculiarities are obtained by working on the principal component (PC) space. The structure of the MSCAL distribution has the further advantage of allowing for automatic PC-wise outlier detection – i.e., detection of outliers separately on each PC – when convenient constraints on the parameters are imposed. The MSCAL is fitted using a Monte Carlo expectation-maximization (MCEM) algorithm that uses a Monte Carlo method to estimate the orthogonal matrix of eigenvectors. A simulation study is used to assess the proposed MCEM in terms of computational efficiency and parameter recovery. In a real data application, the MSCAL is fitted to a real data set containing the anthropometric measurements of monozygotic/dizygotic twins. 
Both a skewed bivariate subset of the full data, perturbed by some outlying points, and the full data are considered.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0167947323002207/pdfft?md5=d2a7615bc71ed59a59a646714a4b93c6&pid=1-s2.0-S0167947323002207-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138680780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-based spatial segmentation of areal data","authors":"Vivien Goepp , Jan van de Kassteele","doi":"10.1016/j.csda.2023.107908","DOIUrl":"10.1016/j.csda.2023.107908","url":null,"abstract":"<div><p><span>Smoothing is often used to improve the readability and interpretability of noisy areal data. However, there are many instances where the underlying quantity is discontinuous. For such cases, specific methods are needed to estimate the piecewise constant spatial process. A well-known approach in this setting is to perform segmentation of the signal using the adjacency graph, such as the graph-based fused lasso. However, this method does not scale well to large graphs. A new method is introduced for piecewise constant spatial estimation that </span><em>(i)</em> is faster to compute on large graphs and <em>(ii)</em> yields sparser models than the fused lasso (for the same amount of regularization), resulting in estimates that are easier to interpret. The method is illustrated on simulated data and applied to real data on overweight prevalence in the Netherlands. Healthy and unhealthy zones are identified, which cannot be explained by demographic or socio-economic characteristics alone. The method is found capable of identifying such zones and can assist policymakers with their health improving strategies.</p></div>","PeriodicalId":55225,"journal":{"name":"Computational Statistics & Data Analysis","volume":null,"pages":null},"PeriodicalIF":1.8,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138680561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}