{"title":"A dynamic count process","authors":"Namhyun Kim , Pipat Wongsa-art , Yingcun Xia","doi":"10.1016/j.jspi.2024.106187","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106187","url":null,"abstract":"<div><p>This paper aims to complement the recent development of observation-driven models of dynamic counts with a parameter-driven one for a general case, particularly discrete two-parameter exponential family distributions. It proposes a finite semiparametric exponential mixture of SETAR processes for the conditional mean of counts to capture nonlinearity and complexity. Because of the intrinsic latency of the conditional mean, a general additive state-space representation of dynamic counts is first proposed; stationarity and geometric ergodicity are then established under a mild set of conditions. We also propose estimating the unknown parameters by quasi maximum likelihood and establish the asymptotic properties of the quasi maximum likelihood estimators (QMLEs), particularly <span><math><msqrt><mrow><mi>T</mi></mrow></msqrt></math></span>-consistency and normality, under a relatively mild set of conditions. Furthermore, the finite sample properties of the QMLEs are investigated via simulation exercises, and an illustration of the proposed process is presented by applying the method to the intraday transaction counts per minute of AstraZeneca stock.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106187"},"PeriodicalIF":0.9,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140894991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Consistency of the maximum likelihood estimator of population tree in a coalescent framework","authors":"Arindam RoyChoudhury","doi":"10.1016/j.jspi.2024.106172","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106172","url":null,"abstract":"<div><p>We present a proof of consistency of the maximum likelihood estimator (MLE) of the population tree in a previously proposed coalescent model. As the model involves tree-topology as a parameter, the standard proof of consistency for continuous parameters does not directly apply. In addition to proving that a consistent sequence of MLEs exists, we also prove that the overall MLE, computed by maximizing the likelihood over all tree-topologies, is consistent. Thus, the MLE of tree-topology is consistent as well. The last result is important because local maxima occur in the likelihood of population trees, especially when the likelihood is maximized separately for each tree-topology. Even though the MLE is known to be a dependable estimator under this model, our work proves its effectiveness with mathematical certainty.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106172"},"PeriodicalIF":0.9,"publicationDate":"2024-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140632639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Augmented projection Wasserstein distances: Multi-dimensional projection with neural surface","authors":"Miyu Sugimoto , Ryo Okano , Masaaki Imaizumi","doi":"10.1016/j.jspi.2024.106185","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106185","url":null,"abstract":"<div><p>The Wasserstein distance is a fundamental tool for comparing probability distributions and has found broad applications in various fields, including image generation using generative adversarial networks. Despite its useful properties, the performance of the Wasserstein distance degrades when data are high-dimensional, a phenomenon known as the curse of dimensionality. To mitigate this issue, extensions of the Wasserstein distance have been developed, such as the sliced Wasserstein distance, which uses one-dimensional projections. However, such extensions lose information about the original data because of the linear projection onto a one-dimensional space. In this paper, we propose novel distances named augmented projection Wasserstein distances (APWDs) to address these issues, which utilize multi-dimensional projection with a nonlinear surface given by a neural network. The APWDs employ a two-step procedure: they first map data onto a nonlinear surface by a neural network, then linearly project the mapped data into a multi-dimensional space. We also give an algorithm to select a subspace for the multi-dimensional projection. The APWDs are computationally efficient while preserving nonlinear information in the data. We theoretically confirm that the APWDs mitigate the curse of dimensionality. Our experiments demonstrate the APWDs’ outstanding performance and robustness to noise, particularly in the context of nonlinear high-dimensional data.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106185"},"PeriodicalIF":0.9,"publicationDate":"2024-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000429/pdfft?md5=d9eef2f8ec0fb76099ca4281dc2a0b63&pid=1-s2.0-S0378375824000429-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140632638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed optimal subsampling for quantile regression with massive data","authors":"Yue Chao, Xuejun Ma, Boya Zhu","doi":"10.1016/j.jspi.2024.106186","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106186","url":null,"abstract":"<div><p>Methods for reducing distributed subsample sizes have become increasingly popular statistical problems in the big data era. Optimal subsample selection for massive linear and generalized linear models with distributed data sources has been thoroughly investigated and widely applied. Nevertheless, few studies have developed distributed optimal subsample selection procedures for quantile regression with massive data. In such settings, the distributed optimal subsampling probabilities and the subset-size selection criteria need to be established simultaneously. In this work, we propose a distributed subsampling technique for quantile regression models. The estimation approach is based on a two-step algorithm for the distributed subsampling procedures. Furthermore, the theoretical results, such as consistency and asymptotic normality of the resultant estimators, are rigorously established under some regularity conditions. The empirical performance of the proposed subsampling method is evaluated through simulation experiments and real data applications.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106186"},"PeriodicalIF":0.9,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140638708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Entropic regularization of neural networks: Self-similar approximations","authors":"Amir R. Asadi, Po-Ling Loh","doi":"10.1016/j.jspi.2024.106181","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106181","url":null,"abstract":"<div><p>This paper focuses on entropic regularization and its multiscale extension in neural network learning. We leverage established results that characterize the optimizer of entropic regularization methods and their connection with generalization bounds. To avoid the significant computational complexity involved in sampling from the optimal multiscale Gibbs distributions, we describe how to make measured concessions in optimality by using self-similar approximating distributions. We study such scale-invariant approximations for linear neural networks and further extend the approximations to neural networks with nonlinear activation functions. We then illustrate the application of our proposed approach through empirical simulation. By navigating the interplay between optimization and computational efficiency, our research contributes to entropic regularization theory, proposing a practical method that embraces symmetry across scales.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106181"},"PeriodicalIF":0.9,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000387/pdfft?md5=fcc1f48fea9b9d957df56a1c168f3f74&pid=1-s2.0-S0378375824000387-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140643824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiplier subsample bootstrap for statistics of time series","authors":"Ruru Ma, Shibin Zhang","doi":"10.1016/j.jspi.2024.106183","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106183","url":null,"abstract":"<div><p>Block-based bootstrap, block-based subsampling and multiplier bootstrap are three common nonparametric tools for statistical inference under dependent observations. Combining the ideas of these three, a novel resampling approach, the multiplier subsample bootstrap (MSB), is proposed. Instead of generating a resample from the observations, the MSB imitates the statistic by weighting the block-based subsample statistics with independent standard Gaussian random variables. Given the asymptotic normality of the statistic, the bootstrap validity is established under some mild moment conditions. Building on the idea of the MSB, another resampling approach, the hybrid multiplier subsampling periodogram bootstrap (HMP), is developed in the paper for mimicking frequency-domain spectral mean statistics. A simulation study demonstrates that both the MSB and HMP achieve good performance.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106183"},"PeriodicalIF":0.9,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140607310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unified Fourier slice method to derive ridgelet transform for a variety of depth-2 neural networks","authors":"Sho Sonoda , Isao Ishikawa , Masahiro Ikeda","doi":"10.1016/j.jspi.2024.106184","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106184","url":null,"abstract":"<div><p>When investigating neural network parameters, it is easier to study the distribution of parameters than the parameters in each neuron. The ridgelet transform is a pseudo-inverse operator that maps a given function <span><math><mi>f</mi></math></span> to the parameter distribution <span><math><mi>γ</mi></math></span> so that a network <span><math><mrow><mstyle><mi>N</mi><mi>N</mi></mstyle><mrow><mo>[</mo><mi>γ</mi><mo>]</mo></mrow></mrow></math></span> reproduces <span><math><mi>f</mi></math></span>, i.e. <span><math><mrow><mstyle><mi>N</mi><mi>N</mi></mstyle><mrow><mo>[</mo><mi>γ</mi><mo>]</mo></mrow><mo>=</mo><mi>f</mi></mrow></math></span>. For depth-2 fully-connected networks on a Euclidean space, the ridgelet transform has been obtained in closed form, so we can describe how the parameters are distributed. However, for a variety of modern neural network architectures, no closed-form expression has been known. In this paper, we explain a systematic method using Fourier expressions to derive ridgelet transforms for a variety of modern networks, such as networks on finite fields <span><math><msub><mrow><mi>F</mi></mrow><mrow><mi>p</mi></mrow></msub></math></span>, group convolutional networks on abstract Hilbert space <span><math><mi>H</mi></math></span>, fully-connected networks on noncompact symmetric spaces <span><math><mrow><mi>G</mi><mo>/</mo><mi>K</mi></mrow></math></span>, and pooling layers, i.e., the <span><math><mi>d</mi></math></span>-plane ridgelet transform.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106184"},"PeriodicalIF":0.9,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000417/pdfft?md5=98e3c89ff86925f67f13c56d174f0109&pid=1-s2.0-S0378375824000417-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140618803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust nonparametric regression based on deep ReLU neural networks","authors":"Juntong Chen","doi":"10.1016/j.jspi.2024.106182","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106182","url":null,"abstract":"<div><p>In this paper, we consider robust nonparametric regression using deep neural networks with the ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identically distributed heavy-tailed noise, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To address it robustly, we introduce a novel estimation procedure based on <span><math><mi>ℓ</mi></math></span>-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showcasing their robustness against contamination, outliers, and model misspecification. We then delve into the application of our approach using deep ReLU neural networks. When the model is well-specified and the regression function belongs to an <span><math><mi>α</mi></math></span>-Hölder class, employing <span><math><mi>ℓ</mi></math></span>-type estimation on suitable networks enables the resulting estimators to achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep <span><math><mi>ℓ</mi></math></span>-type estimators can circumvent the curse of dimensionality by assuming the regression function closely resembles the composition of several Hölder functions. To attain this, new deep fully-connected ReLU neural networks have been designed to approximate this composition class. This approximation result may be of independent interest.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106182"},"PeriodicalIF":0.9,"publicationDate":"2024-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000399/pdfft?md5=79a5bc36ebe3d6024d39b9f8adf1f910&pid=1-s2.0-S0378375824000399-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140649412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convergence guarantees for forward gradient descent in the linear regression model","authors":"Thijs Bos , Johannes Schmidt-Hieber","doi":"10.1016/j.jspi.2024.106174","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106174","url":null,"abstract":"<div><p>Renewed interest in the relationship between artificial and biological neural networks motivates the study of gradient-free methods. Considering the linear regression model with random design, we theoretically analyze in this work the biologically motivated (weight-perturbed) forward gradient scheme that is based on a random linear combination of the gradient. If <span><math><mi>d</mi></math></span> denotes the number of parameters and <span><math><mi>k</mi></math></span> the number of samples, we prove that the mean squared error of this method converges for <span><math><mrow><mi>k</mi><mo>≳</mo><msup><mrow><mi>d</mi></mrow><mrow><mn>2</mn></mrow></msup><mo>log</mo><mrow><mo>(</mo><mi>d</mi><mo>)</mo></mrow></mrow></math></span> with rate <span><math><mrow><msup><mrow><mi>d</mi></mrow><mrow><mn>2</mn></mrow></msup><mo>log</mo><mrow><mo>(</mo><mi>d</mi><mo>)</mo></mrow><mo>/</mo><mi>k</mi></mrow></math></span>. Compared to the dimension dependence <span><math><mi>d</mi></math></span> for stochastic gradient descent, an additional factor <span><math><mrow><mi>d</mi><mo>log</mo><mrow><mo>(</mo><mi>d</mi><mo>)</mo></mrow></mrow></math></span> occurs.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106174"},"PeriodicalIF":0.9,"publicationDate":"2024-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000314/pdfft?md5=fc5918288c472da3301b467d899078ad&pid=1-s2.0-S0378375824000314-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140536571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Toward improved inference for Krippendorff’s Alpha agreement coefficient","authors":"John Hughes","doi":"10.1016/j.jspi.2024.106170","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106170","url":null,"abstract":"<div><p>In this article I recommend a better point estimator for Krippendorff’s Alpha agreement coefficient, and develop a jackknife variance estimator that leads to much better interval estimation than does the customary bootstrap procedure or an alternative bootstrap procedure. Having developed the new methodology, I analyze nominal data previously analyzed by Krippendorff, and two experimentally observed datasets: (1) ordinal data from an imaging study of congenital diaphragmatic hernia, and (2) United States Environmental Protection Agency air pollution data for the Philadelphia, Pennsylvania area. The latter two applications are novel. The proposed methodology is now supported in version 2.0 of my open source R package, <span>krippendorffsalpha</span>, which supports common and user-defined distance functions, and can accommodate any number of units, any number of coders, and missingness. Interval computation can be parallelized.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"233 ","pages":"Article 106170"},"PeriodicalIF":0.9,"publicationDate":"2024-04-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140549711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}