Bruno Ebner, Adrian Fischer, Norbert Henze, Celeste Mayer
{"title":"Goodness-of-fit tests for the Weibull distribution based on the Laplace transform and Stein’s method","authors":"Bruno Ebner, Adrian Fischer, Norbert Henze, Celeste Mayer","doi":"10.1007/s10463-023-00873-7","DOIUrl":"10.1007/s10463-023-00873-7","url":null,"abstract":"<div><p>We propose novel goodness-of-fit tests for the Weibull distribution with unknown parameters. These tests are based on an alternative characterizing representation of the Laplace transform related to the density approach in the context of Stein’s method. Asymptotic theory of the tests is derived, including the limit null distribution, the behaviour under contiguous alternatives, the validity of the parametric bootstrap procedure, and consistency of the tests against a large class of alternatives. A Monte Carlo simulation study shows the competitiveness of the new procedure. Finally, the procedure is applied to real data examples taken from the materials science.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"1011 - 1038"},"PeriodicalIF":1.0,"publicationDate":"2023-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42393906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of complier causal treatment effects with informatively interval-censored failure time data","authors":"Yuqing Ma, Peijie Wang, Jianguo Sun","doi":"10.1007/s10463-023-00874-6","DOIUrl":"10.1007/s10463-023-00874-6","url":null,"abstract":"<div><p>Estimation of complier causal treatment effects has been discussed by many authors under different situations but only limited literature exists for interval-censored failure time data, which often occur in many areas such as longitudinal or periodical follow-up studies. Particularly it does not seem to exist a method that can deal with informative interval censoring, which can happen naturally and make the analysis much more challenging. Also, it has been shown that when the informative censoring exists, the analysis without taking it into account would yield biased or misleading results. To address this, we propose an estimated sieve maximum likelihood approach with the use of instrumental variables. The asymptotic properties of the resulting estimators of regression parameters are established, and a simulation study is performed and suggests that it works well. Finally, it is applied to a set of real data that motivated this study.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"1039 - 1062"},"PeriodicalIF":1.0,"publicationDate":"2023-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50030664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust variable selection with exponential squared loss for partially linear spatial autoregressive models","authors":"Xiuli Wang, Jingchang Shao, Jingjing Wu, Qiang Zhao","doi":"10.1007/s10463-023-00870-w","DOIUrl":"10.1007/s10463-023-00870-w","url":null,"abstract":"<div><p>In this paper, we consider variable selection for a class of semiparametric spatial autoregressive models based on exponential squared loss (ESL). Using the orthogonal projection technique, we propose a novel orthogonality-based variable selection procedure that enables simultaneous model selection and parameter estimation, and identifies the significance of spatial effects. Under appropriate conditions, we show that the proposed procedure is consistent and the resulting estimator has oracle properties. Furthermore, some simulation studies and an analysis of the Boston housing price data are also carried out to examine the finite-sample performance of the proposed method.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"949 - 977"},"PeriodicalIF":1.0,"publicationDate":"2023-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-023-00870-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48247773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical inference using regularized M-estimation in the reproducing kernel Hilbert space for handling missing data","authors":"Hengfang Wang, Jae Kwang Kim","doi":"10.1007/s10463-023-00872-8","DOIUrl":"10.1007/s10463-023-00872-8","url":null,"abstract":"<div><p>Imputation is a popular technique for handling missing data. We address a nonparametric imputation using the regularized M-estimation techniques in the reproducing kernel Hilbert space. Specifically, we first use kernel ridge regression to develop imputation for handling item nonresponse. Although this nonparametric approach is potentially promising for imputation, its statistical properties are not investigated in the literature. Under some conditions on the order of the tuning parameter, we first establish the root-<i>n</i> consistency of the kernel ridge regression imputation estimator and show that it achieves the lower bound of the semiparametric asymptotic variance. A nonparametric propensity score estimator using the reproducing kernel Hilbert space is also developed by the linear expression of the projection estimator. We show that the resulting propensity score estimator is asymptotically equivalent to the kernel ridge regression imputation estimator. Results from a limited simulation study are also presented to confirm our theory. The proposed method is applied to analyze air pollution data measured in Beijing, China.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"911 - 929"},"PeriodicalIF":1.0,"publicationDate":"2023-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-023-00872-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48637382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A goodness-of-fit test on the number of biclusters in a relational data matrix","authors":"Chihiro Watanabe, Taiji Suzuki","doi":"10.1007/s10463-023-00869-3","DOIUrl":"10.1007/s10463-023-00869-3","url":null,"abstract":"<div><p>Biclustering is a method for detecting homogeneous submatrices in a given matrix. Although there are many studies that estimate the underlying bicluster structure of a matrix, few have enabled us to determine the appropriate number of biclusters. Recently, a statistical test on the number of biclusters has been proposed for a regular-grid bicluster structure. However, when the latent bicluster structure does not satisfy such regular-grid assumption, the previous test requires a larger number of biclusters than necessary for the null hypothesis to be accepted, which is not desirable in terms of interpreting the accepted structure. In this study, we propose a new statistical test on the number of biclusters that does not require the regular-grid assumption and derive the asymptotic behavior of the proposed test statistic in both null and alternative cases. We illustrate the effectiveness of the proposed method by applying it to both synthetic and practical data matrices.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"979 - 1009"},"PeriodicalIF":1.0,"publicationDate":"2023-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42497102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene–environment interaction analysis under the Cox model","authors":"Kuangnan Fang, Jingmao Li, Yaqing Xu, Shuangge Ma, Qingzhao Zhang","doi":"10.1007/s10463-023-00871-9","DOIUrl":"10.1007/s10463-023-00871-9","url":null,"abstract":"<div><p>For the survival of cancer and many other complex diseases, gene–environment (G-E) interactions have been established as having essential importance. G-E interaction analysis can be roughly classified as marginal and joint, depending on the number of G variables analyzed at a time. In this study, we focus on joint analysis, which can better reflect disease biology and is statistically more challenging. Many approaches have been developed for joint G-E interaction analysis for survival outcomes and led to important findings. However, without rigorous statistical development, quite a few methods have a weak theoretical ground. To fill this knowledge gap, in this article, we consider joint G-E interaction analysis under the Cox model. Sparse group penalization is adopted for regularizing estimation and selecting important main effects and interactions. The “main effects, interactions” variable selection hierarchy, which has been strongly advocated in recent literature, is satisfied. Significantly advancing from some published studies, we rigorously establish the consistency properties under high dimensionality. An effective computational algorithm is developed, simulation demonstrates competitive performance of the proposed approach, and analysis of The Cancer Genome Atlas (TCGA) data on stomach adenocarcinoma (STAD) further demonstrates its practical utility.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"931 - 948"},"PeriodicalIF":1.0,"publicationDate":"2023-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-023-00871-9.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42111901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parametric estimation of spatial–temporal point processes using the Stoyan–Grabarnik statistic","authors":"Conor Kresin, Frederic Schoenberg","doi":"10.1007/s10463-023-00866-6","DOIUrl":"10.1007/s10463-023-00866-6","url":null,"abstract":"<div><p>A novel estimator for the parameters governing spatial–temporal point processes is proposed. Unlike the maximum likelihood estimator, the proposed estimator is fast and easy to compute, and does not require the computation or approximation of a computationally expensive integral. This parametric estimator is based on the Stoyan–Grabarnik (sum of inverse intensity) statistic and is shown to be consistent, under quite general conditions. Simulations are presented demonstrating the performance of the estimator.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 6","pages":"887 - 909"},"PeriodicalIF":1.0,"publicationDate":"2023-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41313716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic data-based bin width selection for rose diagram","authors":"Yasuhito Tsuruta, Masahiko Sagae","doi":"10.1007/s10463-023-00868-4","DOIUrl":"10.1007/s10463-023-00868-4","url":null,"abstract":"<div><p>A rose diagram is a representation that circularly organizes data with the bin width as the central angle. This diagram is widely used to display and summarize circular data. Some studies have proposed the selector of bin width based on data. However, only a few papers have discussed the property of these selectors from a statistical perspective. Thus, this study aims to provide a data-based bin width selector for rose diagrams using a statistical approach. We consider that the radius of the rose diagram is a nonparametric estimator of the square root of two times the circular density. We derive the mean integrated square error of the rose diagram and its optimal bin width and propose two new selectors: normal reference rule and biased cross-validation. We show that biased cross-validation converges to its optimizer. Additionally, we propose a polygon rose diagram to enhance the rose diagram.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 5","pages":"855 - 886"},"PeriodicalIF":1.0,"publicationDate":"2023-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10463-023-00868-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47513957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixture of shifted binomial distributions for rating data","authors":"Shaoting Li, Jiahua Chen","doi":"10.1007/s10463-023-00865-7","DOIUrl":"10.1007/s10463-023-00865-7","url":null,"abstract":"<div><p>Rating data are a kind of ordinal categorical data routinely collected in survey sampling. The response value in such applications is confined to a finite number of ordered categories. Due to population heterogeneity, the respondents may have several different rating styles. A finite mixture model is thus most suitable to fit datasets of this nature. In this paper, we propose a two-component mixture of shifted binomial distributions for rating data. We show that this model is identifiable and propose a numerically stable penalized likelihood approach for parameter estimation. We adapt an expectation-maximization algorithm for the penalized maximum likelihood estimation. Our simulation results show that the penalized maximum likelihood estimator is consistent and effective. We fit the proposed model and other models in the literature to some real-world datasets and find the proposed model can have much better fits.</p></div>","PeriodicalId":55511,"journal":{"name":"Annals of the Institute of Statistical Mathematics","volume":"75 5","pages":"833 - 853"},"PeriodicalIF":1.0,"publicationDate":"2023-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43469802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}