Test, Pub Date: 2024-06-11, DOI: 10.1007/s11749-024-00934-w
Ariane Marandon
{"title":"Conformal link prediction for false discovery rate control","authors":"Ariane Marandon","doi":"10.1007/s11749-024-00934-w","DOIUrl":"https://doi.org/10.1007/s11749-024-00934-w","url":null,"abstract":"<p>Most link prediction methods return estimates of the connection probability of missing edges in a graph. Such output can be used to rank the missing edges from most to least likely to be a true edge, but does not directly provide a classification into true and nonexistent. In this work, we consider the problem of identifying a set of true edges with a control of the false discovery rate (FDR). We propose a novel method based on high-level ideas from the literature on conformal inference. The graph structure induces intricate dependence in the data, which we carefully take into account, as this makes the setup different from the usual setup in conformal inference, where data exchangeability is assumed. The FDR control is empirically demonstrated for both simulated and real data.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"132 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-05-30, DOI: 10.1007/s11749-024-00931-z
Jean-Pierre Florens, Elia Lapenta
{"title":"Partly linear instrumental variables regressions without smoothing on the instruments","authors":"Jean-Pierre Florens, Elia Lapenta","doi":"10.1007/s11749-024-00931-z","DOIUrl":"https://doi.org/10.1007/s11749-024-00931-z","url":null,"abstract":"<p>We consider a semiparametric partly linear model identified by instrumental variables. We propose an estimation method that does not smooth on the instruments and we extend the Landweber–Fridman regularization scheme to the estimation of this semiparametric model. We then show the asymptotic normality of the parametric estimator and obtain the convergence rate for the nonparametric estimator. Our estimator that does not smooth on the instruments coincides with a typical estimator that does smooth on the instruments but keeps the respective bandwidth fixed as the sample size increases. We propose a data driven method for the selection of the regularization parameter, and in a simulation study we show the attractive performance of our estimators.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"64 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new sufficient dimension reduction method via rank divergence","authors":"Tianqing Liu, Danning Li, Fengjiao Ren, Jianguo Sun, Xiaohui Yuan","doi":"10.1007/s11749-024-00929-7","DOIUrl":"https://doi.org/10.1007/s11749-024-00929-7","url":null,"abstract":"<p>Sufficient dimension reduction is commonly performed to achieve data reduction and help data visualization. Its main goal is to identify functions of the predictors that are smaller in number than the predictors and contain the same information as the predictors for the response. In this paper, we are concerned with the linear functions of the predictors, which determine a central subspace that preserves sufficient information about the conditional distribution of a response given covariates. Many methods have been developed in the literature for the estimation of the central subspace. However, most of the existing sufficient dimension reduction methods are sensitive to outliers and require some strict restrictions on both covariates and response. To address this, we propose a novel dependence measure, rank divergence, and develop a rank divergence-based sufficient dimension reduction approach. The new method only requires some mild conditions on the covariates and response and is robust to outliers or heavy-tailed distributions. Moreover, it applies to both discrete or categorical covariates and multivariate responses. The consistency of the resulting estimator of the central subspace is established, and numerical studies suggest that it works well in practical situations.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"72 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141197326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-05-28, DOI: 10.1007/s11749-024-00932-y
Liliana Forzani, Daniela Rodriguez, Mariela Sued
{"title":"Asymptotic results for nonparametric regression estimators after sufficient dimension reduction estimation","authors":"Liliana Forzani, Daniela Rodriguez, Mariela Sued","doi":"10.1007/s11749-024-00932-y","DOIUrl":"https://doi.org/10.1007/s11749-024-00932-y","url":null,"abstract":"<p>Prediction, in regression and classification, is one of the main aims in modern data science. When the number of predictors is large, a common first step is to reduce the dimension of the data. Sufficient dimension reduction (SDR) is a well-established paradigm of reduction that keeps all the relevant information in the covariates <i>X</i> that is necessary for the prediction of <i>Y</i>. In practice, SDR has been successfully used as an exploratory tool for modeling after estimation of the sufficient reduction. Nevertheless, even if the estimated reduction is a consistent estimator of the population, there is no theory supporting this step when nonparametric regression is used in the imputed estimator. In this paper, we show that the asymptotic distribution of the nonparametric regression estimator remains unchanged whether the true SDR or its estimator is used. This result allows making inferences, for example, computing confidence intervals for the regression function, thereby avoiding the curse of dimensionality.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"70 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141173537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-04-09, DOI: 10.1007/s11749-024-00928-8
Zhijian Wang, Yunquan Song
{"title":"Privacy-preserving parametric inference for spatial autoregressive model","authors":"Zhijian Wang, Yunquan Song","doi":"10.1007/s11749-024-00928-8","DOIUrl":"https://doi.org/10.1007/s11749-024-00928-8","url":null,"abstract":"<p>Spatial regression models are important tools in dealing with spatially dependent data and are widely used in many fields such as spatial econometrics and regional science. When the spatial data contain sensitive information, the privacy of the data will be compromised along with the release of the analysis if appropriate privacy-preserving measures are not in place. In this paper, we study the privacy-preserving parametric inference for the spatial autoregressive model and propose a corresponding differentially private algorithm. We construct a differentially private spatial autoregression framework that takes graph data into account. We improve the functional mechanism to be more accurate under the same degree of privacy protection. Theoretical analysis establishes both the privacy guarantees of the algorithm and the asymptotic normality of the estimation. Simulation and real data studies show improvements of our approach.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"26 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140586664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-03-25, DOI: 10.1007/s11749-024-00926-w
Wenbiao Zhao, Lixing Zhu, Falong Tan
{"title":"Multiple change point detection for high-dimensional data","authors":"Wenbiao Zhao, Lixing Zhu, Falong Tan","doi":"10.1007/s11749-024-00926-w","DOIUrl":"https://doi.org/10.1007/s11749-024-00926-w","url":null,"abstract":"<p>This research investigates the detection of multiple change points in high-dimensional data without particular sparse or dense structure, where the dimension can be of exponential order in relation to the sample size. The estimation approach proposed employs a signal statistic based on a sequence of signal screening-based local U-statistics. This technique avoids costly computations that exhaustive search algorithms require and mitigates false positives, which hypothesis testing-based methods need to control. Consistency of estimation can be achieved for both the locations and number of change points, even when the number of change points diverges at a certain rate as the sample size increases. Additionally, the visualization nature of the proposed approach makes plotting the signal statistic a useful tool to identify locations of change points, which distinguishes it from existing methods in the literature. Numerical studies are performed to evaluate the effectiveness of the proposed technique in finite sample scenarios, and a real data analysis is presented to illustrate its application.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"19 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140297606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-03-18, DOI: 10.1007/s11749-024-00925-x
Yuexuan Wu, Chao Huang, Anuj Srivastava
{"title":"Rejoinder on: Shape-based functional data analysis","authors":"Yuexuan Wu, Chao Huang, Anuj Srivastava","doi":"10.1007/s11749-024-00925-x","DOIUrl":"https://doi.org/10.1007/s11749-024-00925-x","url":null,"abstract":"<p>We express our gratitude to the authors of five comment articles for their valuable contributions, feedback, and recommendations on our discussion document (Wu et al. Test, 2023). All the reviewers acknowledged the value of our proposed research direction, which focuses on shape-based functional data analysis. They also provided insightful suggestions to enhance and expand upon these ideas. In this response, we address their comments and provide further insights.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"22 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140166224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-03-14, DOI: 10.1007/s11749-024-00923-z
Jack Prothero, Meilei Jiang, Jan Hannig, Quoc Tran-Dinh, Andrew Ackerman, J. S. Marron
{"title":"Data integration via analysis of subspaces (DIVAS)","authors":"Jack Prothero, Meilei Jiang, Jan Hannig, Quoc Tran-Dinh, Andrew Ackerman, J. S. Marron","doi":"10.1007/s11749-024-00923-z","DOIUrl":"https://doi.org/10.1007/s11749-024-00923-z","url":null,"abstract":"<p>Modern data collection in many data paradigms, including bioinformatics, often incorporates multiple traits derived from different data types (i.e., platforms). We call this data multi-block, multi-view, or multi-omics data. The emergent field of data integration develops and applies new methods for studying multi-block data and identifying how different data types relate and differ. One major frontier in contemporary data integration research is methodology that can identify partially shared structure between sub-collections of data types. This work presents a new approach: Data Integration Via Analysis of Subspaces (DIVAS). DIVAS combines new insights in angular subspace perturbation theory with recent developments in matrix signal processing and convex–concave optimization into one algorithm for exploring partially shared structure. Based on principal angles between subspaces, DIVAS provides built-in inference on the results of the analysis, and is effective even in high-dimension-low-sample-size (HDLSS) situations.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"31 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-02-28, DOI: 10.1007/s11749-024-00922-0
{"title":"Testing covariance structures belonging to a quadratic subspace under a doubly multivariate model","authors":"","doi":"10.1007/s11749-024-00922-0","DOIUrl":"https://doi.org/10.1007/s11749-024-00922-0","url":null,"abstract":"<p>A hypothesis related to the block structure of a covariance matrix under the doubly multivariate normal model is studied. It is assumed that the block structure of the covariance matrix belongs to a quadratic subspace, and under the null hypothesis, each block of the covariance matrix also has a structure belonging to some quadratic subspace. The Rao score and the likelihood ratio test statistics are derived, and the exact distribution of the likelihood ratio test is determined. Simulation studies show the advantage of the Rao score test over the likelihood ratio test in terms of speed of convergence to the limiting chi-square distribution, while both proposed tests are competitive in terms of their power. The results are applied to both simulated and real-life example data.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"8 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140001677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Test, Pub Date: 2024-02-26, DOI: 10.1007/s11749-024-00920-2
Ryan P. Browne, Jeffrey L. Andrews
{"title":"The orthogonal skew model: computationally efficient multivariate skew-normal and skew-t distributions with applications to model-based clustering","authors":"Ryan P. Browne, Jeffrey L. Andrews","doi":"10.1007/s11749-024-00920-2","DOIUrl":"https://doi.org/10.1007/s11749-024-00920-2","url":null,"abstract":"<p>We introduce a parameterization for the multivariate skew normal and skew-<i>t</i> distributions, which enforces an orthogonal structure on the skewness parameter. This approach provides substantial benefits in computational efficiency during parameter estimation, resulting in a model which strikes an excellent balance between flexibility and model-fitting feasibility. We illustrate this primarily through implementing the proposed distributions in a mixture model-based clustering framework. We compare to competing skew distributions via both simulated and real data analyses, reporting both computation time and model-fit metrics.</p>","PeriodicalId":51189,"journal":{"name":"Test","volume":"42 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139968590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}