Statistical Theory and Related Fields: Latest Publications

A new result on recovery sparse signals using orthogonal matching pursuit
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-03-13 DOI: 10.1080/24754269.2022.2048445
Xueping Chen, Jianzhong Liu, Jiandong Chen
{"title":"A new result on recovery sparse signals using orthogonal matching pursuit","authors":"Xueping Chen, Jianzhong Liu, Jiandong Chen","doi":"10.1080/24754269.2022.2048445","DOIUrl":"https://doi.org/10.1080/24754269.2022.2048445","url":null,"abstract":"Orthogonal matching pursuit (OMP) algorithm is a classical greedy algorithm widely used in compressed sensing. In this paper, by exploiting the Wielandt inequality and some properties of orthogonal projection matrix, we obtained a new number of iterations required for the OMP algorithm to perform exact recovery of sparse signals, which improves significantly upon the latest results as we know.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"220 - 226"},"PeriodicalIF":0.5,"publicationDate":"2022-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43660484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
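The greedy iteration whose count the abstract's result bounds is compact enough to sketch. Below is a minimal NumPy implementation of standard OMP, not the authors' code; the stopping rule (run exactly k iterations, k being the target sparsity) and all names are illustrative.

```python
import numpy as np

def omp(A, y, k):
    """Minimal orthogonal matching pursuit: recover a k-sparse x from y = A @ x.

    A: (m, n) sensing matrix with unit-norm columns; y: (m,) measurement vector.
    Runs k iterations, selecting one atom per iteration.
    """
    n = A.shape[1]
    support = []                       # indices selected so far
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares refit on the current support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(n)
    x_hat[support] = coef
    return x_hat

# toy check: a 3-sparse signal from 40 Gaussian measurements is typically
# recovered exactly (up to numerical precision), so this should print True
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
A /= np.linalg.norm(A, axis=0)         # normalize columns
x = np.zeros(100)
x[[5, 17, 60]] = [1.5, -2.0, 0.7]
print(np.allclose(omp(A, A @ x, 3), x, atol=1e-6))
```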
A selective review of statistical methods using calibration information from similar studies
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-02-17 DOI: 10.1080/24754269.2022.2037201
J. Qin, Yukun Liu, Pengfei Li
{"title":"A selective review of statistical methods using calibration information from similar studies","authors":"J. Qin, Yukun Liu, Pengfei Li","doi":"10.1080/24754269.2022.2037201","DOIUrl":"https://doi.org/10.1080/24754269.2022.2037201","url":null,"abstract":"In the era of big data, divide-and-conquer, parallel, and distributed inference methods have become increasingly popular. How to effectively use the calibration information from each machine in parallel computation has become a challenging task for statisticians and computer scientists. Many newly developed methods have roots in traditional statistical approaches that make use of calibration information. In this paper, we first review some classical statistical methods for using calibration information, including simple meta-analysis methods, parametric likelihood, empirical likelihood, and the generalized method of moments. We further investigate how these methods incorporate summarized or auxiliary information from previous studies, related studies, or populations. We find that the methods based on summarized data usually have little or nearly no efficiency loss compared with the corresponding methods based on all-individual data. Finally, we review some recently developed big data analysis methods including communication-efficient distributed approaches, renewal estimation, and incremental inference as examples of the latest developments in methods using calibration information.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"175 - 190"},"PeriodicalIF":0.5,"publicationDate":"2022-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42114372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
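As a concrete instance of the "simple meta-analysis methods" the review starts from, the sketch below pools study-level summaries by fixed-effect inverse-variance weighting and compares the result with the full-data benchmark, illustrating the "little efficiency loss" point. The function name and the simulation setup are ours, not from the paper.

```python
import numpy as np

def inverse_variance_pool(estimates, ses):
    """Fixed-effect meta-analysis: combine study-level estimates by
    inverse-variance weighting. Returns the pooled estimate and its SE."""
    w = 1.0 / np.asarray(ses) ** 2
    est = np.sum(w * np.asarray(estimates)) / np.sum(w)
    se = np.sqrt(1.0 / np.sum(w))
    return est, se

# toy example: three "studies" estimating a common mean from split data
rng = np.random.default_rng(1)
full = rng.normal(loc=2.0, scale=1.0, size=3000)
parts = np.split(full, 3)
ests = [p.mean() for p in parts]
ses = [p.std(ddof=1) / np.sqrt(len(p)) for p in parts]

pooled_est, pooled_se = inverse_variance_pool(ests, ses)
print(pooled_est, pooled_se)                                # summary-data pooling
print(full.mean(), full.std(ddof=1) / np.sqrt(len(full)))   # all-data benchmark
```

On data like these the pooled summary-data estimate and its standard error essentially match the all-data benchmark, which is the efficiency claim the abstract makes for this class of methods.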
Optimal model averaging estimator for multinomial logit models
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-02-17 DOI: 10.1080/24754269.2022.2037204
Rongjie Jiang, Liming Wang, Yang Bai
{"title":"Optimal model averaging estimator for multinomial logit models","authors":"Rongjie Jiang, Liming Wang, Yang Bai","doi":"10.1080/24754269.2022.2037204","DOIUrl":"https://doi.org/10.1080/24754269.2022.2037204","url":null,"abstract":"In this paper, we study optimal model averaging estimators of regression coefficients in a multinomial logit model, which is commonly used in many scientific fields. A Kullback–Leibler (KL) loss-based weight choice criterion is developed to determine averaging weights. Under some regularity conditions, we prove that the resulting model averaging estimators are asymptotically optimal. When the true model is one of the candidate models, the averaged estimators are consistent. Simulation studies suggest the superiority of the proposed method over commonly used model selection criterions, model averaging methods, as well as some other related methods in terms of the KL loss and mean squared forecast error. Finally, the website phishing data is used to illustrate the proposed method.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"227 - 240"},"PeriodicalIF":0.5,"publicationDate":"2022-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41982683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
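A rough sketch of the weight-choice idea, under loud assumptions: candidate multinomial logit models are fitted on nested covariate subsets, and simplex weights for the averaged class probabilities are chosen by minimizing a hold-out negative log-likelihood, which is a KL-type criterion. The paper's actual criterion and its penalization differ from this stand-in; the hold-out device, the nested subsets, and all names here are ours.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import LogisticRegression

def choose_weights(prob_list, y):
    """Choose simplex weights for averaged class probabilities by minimizing
    the negative log-likelihood (a KL-type criterion) on labels y."""
    def nll(v):
        w = np.abs(v) / np.abs(v).sum()            # crude simplex projection
        P = sum(wk * Pk for wk, Pk in zip(w, prob_list))
        return -np.log(P[np.arange(len(y)), y] + 1e-12).sum()
    v0 = np.full(len(prob_list), 1.0 / len(prob_list))
    v = minimize(nll, v0, method="Nelder-Mead").x
    return np.abs(v) / np.abs(v).sum()

# toy data: 3 classes generated from a multinomial logit in 4 covariates
rng = np.random.default_rng(2)
X = rng.standard_normal((600, 4))
logits = X @ rng.standard_normal((4, 3))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=p) for p in probs])

# candidate models on nested covariate subsets; weights tuned on a hold-out
subsets = [[0], [0, 1], [0, 1, 2], [0, 1, 2, 3]]
tr, val = slice(0, 400), slice(400, 600)
prob_val = [LogisticRegression(max_iter=1000).fit(X[tr][:, s], y[tr])
            .predict_proba(X[val][:, s]) for s in subsets]
print(choose_weights(prob_val, y[val]))            # averaging weights
```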
Rejoinder on ‘A review of distributed statistical inference’
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-02-09 DOI: 10.1080/24754269.2022.2035304
Yuan Gao, Weidong Liu, Hansheng Wang, Xiaozhou Wang, Yibo Yan, Riquan Zhang
{"title":"Rejoinder on ‘A review of distributed statistical inference’","authors":"Yuan Gao, Weidong Liu, Hansheng Wang, Xiaozhou Wang, Yibo Yan, Riquan Zhang","doi":"10.1080/24754269.2022.2035304","DOIUrl":"https://doi.org/10.1080/24754269.2022.2035304","url":null,"abstract":"Yuan Gaoa, Weidong Liub, Hansheng Wangc, Xiaozhou Wanga, Yibo Yana and Riquan Zhanga aSchool of Statistics and Key Laboratory of Advanced Theory and Application in Statistics and Data Science – MOE, East China Normal University, Shanghai, People’s Republic of China; bSchool of Mathematical Sciences – School of Life Sciences and Biotechnology – MOE Key Lab of Artifcial Intelligence, Shanghai Jiao Tong University, Shanghai, People’s Republic of China; cGuanghua School of Management, Peking University, Beijing, People’s Republic of China","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"111 - 113"},"PeriodicalIF":0.5,"publicationDate":"2022-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46555795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Model averaging for generalized linear models in fragmentary data prediction
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-02-04 DOI: 10.1080/24754269.2022.2105486
Chao-Qun Yuan, Yang Wu, Fang Fang
{"title":"Model averaging for generalized linear models in fragmentary data prediction","authors":"Chao-Qun Yuan, Yang Wu, Fang Fang","doi":"10.1080/24754269.2022.2105486","DOIUrl":"https://doi.org/10.1080/24754269.2022.2105486","url":null,"abstract":"ABSTRACT Fragmentary data is becoming more and more popular in many areas which brings big challenges to researchers and data analysts. Most existing methods dealing with fragmentary data consider a continuous response while in many applications the response variable is discrete. In this paper, we propose a model averaging method for generalized linear models in fragmentary data prediction. The candidate models are fitted based on different combinations of covariate availability and sample size. The optimal weight is selected by minimizing the Kullback–Leibler loss in the completed cases and its asymptotic optimality is established. Empirical evidences from a simulation study and a real data analysis about Alzheimer disease are presented.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"344 - 352"},"PeriodicalIF":0.5,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48024239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
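To make "candidate models based on covariate availability" concrete, here is a toy logistic-regression version: each candidate is fitted on the rows for which its covariate set is fully observed, and a new unit is scored by the candidates whose covariates it reports. The paper's KL-based optimal weighting step is deliberately replaced by a plain average here; the missingness pattern and all names are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy fragmentary design: NaN marks a covariate missing for a given row
rng = np.random.default_rng(3)
n = 600
X = rng.standard_normal((n, 3))
y = (X @ np.array([1.0, -1.0, 0.5]) + rng.standard_normal(n) > 0).astype(int)
X[rng.random(n) < 0.2, 1] = np.nan     # covariate 2 missing for ~20% of rows
X[rng.random(n) < 0.4, 2] = np.nan     # covariate 3 missing for ~40% of rows

# one candidate model per covariate-availability pattern, each fitted on the
# rows where all of its covariates are observed
patterns = [[0], [0, 1], [0, 1, 2]]
models = {}
for s in patterns:
    rows = ~np.isnan(X[:, s]).any(axis=1)
    models[tuple(s)] = LogisticRegression(max_iter=1000).fit(X[rows][:, s], y[rows])

# score a new unit that reports only covariates 1 and 2: use every candidate
# whose covariates are available and average them equally (the paper instead
# selects these weights by minimizing a Kullback-Leibler loss)
x_new = np.array([0.3, -1.2, np.nan])
usable = [s for s in patterns if not np.isnan(x_new[s]).any()]
preds = [models[tuple(s)].predict_proba(x_new[s].reshape(1, -1))[0, 1]
         for s in usable]
print(np.mean(preds))
```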
Discussion on ‘A review of distributed statistical inference’
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-02-04 DOI: 10.1080/24754269.2022.2030107
Yang Yu, Guang Cheng
{"title":"Discussion on ‘A review of distributed statistical inference’","authors":"Yang Yu, Guang Cheng","doi":"10.1080/24754269.2022.2030107","DOIUrl":"https://doi.org/10.1080/24754269.2022.2030107","url":null,"abstract":"We congratulate the authors on an impressive team effort to comprehensively review various statistical estimation and inference methods in distributed frameworks. This paper is an excellent resource for anyone wishing to understand why distributed inference is important in the era of big data, what the challenges of conducting distributed inference instead of centralized inference are, and how statisticians propose solutions to overcome these challenges. First, we notice that this paper focuses mainly on distributed estimation, and we would like to point out several other works on distributed inference. For smooth loss functions, Jordan et al. (2018) established asymptotic normality for their multi-round distributed estimator, which yields two communication-efficient approaches to constructing confidence regions using a sandwiched covariance matrix. For non-smooth loss functions, Chen et al. (2021) similarly proposed a sandwich-type confidence interval based on the asymptotic normality of their distributed estimator. More generic inference approaches, such as bootstrap, have also been studied in the massive data setting including the distributed framework. The authors reviewed the Bag of Little Bootstraps (BLB) method proposed by Kleiner et al. (2014), which is to repeatedly resample and refit the model at each local machine and finally aggregate the bootstrap statistics. Considering the huge computational cost of BLB, Sengupta et al. (2016) proposed the Subsampled Double Bootstrap (SDB) method, which has higher computational efficiency but requires a large number of local machines to maintain statistical accuracy. In addition to distributed samples, the dimensionality can also become large in the big data era, and in this case researchers may be more interested in simultaneous inference onmultiple parameters. In the centralized setting, bootstrap is one of the solutions to the simultaneous inference problems (Zhang & Cheng, 2017). In a distributed framework where the dimensionality grows, Yu et al. (2020) proposed distributed bootstrap methods for simultaneous inference, which not only are efficient in terms of both communication and","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"102 - 103"},"PeriodicalIF":0.5,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48788970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
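The BLB recipe described above ("resample and refit at each local machine, then aggregate") fits in a few lines for the sample mean. The sketch below is a minimal illustration, not Kleiner et al.'s implementation; the subset-size exponent 0.6 and the replication counts are arbitrary illustrative choices.

```python
import numpy as np

def blb_se(data, n_subsets=5, b=None, r=100, seed=0):
    """Bag of Little Bootstraps estimate of the standard error of the mean.

    Each small subset of size b stands in for the full sample: draw multinomial
    counts summing to n over its b points, compute the weighted mean r times,
    take the SE within each subset, and average the SEs across subsets.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    b = b or int(n ** 0.6)              # subset size n^gamma with gamma = 0.6
    ses = []
    for _ in range(n_subsets):
        sub = rng.choice(data, size=b, replace=False)
        reps = []
        for _ in range(r):
            counts = rng.multinomial(n, np.full(b, 1.0 / b))
            reps.append(np.sum(counts * sub) / n)   # mean of a size-n resample
        ses.append(np.std(reps, ddof=1))
    return float(np.mean(ses))

x = np.random.default_rng(4).standard_normal(10_000)
print(blb_se(x))    # should be near 1 / sqrt(10_000) = 0.01
```

The point of the construction is that no machine ever materializes a size-n resample: only the length-b subset and its multinomial weights are needed, which is what makes the method attractive in distributed settings.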
Discussion of: a review of distributed statistical inference
IF 0.5
Statistical Theory and Related Fields Pub Date : 2022-01-12 DOI: 10.1080/24754269.2021.2022998
Zheng-Chu Guo
{"title":"Discussion of: a review of distributed statistical inference","authors":"Zheng-Chu Guo","doi":"10.1080/24754269.2021.2022998","DOIUrl":"https://doi.org/10.1080/24754269.2021.2022998","url":null,"abstract":"Analysing and processing massive data is becoming ubiquitous in the era of big data. Distributed learning based on divide-and-conquer approach has attracted increasing interest in recent years, since it not only reduces computational complexity and storage requirements, but also protects the data privacy when data subsets are distributively stored on different local machines. This paper provides a comprehensive review for distributed learning with parametric models, nonparametric models and other popular models. As mentioned in this paper, nonparametric regression in reproducing kernel Hilbert spaces is popular in machine learning; however, theoretical analysis for distributed learning algorithms in reproducing kernel Hilbert spaces mainly focuses on the least-square loss functions, and results for some other loss functions are limited; it would be interesting to conduct error analysis for distributed regression with general loss functions and distributed classification in reproducing kernel Hilbert spaces. In distributed learning, a standard assumption is that the data are identically and independently drawn from some unknown probability distribution; however, this assumption may not hold in practice since data are usually collected asynchronously throughout time. It is of great interest to study distributed learning algorithms with non-i.i.d. data. Recently, Sun and Lin (2020) considered distributed kernel ridge regression for strong mixing sequences. The mixing conditions are very common assumptions in the stochastic processes and the mixing coefficients can be estimated in some cases such as Gaussian and Markov processes. In the community of machine learning, the strong mixing conditions are used to quantify the dependence of samples. It is assumed in Sun and Lin (2020) that Dk (1 ≤ k ≤ m) is a strong mixing sequence with α-mixing coefficient αj, and there exists a suitable arrangement of D1,D2, . . . ,Dm such that D = ⋃mk=1 Dk is also a strong mixing sequence with α-mixing coefficient αj; in addition, under some mild conditions on the regression function and the hypothesis spaces, it is shown in Sun and Lin (2020) that as long as the number of the local machines is not too large, an almost optimal convergence rate can be derived, which is comparable to the result under i.i.d. assumptions.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"104 - 104"},"PeriodicalIF":0.5,"publicationDate":"2022-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48277971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
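The divide-and-conquer kernel ridge regression this discussion refers to, in its basic i.i.d. form rather than the mixing-sequence version of Sun and Lin, amounts to fitting KRR on each shard and averaging the resulting predictors. Below is a toy sketch with a Gaussian kernel; all function names and hyperparameters are illustrative.

```python
import numpy as np

def krr_fit(X, y, lam=0.1, gamma=1.0):
    """Kernel ridge regression with a Gaussian kernel; returns a predictor."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)
    alpha = np.linalg.solve(K + lam * len(X) * np.eye(len(X)), y)
    def predict(Xnew):
        d = ((Xnew[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d) @ alpha
    return predict

def distributed_krr(X, y, m=4, **kw):
    """One-shot divide-and-conquer KRR: fit on m disjoint shards, then average."""
    preds = [krr_fit(Xk, yk, **kw)
             for Xk, yk in zip(np.array_split(X, m), np.array_split(y, m))]
    return lambda Xnew: np.mean([p(Xnew) for p in preds], axis=0)

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, size=(800, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(800)
f = distributed_krr(X, y, m=4, lam=1e-3, gamma=10.0)
Xt = np.linspace(-1, 1, 5).reshape(-1, 1)
print(np.c_[np.sin(3 * Xt[:, 0]), f(Xt)])    # truth vs. averaged estimate
```

Each shard solves a 200-by-200 linear system instead of the full 800-by-800 one, which is the cubic-cost saving that motivates the divide-and-conquer scheme.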
Discussion of: ‘A review of distributed statistical inference’
IF 0.5
Statistical Theory and Related Fields Pub Date : 2021-12-28 DOI: 10.1080/24754269.2021.2015868
Shaogao Lv, Xingcai Zhou
{"title":"Discussion of: ‘A review of distributed statistical inference’","authors":"Shaogao Lv, Xingcai Zhou","doi":"10.1080/24754269.2021.2015868","DOIUrl":"https://doi.org/10.1080/24754269.2021.2015868","url":null,"abstract":"First of all, we would like to congratulate Dr Gao et al. for their excellent paper, which provides a comprehensive overview of amounts of existing work on distributed estimation (learning). Different from related work Gu et al. (2019); Liu et al. (2021); Verbraeken et al. (2020) that focus on computing, storage and communication architecture, the current paper leverages how to guarantee statistical efficiency of a given distributed method from a statistical viewpoint. In the following, we divide our discussion into three parts:","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"105 - 107"},"PeriodicalIF":0.5,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49138265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Generalized fiducial methods for testing quantitative trait locus effects in genetic backcross studies
IF 0.5
Statistical Theory and Related Fields Pub Date : 2021-12-28 DOI: 10.1080/24754269.2021.1984636
Pengcheng Ren, Guanfu Liu, X. Pu, Yan Li
{"title":"Generalized fiducial methods for testing quantitative trait locus effects in genetic backcross studies","authors":"Pengcheng Ren, Guanfu Liu, X. Pu, Yan Li","doi":"10.1080/24754269.2021.1984636","DOIUrl":"https://doi.org/10.1080/24754269.2021.1984636","url":null,"abstract":"In this paper, we propose generalized fiducial methods and construct four generalized p-values to test the existence of quantitative trait locus effects under phenotype distributions from a location-scale family. Compared with the likelihood ratio test based on simulation studies, our methods perform better at controlling type I errors while retaining comparable power in cases with small or moderate sample sizes. The four generalized fiducial methods support varied scenarios: two of them are more aggressive and powerful, whereas the other two appear more conservative and robust. A real data example involving mouse blood pressure is used to illustrate our proposed methods.","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"148 - 160"},"PeriodicalIF":0.5,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49314125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Discussion of the paper ‘A review of distributed statistical inference’
IF 0.5
Statistical Theory and Related Fields Pub Date : 2021-12-28 DOI: 10.1080/24754269.2021.2017544
Heng Lian
{"title":"Discussion of the paper ‘A review of distributed statistical inference’","authors":"Heng Lian","doi":"10.1080/24754269.2021.2017544","DOIUrl":"https://doi.org/10.1080/24754269.2021.2017544","url":null,"abstract":"The authors should be congratulated on their timely contribution to this emerging field with a comprehensive review, which will certainly attract more researchers into this area. In the simplest one-shot approach, the entire dataset is distributed on multiple machines, and each machine computes a local estimate based on local data only, and a central machine performs an aggregation calculation as a final processing step. In more complicated settings, multiple communications are carried out, typically passing also first-order information (gradient) and/or second-order information (Hession matrix) between local machines and the central machine. This review clearly separates the existing works in this area into several sections, considering parameter regression, nonparametric regression, and other models including principal component analysis and variable screening. In this discussion, I will consider some possible future directions that can be entertained in this area, based on my own personal experience. The first problem is a combination of divide-and-conquer estimation with some efficient local algorithm not used in traditional statistical analysis. This is motivated by that, due to the stringent constraint on the number of machines that can be used either practically or in theory (for example, when using a one-shot approach, the number ofmachines that can be used isO( √ N)), the sample size on each worker machine can still be large. In other words, even after partitioning, the local sample sizemay still be too large to be processed by traditional algorithms. In such a case, a more efficient algorithm (one that possibly approximates the exact solution) should be used on each local machine. The important question here is whether the optimal statistical properties can be retained using such an algorithm. One such attempt with an affirmative answer is recently reported in Lian et al. (2021). In this work, we use random sketches (random projection) for kernel regression in anRKHS framework for nonparametric regression. Use of random sketches reduces the computational complexity on each worker machine, and at the same time still retains the optimal statistical convergence rate. We expect combinations along such a direction can be useful in various settings, and for different settings different efficient algorithms to compute some approximate solution are called for. The second problem is to extend the studies beyond the worker-server model. Most of the existing methods in the statistics literature are focused on the centralized system where there is a single special machine that communicates with all others and coordinates computation and communication. However, in many modern applications, such systems are rare and unreliable since the failure of the central machine would be disastrous. 
Consideration of statistical inference in a decentralized system, synchronous or asynchronous, where there is no such specialized central machine, would be an intere","PeriodicalId":22070,"journal":{"name":"Statistical Theory and Related Fields","volume":"6 1","pages":"100 - 101"},"PeriodicalIF":0.5,"publicationDate":"2021-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43053347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
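The random-sketch construction of Lian et al. (2021) is specific to that paper, but the general idea of cutting per-machine cost with a randomized kernel approximation can be illustrated with random Fourier features, a related but different randomization. A minimal sketch with illustrative names and hyperparameters:

```python
import numpy as np

def rff_ridge(X, y, n_features=200, gamma=1.0, lam=1e-3, seed=0):
    """Kernel ridge regression approximated with random Fourier features.

    z(x) = sqrt(2/D) * cos(W x + b) approximates the Gaussian kernel
    exp(-gamma * ||x - x'||^2); ridge regression on z(x) then costs
    O(n D^2) instead of the O(n^3) of exact KRR.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, n_features)
    def features(Z):
        return np.sqrt(2.0 / n_features) * np.cos(Z @ W + b)
    Phi = features(X)
    theta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y)
    return lambda Z: features(Z) @ theta

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, (2000, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(2000)
f = rff_ridge(X, y, gamma=10.0)
Xt = np.linspace(-1, 1, 5).reshape(-1, 1)
print(np.c_[np.sin(3 * Xt[:, 0]), f(Xt)])    # truth vs. approximate fit
```

Run on each worker, such a randomized fit keeps the local cost linear in the local sample size, which is exactly the regime the discussion has in mind when local samples remain large after partitioning.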