{"title":"Discrete variations of the fractional Brownian motion in the presence of outliers and an additive noise","authors":"S. Achard, Jean‐François Coeurjolly","doi":"10.1214/09-SS059","DOIUrl":"https://doi.org/10.1214/09-SS059","url":null,"abstract":"This paper gives an overview of the problem of estimating the Hurst parameter of a fractional Brownian motion when the data are observed with outliers and/or with an additive noise by using methods based on discrete variations. We show that the classical estimation procedure based on the log-linearity of the variogram of dilated series is made more robust to outliers and/or an additive noise by considering sample quantiles and trimmed means of the squared series or differences of empirical variances. These different procedures are compared and discussed through a large simulation study and are implemented in the R package dvfBm.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"135 1","pages":"117-147"},"PeriodicalIF":3.3,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78534891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules.","authors":"Michael P Fay, Michael A Proschan","doi":"10.1214/09-SS051","DOIUrl":"https://doi.org/10.1214/09-SS051","url":null,"abstract":"In a mathematical approach to hypothesis tests, we start with a clearly defined set of hypotheses and choose the test with the best properties for those hypotheses. In practice, we often start with less precise hypotheses. For example, often a researcher wants to know which of two groups generally has the larger responses, and either a t-test or a Wilcoxon-Mann-Whitney (WMW) test could be acceptable. Although both t-tests and WMW tests are usually associated with quite different hypotheses, the decision rule and p-value from either test could be associated with many different sets of assumptions, which we call perspectives. It is useful to have many of the different perspectives to which a decision rule may be applied collected in one place, since each perspective allows a different interpretation of the associated p-value. Here we collect many such perspectives for the two-sample t-test, the WMW test and other related tests. We discuss validity and consistency under each perspective and discuss recommendations between the tests in light of these many different perspectives. Finally, we briefly discuss a decision rule for testing genetic neutrality where knowledge of the many perspectives is vital to the proper interpretation of the decision rule.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"4 ","pages":"1-39"},"PeriodicalIF":3.3,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1214/09-SS051","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28940092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Navigating Random Forests and related advances in algorithmic modeling","authors":"David S. Siroky","doi":"10.1214/07-SS033","DOIUrl":"https://doi.org/10.1214/07-SS033","url":null,"abstract":"This article addresses current methodological research on nonparametric Random Forests. It provides a brief intellectual history of Random Forests that covers CART, boosting and bagging methods. It then introduces the primary methods by which researchers can visualize results, the relationships between covariates and responses, and the out-of-bag test set error. In addition, the article considers current research on universal consistency and importance tests in Random Forests. Finally, several uses for Random Forests are discussed, and available software is identified. AMS 2000 subject classifications: 62-02, 62-04, 62G08, 62G09, 62H30, 93E25, 62M99, 62N99.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"30 1","pages":"147-163"},"PeriodicalIF":3.3,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85196235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey of cross-validation procedures for model selection","authors":"Sylvain Arlot, Alain Celisse","doi":"10.1214/09-SS054","DOIUrl":"https://doi.org/10.1214/09-SS054","url":null,"abstract":"Used to estimate the risk of an estimator or to perform model selection, cross-validation is a widespread strategy because of its simplicity and its apparent universality. Many results exist on the model selection performances of cross-validation procedures. This survey intends to relate these results to the most recent advances of model selection theory, with a particular emphasis on distinguishing empirical statements from rigorous theoretical results. As a conclusion, guidelines are provided for choosing the best cross-validation procedure according to the particular features of the problem in hand.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"56 1","pages":"40-79"},"PeriodicalIF":3.3,"publicationDate":"2009-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81493123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causal inference in statistics: An overview","authors":"J. Pearl","doi":"10.1214/09-SS057","DOIUrl":"https://doi.org/10.1214/09-SS057","url":null,"abstract":"This review presents empirical researchers with recent advances in causal inference, and stresses the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: (1) queries about the effects of potential interventions (also called \"causal effects\" or \"policy evaluation\"), (2) queries about probabilities of counterfactuals (including assessment of \"regret,\" \"attribution\" or \"causes of effects\"), and (3) queries about direct and indirect effects (also known as \"mediation\"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"29 1","pages":"96-146"},"PeriodicalIF":3.3,"publicationDate":"2009-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87061617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical models: Conventional, penalized and hierarchical likelihood","authors":"D. Commenges","doi":"10.1214/08-SS039","DOIUrl":"https://doi.org/10.1214/08-SS039","url":null,"abstract":"We give an overview of statistical models and likelihood, together with two of its variants: penalized and hierarchical likelihood. The Kullback-Leibler divergence is referred to repeatedly in the literature, for defining the misspecification risk of a model and for grounding the likelihood and the likelihood cross-validation, which can be used for choosing weights in penalized likelihood. Families of penalized likelihood and particular sieves estimators are shown to be equivalent. The similarity of these likelihoods with a posteriori distributions in a Bayesian approach is considered.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"17 1","pages":"1-17"},"PeriodicalIF":3.3,"publicationDate":"2009-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82981269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recent developments in nonregular fractional factorial designs","authors":"Hongquan Xu, F. Phoa, W. Wong","doi":"10.1214/08-SS040","DOIUrl":"https://doi.org/10.1214/08-SS040","url":null,"abstract":"Nonregular fractional factorial designs such as Plackett-Burman designs and other orthogonal arrays are widely used in various screening experiments for their run size economy and flexibility. The traditional analysis focuses on main effects only. Hamada and Wu (1992) went beyond the traditional approach and proposed an analysis strategy to demonstrate that some interactions could be entertained and estimated beyond a few significant main effects. Their groundbreaking work stimulated much of the recent developments in design criterion creation, construction and analysis of nonregular designs. This paper reviews important developments in optimality criteria and comparison, including projection properties, generalized resolution, various generalized minimum aberration criteria, optimality results, construction methods and analysis strategies for nonregular designs.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"98 1","pages":"18-46"},"PeriodicalIF":3.3,"publicationDate":"2008-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86550080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse sampling: Spatial design for monitoring stream networks","authors":"Melissa J. Dobbie, B. Henderson, D. Stevens","doi":"10.1214/07-SS032","DOIUrl":"https://doi.org/10.1214/07-SS032","url":null,"abstract":"Spatial designs for monitoring stream networks, especially ephemeral systems, are typically non-standard, 'sparse' and can be very complex, reflecting the complexity of the ecosystem being monitored, the scale of the population, and the competing multiple monitoring objectives. The main purpose of this paper is to present a review of approaches to spatial design to enable informed decisions to be made about developing practical and optimal spatial designs for future monitoring of streams.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"36 1","pages":"113-153"},"PeriodicalIF":3.3,"publicationDate":"2008-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81148148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic study for complex diseases","authors":"Yulan Liang, A. Kelemen","doi":"10.1214/07-SS026","DOIUrl":"https://doi.org/10.1214/07-SS026","url":null,"abstract":"Recent advances of information technology in biomedical sciences and other applied areas have created numerous large diverse data sets with a high dimensional feature space, which provide us a tremendous amount of information and new opportunities for improving the quality of human life. Meanwhile, great challenges are also created driven by the continuous arrival of new data that requires researchers to convert these raw data into scientific knowledge in order to benefit from it. Association studies of complex diseases using SNP data have become more and more popular in biomedical research in recent years. In this paper, we present a review of recent statistical advances and challenges for analyzing correlated high dimensional SNP data in genomic association studies for complex diseases. The review includes both general feature reduction approaches for high dimensional correlated data and more specific approaches for SNPs data, which include unsupervised haplotype mapping, tag SNP selection, and supervised SNPs selection using statistical testing/scoring, statistical modeling and machine learning methods with an emphasis on how to identify interacting loci.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"15 1","pages":"43-60"},"PeriodicalIF":3.3,"publicationDate":"2008-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83241006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Least angle and ℓ1 penalized regression: A review","authors":"T. Hesterberg, Nam-Hee Choi, L. Meier, C. Fraley","doi":"10.1214/08-SS035","DOIUrl":"https://doi.org/10.1214/08-SS035","url":null,"abstract":"Least Angle Regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. It provides an explanation for the similar behavior of LASSO (l1-penalized regression) and forward stagewise regression, and provides a fast implementation of both. The idea has caught on rapidly, and sparked a great deal of research interest. In this paper, we give an overview of Least Angle Regression and the current state of related research.","PeriodicalId":46627,"journal":{"name":"Statistics Surveys","volume":"31 1","pages":"61-93"},"PeriodicalIF":3.3,"publicationDate":"2008-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78402311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}