{"title":"Tail aligned composite quantile estimator for bootstrapping of high quantiles","authors":"R. S. Jagtap, Mohan Kale, V. K. Gedam","doi":"10.1080/23737484.2021.1915900","DOIUrl":"https://doi.org/10.1080/23737484.2021.1915900","url":null,"abstract":"Abstract Reliable assessment of high quantiles, namely quantile with relatively low exceedance probability, based on available sample is of interest in hydrology, meteorology, finance and many other fields. Interval estimation of extreme quantities in real-world mechanisms is essential, but it is challenging due to complexities in underlying data-generating processes, small sample sizes, data are not normal, failure of the standard statistical assumptions etc. leading to huge stochastic uncertainties. A composite quantile function estimator aligned using tail of generalized extreme value distribution is employed to construct bootstrap confidence intervals for high-order quantiles. The proposed semi-parametric estimator is shown to be asymptotically unbiased and consistent. The utility of the proposed estimator in comparison with traditional nonparametric and parametric bootstrap in terms of coverage probability for small size and case study application to real-world precipitation datasets has been illustrated. Limitations posed in computations and scope for future work is highlighted.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"73 1","pages":"494 - 515"},"PeriodicalIF":0.0,"publicationDate":"2021-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74084020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"School motivation profiles of Dutch 9th graders","authors":"Denise M. Blom, M. Warrens, Meike Faber","doi":"10.1080/23737484.2021.1911719","DOIUrl":"https://doi.org/10.1080/23737484.2021.1911719","url":null,"abstract":"Abstract The aim of this study was to identify school motivation profiles of Dutch 9th grade students in a four-dimensional motivation space, including mastery, performance, social and extrinsic motivation. Multiple clustering methods (K-means, K-medoids, restricted latent profile analysis) and multiple indices for selecting the optimal number of clusters were applied. The statistical selection methods did not completely concur on the optimal number of clusters, but a clear common denominator was provided by the Calinski-Harabasz index and the minimum and mean Silhouette values. All three indices indicated two clusters as the optimal number, regardless of the clustering method used: one cluster of 9th graders with high average scores on all dimensions and one cluster with low mean scores on all dimensions. In addition, we explored the substantive interpretation of multiple cluster solutions. It was discovered that most students are in clusters that can be classified into one of three profile types that may differ in level: (1) approximately equal mean scores on all dimensions, (2) relative high mean scores on mastery and social motivation, and (3) a relatively low mean score on performance motivation. The latter profile type is believed to be a new discovery.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"58 2 1","pages":"359 - 381"},"PeriodicalIF":0.0,"publicationDate":"2021-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76868487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disease mapping of biomarkers for breast cancer in Tehran using spatial joint model: A Bayesian perspective","authors":"T. Baghfalaki, M. Kamarehee, M. Ganjali, A. Shabbak, M. Khayamzadeh, M. Akbari","doi":"10.1080/23737484.2021.1882354","DOIUrl":"https://doi.org/10.1080/23737484.2021.1882354","url":null,"abstract":"Abstract Breast cancer is one of the most important medical concerns that women face today. There are some biomarkers for detection of this cancer. Modeling these biomarkers, finding important factors that are associated with them and estimating the spatial pattern in disease risk across the areal units by disease mapping are the main foci of many studies. In this article, three binary biomarkers (the presence of estrogen receptors, the presence of progesterone receptors, and the absence of human epidermal growth factor receptor-2) are considered simultaneously for disease mapping of breast cancer. The association of these three biomarkers and spatial effects on them are jointly considered by using a convolution model. The proposed approach is applied to disease mapping of biomarkers of breast cancer in Tehran.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"3 1","pages":"289 - 314"},"PeriodicalIF":0.0,"publicationDate":"2021-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86470919","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting time lag between a pair of time series using visibility graph algorithm","authors":"Majnu John, J. Ferbinteanu","doi":"10.1080/23737484.2021.1882356","DOIUrl":"https://doi.org/10.1080/23737484.2021.1882356","url":null,"abstract":"Abstract Estimating the time lag between a pair of time series is of significance in many practical applications. In this article, we introduce a method to quantify such lags by adapting the visibility graph algorithm, which converts time series into a mathematical graph. Currently widely used method to detect such lags is based on cross-correlations, which has certain limitations. We present simulated examples where the new method identifies the lag correctly and unambiguously while as the cross-correlation method does not. The article includes results from an extensive simulation study conducted to better understand the scenarios where the new method performed better or worse than the existing approach. We also present a likelihood based parametric modeling framework and consider frameworks for quantifying uncertainty and hypothesis testing for the new approach. We apply the current and new methods to two case studies, one from neuroscience and the other from environmental epidemiology, to illustrate the methods further.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"40 1","pages":"315 - 343"},"PeriodicalIF":0.0,"publicationDate":"2021-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85748506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A flexible probability model for proportion data: Unit-half-normal distribution","authors":"H. Bakouch, A. S. Nik, A. Asgharzadeh, Hugo S. Salinas","doi":"10.1080/23737484.2021.1882355","DOIUrl":"https://doi.org/10.1080/23737484.2021.1882355","url":null,"abstract":"Abstract A new class of unimodal asymmetric distributions is introduced to the unit interval and these distributions are useful for modeling data of percentages, proportions and fractions. Therefore, we propose the unit-half-normal distribution as a contribution to the earlier path and investigate some of its mathematical properties. The maximum likelihood estimator is obtained with a comprehensive inference. This new class of distributions belongs to the exponential family, hence the uniformly minimum variance unbiased estimator of the distribution parameter is obtained. The distribution represents a power alternative to the unit interval distributions, namely the beta, Kumaraswamy and other recent ones. We investigate a small simulation study to analyze the behavior of the obtained estimators for different sample sizes. Moreover, we illustrate the goodness of fit of the proposed model for image data. Lastly, we describe a procedure of incorporating covariates into regression analysis of the proposed distribution.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"51 1","pages":"271 - 288"},"PeriodicalIF":0.0,"publicationDate":"2021-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81790928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A double mixture autoregressive model of commodity prices","authors":"Gilbert Mbara","doi":"10.1080/23737484.2021.1882353","DOIUrl":"https://doi.org/10.1080/23737484.2021.1882353","url":null,"abstract":"Abstract Many commodity prices exhibit boom-bust type behavior: sustained periods of price increases, followed by sudden sharp collapses. Since around the year 2000, booms have become longer while busts have tended to be short but steep, suggesting a structural change in growth and persistence. We model these features of the data using a novel double mixture autoregression with two independent hidden Markov chains. One chain tracks shifts in mean growth rates that account for rising and falling prices, while a second chain tracks changes in volatility and lag-structure. While the two chains are independent, the persistence of price growth depends on the volatility state, which allows the lag-structure to vary across variance regimes. Estimation requires a two-stage Fisherian approach. Initially, location-related parameters are estimated while suppressing the underlying autoregressive structure. These parameters are then held fixed while the optimal lag-structure across variance regimes is determined. We apply the model to three industrial commodities price time series: Crude Oil, Aluminum, and Rubber. We find that in each case, the model captures boom and bust cycles, with data from more recent periods exhibiting higher volatility, longer price rallies, and steeper collapses.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"97 1","pages":"249 - 270"},"PeriodicalIF":0.0,"publicationDate":"2021-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79188194","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tweedie, Bar-Lev, and Enis class of leptokurtic distributions as a candidate for modeling real data","authors":"S. Bar-Lev, A. Batsidis, P. Economou","doi":"10.1080/23737484.2021.1880988","DOIUrl":"https://doi.org/10.1080/23737484.2021.1880988","url":null,"abstract":"Abstract Modeling real life data is often a demanding task and plethora of distributional models have been proposed in the statistical literature in an attempt to describe different data sets in a better way than those used to describe them. In this article, we establish a broad pool of families of parametric distributions previously used in the literature. This pool, which includes 23 parametric models of distributions, is implemented to test the fit of its models to 17 data sets having different characteristics. In doing so, we will mainly pay attention to a three-parameter model that includes the class of natural exponential families generated by positive stable distributions. Indeed, this is the class we wish to pinpoint in this article and highlight its importance for modeling real data sets. The class is shown to be rather competitive alternative to some well-known parametric models in the pool especially when applied to leptokurtic data sets is available. Appropriate R codes which include all parametric models in the pool are provided in a supplementary file for further applications and implementations for other data sets. Supplemental data for this article is available online at https://doi.org/10.1080/23737484.2021.1880988","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"65 1","pages":"229 - 248"},"PeriodicalIF":0.0,"publicationDate":"2021-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85802369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian inference for the Birnbaum–Saunders autoregressive conditional duration model with application to high-frequency financial data","authors":"Nascimento Fernando, Leao Jeremias, H. Saulo","doi":"10.1080/23737484.2021.1874571","DOIUrl":"https://doi.org/10.1080/23737484.2021.1874571","url":null,"abstract":"Abstract Autoregressive conditional duration (ACD) models have been preponderant when the subject is the modeling of high-frequency financial data. A prominent model that has demonstrated great adjustment capacity is the ACD model based on the Birnbaum–Saunders distribution (BS-ACD). Recent works have shown that this model outperforms the existing models in the literature. Nevertheless, these works explore only classical estimation approaches. In this article, we perform a Bayesian approach of the BS-ACD model. The scale parameter was modeled considering a dynamic linear model. Estimation of posterior distribution of parameters was approximated through Markov chain Monte Carlo methods. A simulation study is conducted to evaluate the performance of Bayesian estimators and two applications to real high frequency data illustrate the proposed methodology.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"90 3 1","pages":"215 - 228"},"PeriodicalIF":0.0,"publicationDate":"2021-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82563076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geographically weighted Poisson regression models with different kernels: Application to road traffic accident data","authors":"Ghanim Al-Hasani, M. Asaduzzaman, A. Soliman","doi":"10.1080/23737484.2020.1869628","DOIUrl":"https://doi.org/10.1080/23737484.2020.1869628","url":null,"abstract":"Abstract Geographically weighted Poisson regression (GWPR) models are the class of spatial count regression models that capture the localization effect on various influencing factors on the dependent variable. The main challenge with the GWPR models is to set appropriate kernel function to give weights for each neighboring point during the model calibration. In this article, we consider GWPR models for many different kernel functions, including box-car, bi-square, tri-cube, exponential, and Gaussian function. Likelihood function, parameter estimation, and model selection criteria have been shown in details. We applied the model formulation to the road traffic accident (RTA) data in Oman as the country is one of the largest RTA-prone countries in the Gulf region. Akaike information criterion, corrected Akaike information criterion, and geographically weighted deviance have been used to assess the model fitting. The model with the exponential kernel weighted function provides the best fit for the data and captures the spatial heterogeneity and factors better with the exponential kernel weighting function.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"70 1","pages":"166 - 181"},"PeriodicalIF":0.0,"publicationDate":"2021-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86905651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and forecasting life expectancy in India: A systematic approach","authors":"Abhishek Singh, S. Hasija","doi":"10.1080/23737484.2020.1869630","DOIUrl":"https://doi.org/10.1080/23737484.2020.1869630","url":null,"abstract":"Abstract In this study, the autoregressive integrated moving average models were used to fit yearly life expectancy data collected from the official website of the World Bank. The results show a steady growth of life expectancy at birth for males, females, and the total population during 2017–2030. Moreover, this study also attempted to examine the risk factors associated with life expectancy at birth in India. The results of the multiple linear regression showed that the employment rate, school enrollment rate, and healthcare expenditure were significant risk factors associated with life expectancy at birth for males, females, and the total population.","PeriodicalId":36561,"journal":{"name":"Communications in Statistics Case Studies Data Analysis and Applications","volume":"82 1","pages":"200 - 214"},"PeriodicalIF":0.0,"publicationDate":"2021-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83407844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}