{"title":"Multiple Taxicab Correspondence Analysis of a Survey Related to Health Services","authors":"V. Choulakian, J. Allard, B. Simonetti","doi":"10.6339/JDS.2013.11(2).1113","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(2).1113","url":null,"abstract":"We present an analysis of health survey data by multiple correspondence analysis (MCA) and multiple taxicab correspondence analysis (MTCA), MTCA being a robust L1 variant of MCA. The survey has one passive item, gender, and 22 active substantive items representing health services offered by municipal authorities; each active item has four answer categories: this service is used, never tried, tried with no access, non-response. We show that the first principal MTCA factor is perfectly characterized by the sum score of the category this service is used over all service items. Further, we prove that such a sum score characterization always exists for any survey data.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42197777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Three-Parameter Weibull Distribution Shape Parameter Estimation","authors":"M. Teimouri, Arjun K. Gupta","doi":"10.6339/JDS.2013.11(3).1110","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(3).1110","url":null,"abstract":"The Weibull distribution has received much interest in reliability theory. The well-known maximum likelihood estimators (MLE) of this family are not available in closed form. In this work, we propose a consistent, closed-form estimator for the shape parameter of the three-parameter Weibull distribution. Apart from a high degree of performance, the derived estimator is location- and scale-invariant.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45452764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparative Study of Shared Frailty Models for Kidney Infection Data with Generalized Exponential Baseline Distribution","authors":"David D. Hanagal, Alok D. Dabade","doi":"10.6339/JDS.2013.11(1).1126","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(1).1126","url":null,"abstract":"Shared frailty models are often used to model heterogeneity in survival analysis. The most common shared frailty model is one in which the hazard function is the product of a random factor (frailty) and a baseline hazard function common to all individuals. There are certain assumptions about the baseline distribution and the distribution of frailty. Most often, a gamma distribution is assumed for the frailty. To compare the results with the gamma frailty model, we introduce three shared frailty models with generalized exponential as the baseline distribution. The other three shared frailty models are the inverse Gaussian shared frailty model, the compound Poisson shared frailty model and the compound negative binomial shared frailty model. We fit these models to a real-life bivariate survival data set of McGilchrist and Aisbett (1991) related to kidney infection using the Markov Chain Monte Carlo (MCMC) technique. Model comparison is made using Bayesian model selection criteria and a better model is suggested for the data.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41846796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Bayesian Analysis of the Spherical Distribution in Presence of Covariates","authors":"J. Achcar, Gian Franco Napa, Roberto Molina de Souza","doi":"10.6339/jds.201310_11(4).0008","DOIUrl":"https://doi.org/10.6339/jds.201310_11(4).0008","url":null,"abstract":"In this paper we introduce a Bayesian analysis of a spherical distribution applied to rock joint orientation data in the presence or absence of a vector of covariates, where the response variable is given by the angle from the mean and the covariates are the components of the normal upwards vector. Standard MCMC (Markov Chain Monte Carlo) simulation methods have been used to obtain the posterior summaries of interest with the WinBugs software. Illustrations of the proposed methodology are given using a simulated data set and a real rock spherical data set from a hydroelectrical site.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42477547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonparametric Assessment of Aftershock Clusters of the Maule Earthquake Mw = 8.8","authors":"Javier E. Contreras-Reyes","doi":"10.6339/jds.201310_11(4).0001","DOIUrl":"https://doi.org/10.6339/jds.201310_11(4).0001","url":null,"abstract":"We study the spatial distribution of clusters associated with the aftershocks of the megathrust Maule earthquake MW 8.8 of 27 February 2010. We used a recent clustering method which hinges on a nonparametric estimation of the underlying probability density function to detect subsets of points forming clusters associated with high density areas. In addition, we estimate the probability density function using a nonparametric kernel method for each of these clusters. This allows us to identify a set of regions where there is an association between frequency of events and coseismic slip. Our results suggest that high coseismic slip is spatially related to high aftershock frequency.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44937847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Choosing a Mixture Model for Clustering","authors":"J. Ngatchou-Wandji, J. Bulla, E. Lorraine","doi":"10.6339/JDS.2013.11(1).1135","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(1).1135","url":null,"abstract":"Two methods for clustering data and choosing a mixture model are proposed. First, we derive a new classification algorithm based on the classification likelihood. Then, the likelihood conditional on these clusters is written as the product of the likelihoods of each cluster, and AIC- or BIC-type approximations are applied. The resulting criteria turn out to be the sum of the AIC or BIC relative to each cluster plus an entropy term. The performance of our methods is evaluated by Monte Carlo methods and on a real data set, showing in particular that the iterative estimation algorithm converges quickly in general, and thus the computational load is rather low.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43648301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Log-Kumaraswamy Generalized Gamma Regression Model with Application to Chemical Dependency Data","authors":"Marcelino A. R. Pascoa, Claudia M. M. de Paiva, G. Cordeiro, E. Ortega","doi":"10.6339/JDS.2013.11(4).1131","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(4).1131","url":null,"abstract":"The five-parameter Kumaraswamy generalized gamma model (Pascoa et al., 2011) includes some important distributions as special cases and it is very useful for modeling lifetime data. We propose an extended version of this distribution by assuming that a shape parameter can take negative values. The new distribution can accommodate increasing, decreasing, bathtub and unimodal shaped hazard functions. A second advantage is that it also includes as special models reciprocal distributions such as the reciprocal gamma and reciprocal Weibull distributions. A third advantage is that it can represent the error distribution for the log-Kumaraswamy generalized gamma regression model. We provide a mathematical treatment of the new distribution including explicit expressions for moments, generating function, mean deviations and order statistics. We obtain the moments of the log-transformed distribution. The new regression model can be used more effectively in the analysis of survival data since it includes as sub-models several widely-known regression models. The method of maximum likelihood and a Bayesian procedure are used for estimating the model parameters for censored data. Overall, the new regression model is very useful for the analysis of real data.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42116600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Methods in Event Studies: Empirical Evidence and Theoretical Implications","authors":"N. Sorokina, David E. Booth, John H. Thornton","doi":"10.6339/JDS.2013.11(3).1166","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(3).1166","url":null,"abstract":"We apply methodology robust to outliers to an existing event study of the effect of U.S. financial reform on the stock markets of the 10 largest world economies, and obtain results that differ from the original OLS results in important ways. This finding underlines the importance of handling outliers in event studies. We further review closely the population of outliers identified using Cook's distance and find that many of the outliers lie within the event windows. We acknowledge that those data points lead to inaccurate regression fitting; however, we cannot remove them since they carry valuable information regarding the event effect. We study further the residuals of the outliers within event windows and find that the residuals change with application of M-estimators and MM-estimators; in most cases they became larger, meaning the main prediction equation is pulled back towards the main data population and further from the outliers and indicating more proper fitting. We support our empirical results by pseudo-simulation experiments and find significant improvement in determination of both types of the event effect, abnormal returns and change in systematic risk. We conclude that robust methods are important for obtaining accurate measurement of event effects in event studies.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47109351","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adapted Autoregressive Model and Volatility Model with Application","authors":"Naisheng Wang, Yan Lu","doi":"10.6339/JDS.2013.11(4).1165","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(4).1165","url":null,"abstract":"Price limits are applied to control risks in various futures markets. In this research, we propose an adapted autoregressive model for the observed futures return by introducing dummy variables that represent limit moves. We also propose a stochastic volatility model with dummy variables. These two models are used to investigate the existence of a delayed price discovery effect and a volatility spillover effect from price limits. We give an empirical study of the impact of price limits on copper and natural rubber futures on the Shanghai Futures Exchange (SHFE) using the MCMC method. It is found that price limits are efficient in controlling the copper futures price, but the rubber futures price is distorted significantly. This implies that the effects of price limits are significant for products with large fluctuations and frequent limit hits.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44998235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Procedure of Clustering Based on Multivariate Outlier Detection","authors":"Grégory David, S. Jayakumar, B. Thomas","doi":"10.6339/JDS.2013.11(1).1091","DOIUrl":"https://doi.org/10.6339/JDS.2013.11(1).1091","url":null,"abstract":"Clustering is an extremely important task in a wide variety of application domains, especially in management and social science research. In this paper, an iterative clustering procedure based on multivariate outlier detection is proposed, using the well-known Mahalanobis distance. First, the Mahalanobis distance is calculated for the entire sample; then a UCL is fixed using the T²-statistic. Observations above the UCL are treated as outliers and grouped into an outlier cluster, and the same procedure is repeated for the remaining inliers until the variance-covariance matrix of the variables in the last cluster becomes singular. At each iteration, a multivariate test of means is used to check the discrimination between the outlier clusters and the inliers. Moreover, multivariate control charts are used to visualize the iterations and the outlier clustering process graphically. Finally, the multivariate test of means helps to firmly establish cluster discrimination and validity. This paper employs the procedure to cluster 275 customers of a famous two-wheeler in India based on 19 different attributes of the two-wheeler and its company. The result of the proposed technique confirms that there exist 5 and 7 outlier clusters of customers in the entire sample at the 5% and 1% significance levels, respectively.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49148312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}