Communications in Statistics Case Studies Data Analysis and Applications: Latest Articles, Page 6

A robust approach for outlier imputation: Singular spectrum decomposition

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-12-28 DOI: 10.1080/23737484.2021.2017810

Maryam Movahedifar, Hossein Hassani, M. Yarmohammadi, M. Kalantari, Rangan Gupta

Abstract: Singular spectrum analysis (SSA) is a nonparametric method for separating time series data into a sum of a small number of interpretable components (signal + noise). One step of the SSA method, referred to as Embedding, is extremely sensitive to contamination by outliers, which are often found in time series analysis. To reduce the effect of outliers, SSA based on the Singular Spectrum Decomposition (SSD) method is proposed. In this article, SSA based on SSD and basic SSA are compared in their ability to reconstruct time series in the presence of outliers. It is noteworthy that the matrix norm used in basic SSA is the Frobenius norm, or L2-norm. There is a newer version of SSA that is based on the L1-norm and is called L1-SSA, which has been confirmed to be robust against outliers. In this regard, this research also introduces a new version of SSD based on the L1-norm, called L1-SSD. A wide empirical study on both simulated and real data verifies the efficiency of basic SSA based on SSD and the L1-norm in reconstructing time series polluted by outliers.

Pages: 234-250
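The sensitivity of the Embedding step can be illustrated with a minimal sketch. It assumes the standard SSA construction (a trajectory matrix built from lagged windows, and diagonal averaging to map a matrix back to a series); the function names `embed` and `diagonal_average` are hypothetical, and the sketch does not implement SSD or L1-SSA. It only shows that a single outlier in the series is copied into several entries of the trajectory matrix, which is one way to see why the step is fragile.

```python
def embed(series, window):
    """Build the trajectory (Hankel) matrix: row i holds series[i : i + k]."""
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]


def diagonal_average(matrix):
    """Average each anti-diagonal to turn a matrix back into a series."""
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]


# A short series with one outlier (100.0) at position 3.
series = [1.0, 2.0, 3.0, 100.0, 5.0, 6.0]
trajectory = embed(series, window=3)

# The single outlier now occupies a whole anti-diagonal of the matrix.
occurrences = sum(row.count(100.0) for row in trajectory)
print(trajectory)
print("outlier copies in trajectory matrix:", occurrences)
```

With a window of 3, the one contaminated observation appears three times in the matrix, so any decomposition that minimizes a squared (Frobenius) error is pulled toward it repeatedly.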

Citations: 1

Modeling serially correlated heavy-tailed data with some missing response values using stochastic EM algorithm

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-12-22 DOI: 10.1080/23737484.2021.2017808

U. Nduka, I. Iwueze, C. Nwaigwe

Abstract: The linear regression model is a popular tool used in almost all areas of research. The model relies mainly on the assumption of uncorrelated errors from a Gaussian distribution. However, many datasets in practice violate this basic assumption, making inference in such cases invalid. Therefore, the linear regression model with structured errors driven by heavy-tailed innovations is preferred in practice. Another issue that occurs frequently with real-life data is missing values, for reasons such as system breakdown and labor unrest. Despite the challenge these two issues pose to practitioners, there is a scarcity of literature in which they have been studied jointly. Hence, this article considers these two issues jointly, for the first time, and develops an efficient parameter estimation procedure for the Student's-t autoregressive regression model for time series with missing values of the response variable. The procedure is based on a stochastic approximation expectation–maximization algorithm coupled with a Markov chain Monte Carlo technique. The procedure gives efficient closed-form expressions for the parameters of the model, which are very easy to compute. Simulations and real-life data analysis show that the method is efficient for use with incomplete time series data.

Pages: 81-104

Citations: 0

Could significant regression be treated as insignificant: An anomaly in statistics?

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-11-08 DOI: 10.1080/23737484.2021.1986171

Yushan Cheng, Yongchang Hui, Shuangzhe Liu, Wing-Keung Wong

Abstract: The literature has found that regression of independent (nearly) nonstationary time series could be spurious. We incorporate this idea to examine whether significant regression could be treated as insignificant in some situations. To do so, we conjecture that significant regression could appear significant in some cases but could become insignificant in others. To check whether our conjecture holds, we set up a model in which both the dependent and independent variables, Yt and Xt, are each the sum of two variables whose components are independent and (nearly) nonstationary AR(1) time series. Following this model setup, we design some situations and an algorithm for our simulation. We find, on the one hand, that our conjecture could hold that significant regression could appear significant in some cases when α1 and α2 are of different signs. On the other hand, our findings show that our conjecture does not hold, and significant regression cannot be treated as insignificant, when α1 and α2 are of the same sign. As far as we know, our article is the first to discover that significant regression can be treated as insignificant in some situations and remains significant in others, which is its main contribution. We believe that our discovery could be an anomaly in statistics. Our findings are useful for academics and practitioners in their data analysis: if they find that a regression is insignificant, they should investigate further whether their analysis falls into the problem studied in our article.

Pages: 133-151
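The spurious-regression effect that the abstract starts from can be sketched with a small simulation. This is an illustrative sketch only, not the authors' model: it regresses one AR(1) series on another independent AR(1) series (rather than on sums of two components), uses a hypothetical coefficient of 0.99 for the nearly nonstationary case, and applies the usual 1.96 normal critical value to the slope t-statistic. All function names are assumptions.

```python
import math
import random


def ar1(n, alpha, rng):
    """Simulate x_t = alpha * x_{t-1} + e_t with standard normal e_t, x_0 = 0."""
    x, out = 0.0, []
    for _ in range(n):
        x = alpha * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def slope_t_stat(x, y):
    """t-statistic of the OLS slope in y = a + b * x + error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    rss = sum((b - my - beta * (a - mx)) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta / se


def rejection_rate(alpha, n=200, reps=200, seed=0):
    """Share of replications in which two independent series look 'related'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = ar1(n, alpha, rng)
        y = ar1(n, alpha, rng)
        if abs(slope_t_stat(x, y)) > 1.96:
            hits += 1
    return hits / reps


print("nearly nonstationary (alpha=0.99):", rejection_rate(0.99))
print("white noise (alpha=0.0):", rejection_rate(0.0))
```

Under these assumptions the nominal 5% test rejects far more often for the persistent series than for white noise, even though the two series are independent by construction.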

Citations: 1

Early detection of individual growing pigs’ sanitary challenges using functional data analysis of real-time feed intake patterns

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-28 DOI: 10.1080/23737484.2021.1991855

Bernard Colin, Simon Germain, C. Pomar

Citations: 1

Fuzzy theories and statistics—fuzzy data analysis

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1991854

N. Watanabe

Citations: 0

Minimax strategies for Bernoulli two-armed bandit on a moderate control horizon

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1986170

A. Kolnogorov, Denis Grunev

Abstract: We consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimizing the processing of moderate amounts of data when two processing methods are available with different, a priori unknown efficiencies. One has to determine the more effective method and ensure its predominant application. In contrast to big data processing, for which several approaches have been developed, including batch processing, the optimization of moderate data processing is currently not well understood. We consider the minimax approach and search for the minimax strategy and minimax risk as the Bayesian ones corresponding to the worst-case prior distribution, for which the Bayesian risk attains its maximal value. A close approximation to the worst-case prior distribution and the corresponding Bayesian risk are obtained by numerical methods. Calculations show that the resulting strategy yields a maximal regret close to the computed Bayesian risk and hence is approximately minimax. The results can be applied to big data processing if the data arrive in batches of moderate size with approximately uniform properties.

Pages: 536-544

Citations: 0

Special issue – Communications in statistics – Case studies and data analysis – 6th stochastic modeling techniques and data analysis international conference

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.2012013

C. Skiadas, Yiannis Dimotikalis, M. Caruana

Abstract: This Special Issue on Statistical Methods and Data Analysis contains eleven invited articles presented at the 6th Stochastic Modeling Techniques and Data Analysis International Conference (SMTDA2020). The invited articles, theoretical, experimental and observational, present new results that have applications in real-life problems. An important objective was to select articles that present new methods for analyzing real-life data and lead to the advancement of the related fields. The following articles are included in this Special Issue: Mark A. Caruana and Liam Grech present their work on “Automobile Insurance Fraud Detection.” They explore the risk that insurance companies incur financial losses from fraudulent claims. Alexander Kolnogorov and Denis Grunev, in their paper “Minimax Strategies for Bernoulli Two-Armed Bandit on a Moderate Control Horizon,” consider a Bernoulli two-armed bandit problem on a moderate control horizon as applied to optimizing the processing of moderate amounts of data when two processing methods are available with different, a priori unknown efficiencies. Panagiota Giannouli, Alex Karagrigoriou, Christos Kountzakis and Kimon Ntotsis, in their paper “Multilevel Dimension Reduction for Credit Scoring Modelling and Prediction: Empirical Evidence for Greece,” propose an innovative approach to flexible and accurate credit scoring modeling that uses not only financial but also credit behavioral characteristics. Norio Watanabe discusses “Fuzzy Theories and Statistics – Fuzzy Data Analysis –” and introduces some statistical tools for analyzing fuzzy data. Fuzzy data analysis is important in fields related to human sensitivity.

Pages: 517-519

Citations: 0

Sibling rivalry within inverse Weibull family to predict the COVID-19 spread in South Africa

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1979433

Farzane Hashemi, A. Bekker, Kirsten Smith, M. Arashi

Citations: 0

What is in the “I” of the beholder: modeling the processing of consonant addition in a child’s pronoun

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1995912

E. Babatsouli

Abstract: Phonological processing in child developmental speech has been a major topic of research, offering insights into intervention methods for child disordered speech. The present paper investigates the processing of consonant addition to the English personal pronoun I, a monosyllable comprising the diphthong /aɪ/. This phonological phenomenon has not been studied in the literature in monolingual or bilingual speech. Here, a child’s speech is elicited longitudinally from age 2;9 to 3;9, and additions to I are examined in terms of the phonological processes of anticipation and perseveration. Results reveal (i) decreasing additions with age, (ii) larger processing distance in perseveration than in anticipation between the triggering consonant and the added I, (iii) addition dominance of the sonorants n, l and of the voiceless alveolar plosive t, matching their target frequencies in the child’s speech, (iv) no correlation between the probability of consonant addition occurrence and syllabic processing distance, and (v) a strong and statistically significant correlation between the mean and standard deviation of processing distance across the child’s ages, meaning that one or the other can be used in practice instead of both. These findings offer insights into speech error processing with applications to intervention techniques in children with speech difficulties.

Pages: 670-694

Citations: 0

Automobile insurance fraud detection

Communications in Statistics Case Studies Data Analysis and Applications Pub Date : 2021-10-02 DOI: 10.1080/23737484.2021.1986169

M. Caruana, Liam Grech

Abstract: The risk of incurring financial losses from fraudulent claims is an issue concerning all insurance companies. The detection of such claims is not an easy task. Moreover, a number of old-school methods have proven to be inefficient. Statistical techniques for predictive modelling have been applied to detect fraudulent claims. In this article, we compare two techniques: artificial neural networks and the Naïve Bayes classifier. The theory underpinning both techniques is discussed, and an application of these techniques to a dataset of labelled automobile insurance claims is then presented. Fraudulent claims constitute only a small percentage of the total number of claims. As a result, datasets tend to be unbalanced, which in turn causes a number of problems. To overcome such issues, techniques that deal with unbalanced datasets are also discussed. The suitability of neural networks and the Naïve Bayes classifier for the dataset is discussed, and the results are compared and contrasted using a number of performance measures, including ROC curves, accuracy, AUC, precision, and sensitivity. Both classification techniques gave comparable results, with the neural network giving slightly better results than the Naïve Bayes classifier on the training dataset. However, when applied to the test data, the Naïve Bayes classifier slightly outperformed the artificial neural network.

Pages: 520-535
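The performance measures named in the abstract can be sketched on a toy unbalanced sample. The labels, scores, 0.5 threshold, and function names below are made up for illustration and are not taken from the article's dataset; AUC is computed in its pairwise form (the share of fraud/non-fraud pairs ranked correctly, with ties counted as one half).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and sensitivity (recall) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
    }


def auc(y_true, scores):
    """Area under the ROC curve as the share of correctly ordered pairs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy sample: 2 fraudulent claims (label 1) out of 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.1, 0.3, 0.2, 0.4, 0.6, 0.3, 0.7, 0.45]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

model = classification_metrics(y_true, y_pred)
always_legit = classification_metrics(y_true, [0] * len(y_true))

print(model, "AUC:", auc(y_true, scores))
print(always_legit)
```

In this toy sample, a rule that labels every claim as legitimate reaches the same accuracy as the scored model while detecting no fraud at all, which is why accuracy alone is a weak measure on unbalanced data and sensitivity, precision, and AUC are reported alongside it.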

Citations: 1