{"title":"Modi-Weibull Distribution: Inferential and Simulation Study","authors":"Harshita Kumawat, Kanak Modi, Pankaj Nagar","doi":"10.1007/s40745-023-00491-3","DOIUrl":"10.1007/s40745-023-00491-3","url":null,"abstract":"<div><p>This paper presents a study on a new family of distributions using the Weibull distribution and termed as Modi-Weibull distribution. This Modi-Weibull distribution is based on four parameters. To understand the behaviour of the distribution, some statistical characteristics have been derived, such as shapes of density and distribution function, hazard function, survival function, median, moments, order statistics etc. These parameters are estimated using classical maximum likelihood estimation method. Asymptotic confidence intervals for parameters of Modi-Weibull distribution are also obtained. A simulation study is carried out to investigate the bias, MSE of proposed maximum likelihood estimators along with coverage probability and average width of confidence intervals of parameters. Two applications to real data sets are discussed to illustrate the fitting of the proposed distribution and compared with some well-known distributions.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 6","pages":"1975 - 1999"},"PeriodicalIF":0.0,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s40745-023-00491-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48772839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shrinkage Estimation for Location and Scale Parameters of Logistic Distribution Under Record Values","authors":"Shubham Gupta, Gajendra K. Vishwakarma, A. M. Elsawah","doi":"10.1007/s40745-023-00492-2","DOIUrl":"10.1007/s40745-023-00492-2","url":null,"abstract":"<div><p>Logistic distribution (LogDis) is frequently used in many different applications, such as logistic regression, logit models, classification, neural networks, physical sciences, sports modeling, finance and health and disease studies. For instance, the distribution function of the LogDis has the same functional form as the derivative of the Fermi function that can be used to set the relative weight of various electron energies in their contributions to electron transport. The LogDis has wider tails than a normal distribution (NorDis), so it is more consistent with the underlying data and provides better insight into the likelihood of extreme events. For this reason the United States Chess Federation has switched its formula for calculating chess ratings from the NorDis to the LogDis. The outcomes of many real-life experiments are sequences of record-breaking data sets, where only observations that exceed (or only those that fall below) the current extreme value are recorded. The practice demonstrated that the widely used estimators of the scale and location parameters of logistic record values, such as the best linear unbiased estimators (BLUEs), have some defects. This paper investigates the shrinkage estimators of the location and scale parameters for logistic record values using prior information about their BLUEs. Theoretical and computational justifications for the accuracy and precision of the proposed shrinkage estimators are investigated via their bias and mean square error (MSE), which provide sufficient conditions for improving the proposed shrinkage estimators to get unbiased estimators with minimum MSE. 
The performance of the proposed shrinkage estimators is compared with the performances of the BLUEs. The results demonstrate that the resulting shrinkage estimators are shown to be remarkably efficient.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 4","pages":"1209 - 1224"},"PeriodicalIF":0.0,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46550442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
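The shrinkage idea in the abstract above can be sketched as a convex combination of an estimator and a prior point guess. A minimal illustration with the sample mean standing in for a BLUE of location; the `true_loc`, `prior_guess` and weight values are hypothetical, and the paper's weights come from the BLUE theory for record values:

```python
import random
import statistics

random.seed(7)

def shrinkage(est, prior_guess, k):
    """Linear shrinkage of an estimator toward a prior point guess,
    with shrinkage weight 0 <= k <= 1. A generic sketch: the paper's
    estimators for logistic record values share this convex-combination
    form but derive the weight from the BLUE theory."""
    return k * prior_guess + (1 - k) * est

# Monte Carlo MSE: sample mean (a stand-in for a BLUE of location) versus
# its shrinkage version when the prior guess is close to the true value.
true_loc, prior_guess, n, reps = 10.0, 10.2, 20, 5000
plain, shrunk = [], []
for _ in range(reps):
    x = [random.gauss(true_loc, 3.0) for _ in range(n)]
    m = statistics.fmean(x)
    plain.append((m - true_loc) ** 2)
    shrunk.append((shrinkage(m, prior_guess, 0.5) - true_loc) ** 2)
mse_plain = statistics.fmean(plain)    # roughly 9/20 = 0.45
mse_shrunk = statistics.fmean(shrunk)  # roughly 0.25*0.45 + 0.25*0.04
```

When the prior guess is near the truth, the shrinkage estimator trades a small bias for a large variance reduction, which is exactly the bias/MSE comparison the paper formalises.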
{"title":"Improving Bayesian Classifier Using Vine Copula and Fuzzy Clustering Technique","authors":"Ha Che-Ngoc, Thao Nguyen-Trang, Hieu Huynh-Van, Tai Vo-Van","doi":"10.1007/s40745-023-00490-4","DOIUrl":"10.1007/s40745-023-00490-4","url":null,"abstract":"<div><p>Classification is a fundamental problem in statistics and data science, and it has garnered significant interest from researchers. This research proposes a new classification algorithm that builds upon two key improvements of the Bayesian method. First, we introduce a method to determine the prior probabilities using fuzzy clustering techniques. The prior probability is determined based on the fuzzy level of the classified element within the groups. Second, we develop the probability density function using Vine Copula. By combining these improvements, we obtain an automatic classification algorithm with several advantages. The proposed algorithm is presented with specific steps and illustrated using numerical examples. Furthermore, it is applied to classify image data, demonstrating its significant potential in various real-world applications. The numerical examples and applications highlight that the proposed algorithm outperforms existing methods, including traditional statistics and machine learning approaches.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 2","pages":"709 - 732"},"PeriodicalIF":0.0,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49453964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Lindley Extension: Estimation, Risk Assessment and Analysis Under Bimodal Right Skewed Precipitation Data","authors":"Majid Hashempour, Morad Alizadeh, Haitham M. Yousof","doi":"10.1007/s40745-023-00485-1","DOIUrl":"10.1007/s40745-023-00485-1","url":null,"abstract":"<div><p>The objectives of this study are to propose a new two-parameter lifespan distribution and explain some of the most essential properties of that distribution. Through the course of this investigation, we will be able to achieve both of these objectives. For the aim of assessment, research is carried out that makes use of simulation, and for the same reason, a variety of various approaches are studied and taken into account for the purpose of evaluation. Making use of two separate data collections enables an analysis of the adaptability of the suggested distribution to a number of different contexts. The risk exposure in the context of asymmetric bimodal right-skewed precipitation data was further defined by using five essential risk indicators, such as value-at-risk, tail-value-at-risk, tail variance, tail mean–variance, and mean excess loss function. This was done in order to account for the right-skewed distribution of the data. In order to examine the data, several risk indicators were utilized. 
These risk indicators were used in order to achieve a more in-depth description of the risk exposure that was being faced.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 6","pages":"1919 - 1958"},"PeriodicalIF":0.0,"publicationDate":"2023-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42711550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
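The five risk indicators listed in the abstract above have simple empirical counterparts, useful as a sanity check against model-based values. A sketch on the order statistics of a loss sample; the 0.5 weight in the tail mean-variance is a conventional choice, not taken from the paper:

```python
import statistics

def risk_measures(losses, q=0.95):
    """Empirical versions of the five risk indicators: value-at-risk
    (VaR), tail-value-at-risk (TVaR), tail variance (TV), tail
    mean-variance (TMV, conventional weight 0.5) and the mean excess
    loss over VaR. The paper evaluates these under its fitted model;
    this sketch just uses the sorted sample."""
    xs = sorted(losses)
    idx = int(q * len(xs))
    var = xs[idx]                    # empirical VaR at level q
    tail = xs[idx:]                  # losses beyond the VaR threshold
    tvar = statistics.fmean(tail)    # expected shortfall of the tail
    tv = statistics.pvariance(tail)
    tmv = tvar + 0.5 * tv
    mel = tvar - var
    return var, tvar, tv, tmv, mel
```

TVaR always dominates VaR at the same level, and the mean excess loss is their gap, so the five numbers are internally consistent checks on each other.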
{"title":"Bayesian Analysis of Change Point Problems Using Conditionally Specified Priors","authors":"G. Shahtahmassebi, José María Sarabia","doi":"10.1007/s40745-023-00484-2","DOIUrl":"10.1007/s40745-023-00484-2","url":null,"abstract":"<div><p>In data analysis, change point problems correspond to abrupt changes in stochastic mechanisms generating data. The detection of change points is a relevant problem in the analysis and prediction of time series. In this paper, we consider a class of conjugate prior distributions obtained from conditional specification methodology for solving this problem. We illustrate the application of such distributions in Bayesian change point detection analysis with Poisson processes. We obtain the posterior distribution of model parameters using general bivariate distribution with gamma conditionals. Simulation from the posterior are readily implemented using a Gibbs sampling algorithm. The Gibbs sampling is implemented even when using conditional densities that are incompatible or only compatible with an improper joint density. The application of such methods will be demonstrated using examples of simulated and real data.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 6","pages":"1899 - 1918"},"PeriodicalIF":0.0,"publicationDate":"2023-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s40745-023-00484-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47402928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Learning of Personalized Longitudinal Biomarker Trajectory","authors":"Shouhao Zhou, Xuelin Huang, Chan Shen, Hagop M. Kantarjian","doi":"10.1007/s40745-023-00486-0","DOIUrl":"10.1007/s40745-023-00486-0","url":null,"abstract":"<div><p>This work concerns the effective personalized prediction of longitudinal biomarker trajectory, motivated by a study of cancer targeted therapy for patients with chronic myeloid leukemia (CML). Continuous monitoring with a confirmed biomarker of residual disease is a key component of CML management for early prediction of disease relapse. However, the longitudinal biomarker measurements have highly heterogeneous trajectories between subjects (patients) with various shapes and patterns. It is believed that the trajectory is clinically related to the development of treatment resistance, but there was limited knowledge about the underlying mechanism. To address the challenge, we propose a novel Bayesian approach to modeling the distribution of subject-specific longitudinal trajectories. It exploits flexible Bayesian learning to accommodate complex changing patterns over time and non-linear covariate effects, and allows for real-time prediction of both in-sample and out-of-sample subjects. The generated information can help make clinical decisions, and consequently enhance the personalized treatment management of precision medicine.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 3","pages":"1031 - 1050"},"PeriodicalIF":0.0,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46463104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications of Reliability Test Plan for Logistic Rayleigh Distributed Quality Characteristic","authors":"Mahendra Saha, Harsh Tripathi, Anju Devi, Pratibha Pareek","doi":"10.1007/s40745-023-00473-5","DOIUrl":"10.1007/s40745-023-00473-5","url":null,"abstract":"<div><p>In this article, a reliability test plan under time truncated life test is considered for the logistic Rayleigh distribution (<span>(mathcal {LRD})</span>). A brief discussion over statistical properties and significance of the <span>(mathcal {LRD})</span> is placed in this present study. Larger the value of median—better is the quality of the lot is considered as quality characteristic for the proposed reliability test plan. Minimum sample sizes are placed in tabular form for different set up of specified consumer’s risk. Also operating characteristics (<span>(mathcal{O}mathcal{C})</span>) values are shown in tabular forms for the chosen set up and discussed the pattern of <span>(mathcal{O}mathcal{C})</span> values. A comparative analysis of the present study with some other reliability test plans is discussed based on the sample sizes. As an illustration, the performance of the proposed plan for the <span>(mathcal {LRD})</span> is shown through real-life examples.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 5","pages":"1687 - 1703"},"PeriodicalIF":0.0,"publicationDate":"2023-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43684187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Patient Questionnaires Based Parkinson’s Disease Classification Using Artificial Neural Network","authors":"Tarakashar Das, Sabrina Mobassirin, Syed Md. Minhaz Hossain, Aka Das, Anik Sen, Khaleque Md. Aashiq Kamal, Kaushik Deb","doi":"10.1007/s40745-023-00482-4","DOIUrl":"10.1007/s40745-023-00482-4","url":null,"abstract":"<div><p>Parkinson’s disease is one of the most prevalent and harmful neurodegenerative conditions (PD). Even today, PD diagnosis and monitoring remain pricy and inconvenient processes. With the unprecedented progress of artificial intelligence algorithms, there is an opportunity to develop a cost-effective system for diagnosing PD at an earlier stage. No permanent remedy has been established yet; however, an earlier diagnosis helps lead a better life. Probably, the three most responsible categories of symptoms for Parkinson’s Disease are tremors, rigidity, and body bradykinesia. Therefore, we investigate the 53 unique features of the Parkinson’s Progression Markers Initiative dataset to determine the significant symptoms, including three major categories. As feature selection is integral to developing a generalized model, we investigate including and excluding feature selection. Four feature selection methods are incorporated—low variance filter, Wilcoxon rank-sum test, principle component analysis, and Chi-square test. Furthermore, we utilize machine learning, ensemble learning, and artificial neural networks (ANN) for classification. Experimental evidence shows that not all symptoms are equally important, but no symptom can be completely eliminated. However, our proposed ANN model attains the best mean accuracy of 99.51%, 98.17% mean specificity, 0.9830 mean Kappa Score, 0.99 mean AUC, and 99.70% mean F1-score with all the features. The efficiency of our suggested technique on diverse data modalities is demonstrated by comparison with recent publications. 
Finally, we established a trade-off between classification time and accuracy.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 5","pages":"1821 - 1864"},"PeriodicalIF":0.0,"publicationDate":"2023-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s40745-023-00482-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46611876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
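Of the four feature selection methods listed in the abstract above, the low variance filter is the simplest to sketch. A pure-Python illustration; the 0.01 threshold is an arbitrary example, not the paper's setting:

```python
import statistics

def low_variance_filter(rows, threshold=0.01):
    """Drop feature columns whose population variance is at or below
    threshold -- the first of the four selection methods listed in the
    abstract. Takes a dataset as a list of feature rows and returns the
    kept column indices plus the reduced rows."""
    cols = list(zip(*rows))
    keep = [j for j, col in enumerate(cols)
            if statistics.pvariance(col) > threshold]
    return keep, [[row[j] for j in keep] for row in rows]
```

A constant (zero-variance) column carries no class information, so dropping it can only simplify the downstream ANN without hurting accuracy.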
{"title":"A New Class of Distribution Over Bounded Support and Its Associated Regression Model","authors":"Ishfaq S. Ahmad, Rameesa Jan, Poonam Nirwan, Peer Bilal Ahmad","doi":"10.1007/s40745-023-00483-3","DOIUrl":"10.1007/s40745-023-00483-3","url":null,"abstract":"<div><p>In this paper, a new two-parameter distribution over the bounded support (0,1) is introduced and studied in detail. Some of the interesting statistical properties like concavity, hazard rate function, mean residual life, moments and quantile function are discussed. The method of moments and maximum likelihood estimation methods are used to estimate unknown parameters of the proposed model. Besides, finite sample performance of estimation methods are evaluated through the Monte-Carlo simulation study. Application of the proposed distribution to the real data sets shows a better fit than many known two-parameter distributions on the unit interval. Moreover, a new regression model as an alternative to various unit interval regression models is introduced.\u0000</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 2","pages":"549 - 569"},"PeriodicalIF":0.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44336464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inception-UDet: An Improved U-Net Architecture for Brain Tumor Segmentation","authors":"Ilyasse Aboussaleh, Jamal Riffi, Adnane Mohamed Mahraz, Hamid Tairi","doi":"10.1007/s40745-023-00480-6","DOIUrl":"10.1007/s40745-023-00480-6","url":null,"abstract":"<div><p>Brain tumor segmentation is an important field and a sensitive task in tumor diagnosis. The treatment research in this area has helped specialists in detecting the tumor’s location in order to deal with it in its early stages. Numerous methods based on deep learning, have been proposed, including the symmetric U-Net architectures, which revealed great results in the medical imaging field, precisely brain tumor segmentation. In this paper, we proposed an improved U-Net architecture called Inception U-Det inspired by U-Det. This work aims at employing the inception block instead of the convolution one used in the bi-directional feature pyramid neural (Bi-FPN) network during the skip connection U-Det phase. Furthermore, a comparison study has been performed between our proposed approach and the three known architectures in medical imaging segmentation; U-Net, DC-Unet, and U-Det. Several segmentation metrics have been computed and then taken into account in these methods, by means of the publicly available BraTS datasets. Thus, our obtained results have showed promising results in terms of accuracy, dice similarity coefficient (DSC), and intersection–union ratio (IOU). 
Moreover, the proposed method has achieved a DSC of 87.9%, 85.5%, and 83.9% on BraTS2020, BraTS2018, and BraTS2017, respectively, calculated from the best fold in fourfold cross-validation employed in the present approach.</p></div>","PeriodicalId":36280,"journal":{"name":"Annals of Data Science","volume":"11 3","pages":"831 - 853"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45364813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
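The two overlap metrics reported above can be computed from binary segmentation masks as follows; a minimal sketch for masks flattened to 0/1 lists:

```python
def dice_iou(pred, truth):
    """Dice similarity coefficient (DSC) and intersection-over-union
    (IoU) for two binary masks given as flat 0/1 lists -- the overlap
    metrics the segmentation results above are reported in."""
    inter = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    union = total - inter
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return dice, iou
```

The two metrics are linked by iou = dice / (2 - dice), so a DSC of 87.9% corresponds to an IoU of roughly 78.4%.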