Journal of Statistical Theory and Applications: Latest Articles

Modelling the Proportions with Excessive Endpoints Based on a Generalized Lindley Binomial Model

IF 1

Journal of Statistical Theory and Applications Pub Date: 2026-01-01 Epub Date: 2026-02-18 DOI: 10.1007/s44199-025-00161-8

Dianliang Deng, Xiaoqing Zhang

Abstract: This paper introduces the generalized Lindley binomial (GLB) distribution, a novel model for analyzing proportional data with excessive endpoint observations. The GLB distribution is derived by compounding the binomial distribution with a generalized three-parameter Lindley distribution, itself defined as a mixture of two gamma distributions with distinct rate parameters. We establish the probabilistic properties of the GLB distribution, including its probability mass function, factorial moments, mean, variance, moment generating function, and dispersion index, demonstrating its flexibility in modeling both under- and over-dispersed data as well as unimodal and bimodal shapes. Likelihood-based inference is developed for the GLB model, with and without covariates, using Fisher scoring and expectation-maximization (EM) algorithms. To improve estimation stability, a penalized EM algorithm incorporating Bayes-inspired penalties is proposed. Model diagnostics are addressed through Pearson and deviance residuals, as well as randomized quantile residual plots. Simulation studies are conducted to evaluate the performance of the estimation procedures under different scenarios. Finally, the practical utility of the GLB regression model is illustrated with the whitefly dataset, where it is shown to provide superior fit compared to existing endpoint-inflated binomial models.

PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12917061/pdf/

Citations: 0

Correction: Topp-Leone Cauchy Family of Distributions with Applications in Industrial Engineering

IF 1

Journal of Statistical Theory and Applications Pub Date: 2024-01-03 DOI: 10.1007/s44199-023-00069-1

Mintodê Nicodème Atchadé, Mahoulé Jude Bogninou, Aliou Moussa Djibril, Melchior N’bouké

Citations: 0

Topp-Leone Cauchy Family of Distributions with Applications in Industrial Engineering

Journal of Statistical Theory and Applications Pub Date: 2023-11-13 DOI: 10.1007/s44199-023-00066-4

Mintodê Nicodème Atchadé, Mahoulé Jude Bogninou, Aliou Moussa Djibril, Melchior N’bouké

Citations: 0

Zero to k Inflated Poisson Regression Models with Applications

Journal of Statistical Theory and Applications Pub Date: 2023-11-13 DOI: 10.1007/s44199-023-00067-3

Hadi Saboori, Mahdi Doostparast

Abstract: In the count data set, the frequency of some points may occur more than expected under the standard data analysis models. Indeed, in many situations, the frequencies of zero and of some other points tend to be higher than those of the Poisson. Adapting existing models for analyzing inflated observations has been studied in the literature. A method for modeling the inflated data is the inflated distribution. In this paper, we extend this inflated distribution. Indeed, if inflations occur in three or more of the support points, then the previous models are not suitable. We propose a model based on zero, one, $\ldots,$ and $k$ inflated points with probabilities $w_{0}, w_{1}, \ldots,$ and $w_{k},$ respectively. By choosing the appropriate values for the weights $w_{0}, \ldots, w_{k},$ various inflated distributions, such as the zero-inflated, zero–one-inflated, and zero–$k$-inflated distributions, are derived as special cases of the proposed model in this paper. Various illustrative examples and real data sets are analyzed using the obtained results.
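The abstract above does not state the probability mass function, so the following is only a minimal sketch under an assumed, commonly used mixture form: extra mass $w_j$ is placed on each point $j = 0, \ldots, k$, and the remaining share $1 - \sum_j w_j$ follows an ordinary Poisson distribution. The function names `poisson_pmf` and `zkip_pmf` and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def poisson_pmf(y: int, lam: float) -> float:
    """Poisson probability mass at y for rate lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zkip_pmf(y: int, lam: float, weights: list[float]) -> float:
    """Mass at y under the assumed mixture form.

    weights[j] is the extra probability placed on point j, for j = 0..k.
    """
    if y < 0:
        return 0.0
    baseline = 1.0 - sum(weights)  # share left for the ordinary Poisson part
    if baseline < 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative and sum to at most 1")
    extra = weights[y] if y < len(weights) else 0.0
    return extra + baseline * poisson_pmf(y, lam)


# Hypothetical values: inflation at 0, 1 and 2 (k = 2), Poisson rate 2.0.
weights = [0.20, 0.05, 0.10]
probs = [zkip_pmf(y, 2.0, weights) for y in range(60)]
total = sum(probs)  # should be very close to 1
```

Under this assumed form, passing a single weight, `zkip_pmf(y, lam, [w0])`, gives the familiar zero-inflated Poisson, and setting every weight to zero recovers the plain Poisson, which mirrors the special cases the abstract mentions.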

Citations: 0

A Class of Estimators for Estimation of Population Mean Under Random Non-response in Two Phase Successive Sampling

Journal of Statistical Theory and Applications Pub Date: 2023-10-30 DOI: 10.1007/s44199-023-00065-5

Zeeshan Basit, Saadia Masood, Ishaq Bhatti

Citations: 0

Predictive Estimation of Finite Population Mean in Case of Missing Data Under Two-phase Sampling

Journal of Statistical Theory and Applications Pub Date: 2023-10-19 DOI: 10.1007/s44199-023-00064-6

Lovleen Kumar Grover, Anchal Sharma

Abstract: The present paper deals with the problem of estimation of finite population mean of study variable using two auxiliary variables in two-phase sampling scheme using predictive approach in case of missing values of the study variable and unknown population mean of first auxiliary variable. Four classes of such estimators have been proposed using this predictive approach. The expressions of bias and mean square errors are derived up to first order of approximation. The optimal values of the constants involved in the proposed classes of estimators have been obtained and thus minimum mean square errors of the proposed classes are obtained in this study. The empirical and graphical comparisons with regression type estimators (under single phase and double phase sampling scheme) and also among themselves have been made for evaluating the performance of the proposed classes for different choices of non-responding units. Five real data sets and three simulated data sets following normal distribution have been used to evaluate the performance of the proposed classes. Numerical findings confirm the theoretical results obtained regarding superiority of proposed classes of estimators over the conventional regression type estimators in terms of percent relative efficiencies.

Citations: 0

An AI-driven Predictive Model for Pancreatic Cancer Patients Using Extreme Gradient Boosting

Journal of Statistical Theory and Applications Pub Date: 2023-09-11 DOI: 10.1007/s44199-023-00063-7

Aditya Chakraborty, Chris P. Tsokos

Abstract: Pancreatic cancer is one of the deadliest carcinogenic diseases affecting people all over the world. The majority of patients are usually detected at Stage III or Stage IV, and the chances of survival are very low once detected at the late stages. This study focuses on building an efficient data-driven analytical predictive model based on the associated risk factors and identifying the most contributing factors influencing the survival times of patients diagnosed with pancreatic cancer using the XGBoost (eXtreme Gradient Boosting) algorithm. The grid-search mechanism was implemented to compute the optimum values of the hyper-parameters of the analytical model by minimizing the root mean square error (RMSE). The optimum hyperparameters of the final analytical model were selected by comparing the values with 243 competing models. To check the validity of the model, we compared the model’s performance with ten deep neural network models, grown sequentially with different activation functions and optimizers. We also constructed an ensemble model using Gradient Boosting Machine (GBM). The proposed XGBoost model outperformed all competing models we considered with regard to root mean square error (RMSE). After developing the model, the individual risk factors were ranked according to their individual contribution to the response predictions, which is extremely important for pancreatic research organizations to spend their resources on the risk factors causing/influencing the particular type of cancer. The three most influencing risk factors affecting the survival of pancreatic cancer patients were found to be the age of the patient, current BMI, and cigarette smoking years with contributing percentages of 35.5%, 24.3%, and 14.93%, respectively. The predictive model is approximately 96.42% accurate in predicting the survival times of the patients diagnosed with pancreatic cancer and performs excellently on test data. The analytical methodology of developing the model can be utilized for prediction purposes. It can be utilized to predict the time to death related to a specific type of cancer, given a set of numeric, and non-numeric features.
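The grid search described in the abstract can be pictured with a small, generic sketch. Everything concrete here is an assumption for illustration: the abstract does not list the tuned hyperparameters or their candidate values, so the grid below simply uses five commonly tuned XGBoost parameter names with three hypothetical values each, which happens to give 3^5 = 243 candidates. The scoring function is a stand-in that does not train any model; in real use it would fit a model with the given settings and return its validation RMSE.

```python
import itertools
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def grid_search(grid, score):
    """Try every combination in grid; return (best_params, best_score, n_candidates)."""
    names = list(grid)
    best_params, best_score, n_candidates = None, math.inf, 0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        n_candidates += 1
        if current < best_score:  # keep the candidate with the lowest RMSE
            best_params, best_score = params, current
    return best_params, best_score, n_candidates


# Hypothetical grid: the names are real XGBoost parameters, the values are made up.
GRID = {
    "max_depth": [3, 6, 9],
    "eta": [0.01, 0.1, 0.3],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "min_child_weight": [1, 3, 5],
}

Y_VALID = [12.0, 30.0, 7.5, 18.0]  # made-up validation targets


def stand_in_score(params):
    """Placeholder for 'train with params, predict, return validation RMSE'.

    No model is trained: predictions are the targets shifted by an offset that
    grows as the settings move away from an arbitrary reference point.
    """
    offset = (
        abs(params["max_depth"] - 6)
        + 10 * abs(params["eta"] - 0.1)
        + abs(params["subsample"] - 0.8)
        + abs(params["colsample_bytree"] - 0.8)
        + abs(params["min_child_weight"] - 3)
    )
    return rmse(Y_VALID, [y + offset for y in Y_VALID])


best_params, best_score, n_candidates = grid_search(GRID, stand_in_score)
```

Exhaustive enumeration keeps the comparison simple and reproducible, at the price of a candidate count that grows multiplicatively with each added hyperparameter.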

Citations: 0

Smoothed Dirichlet Distribution

Journal of Statistical Theory and Applications Pub Date: 2023-09-11 DOI: 10.1007/s44199-023-00062-8

Lahiru Wickramasinghe, Alexandre Leblanc, Saman Muthukumarana

Citations: 1

Correction: Estimation of Reliability in Multicomponent Set-up when Stress and Strength are Non-identical

Journal of Statistical Theory and Applications Pub Date: 2023-09-04 DOI: 10.1007/s44199-023-00061-9

Anupam Pathak, Anoop Chaturvedi, Taruna Kumari

Citations: 0

Estimation of Reliability in Multicomponent Set-up when Stress and Strength are Non-identical

IF 1

Journal of Statistical Theory and Applications Pub Date: 2023-07-24 DOI: 10.1007/s44199-023-00060-w