{"title":"An empirical Bayes approach for constructing confidence intervals for clonality and entropy.","authors":"Zhongren Chen, Lu Tian, Richard A Olshen","doi":"10.1080/02664763.2025.2496724","DOIUrl":"10.1080/02664763.2025.2496724","url":null,"abstract":"<p><p>This paper is motivated by the need to quantify human immune responses to environmental challenges. Specifically, the genome of the selected cell population from a blood sample is amplified by the PCR process, producing a large number of reads. Each read corresponds to a particular rearrangement of so-called V(D)J sequences. The observed data consist of a set of integers, representing numbers of reads corresponding to different V(D)J sequences. The underlying relative frequencies of distinct V(D)J sequences can be summarized by a probability vector, with the cardinality being the number of distinct V(D)J rearrangements. The statistical question is to make inferences on a summary parameter of this probability vector based on a multinomial-type observation of a large dimension. Popular summaries of the diversity include clonality and entropy. A point estimator of the clonality based on multiple replicates from the same blood sample has been proposed previously. Therefore, the remaining challenge is to construct confidence intervals of the parameters to reflect their uncertainty. In this paper, we propose to couple the Empirical Bayes method with a resampling-based calibration procedure to construct a robust confidence interval for different population diversity parameters. The method is illustrated via extensive numerical studies and real data examples.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12435542/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145075083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantile regression model for interval-censored data with competing risks.","authors":"Amirah Afiqah Binti Che Ramli, Yang-Jin Kim","doi":"10.1080/02664763.2025.2474627","DOIUrl":"https://doi.org/10.1080/02664763.2025.2474627","url":null,"abstract":"<p><p>Our interest is to provide the methodology for estimating a quantile regression model for interval-censored competing risk data. Lee and Kim [<i>Analysis of interval censored competing risk data via nonparametric multiple imputation</i>. Stat. Biopharm. Res. 13 (2020), pp. 367-374.] applied a censoring complete data concept suggested by Ruan and Gray [<i>Analyses of cumulative incidence function via non-parametric multiple imputation</i>. Stat. Med. 27 (2008), pp. 5709-5724.] to recover missing information related to competing events. In this paper, we also applied it to a quantile regression model. The simulated censoring times of the competing events are generated with a multiple imputation technique and the survival function of right censoring times. The performance of the suggested methods is evaluated by comparison with the results of a simple imputation method under several distributions and sample sizes. The AIDS dataset is analyzed to estimate the effect of several covariates on the quantiles of cause-specific CIF as a real data analysis.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2438-2447"},"PeriodicalIF":1.1,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490390/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic analytics for the mixed Poisson INGARCH model with applications.","authors":"Wenjie Dang, Fukang Zhu, Nuo Xu, Shuangzhe Liu","doi":"10.1080/02664763.2025.2476658","DOIUrl":"https://doi.org/10.1080/02664763.2025.2476658","url":null,"abstract":"<p><p>In statistical diagnosis and sensitivity analysis, the local influence method plays a crucial role and is sometimes more advantageous than other methods. The mixed Poisson integer-valued generalized autoregressive conditional heteroscedastic (INGARCH) model is built on a flexible family of mixed Poisson distributions. It not only encompasses the negative binomial INGARCH model but also allows for the introduction of the Poisson-inverse Gaussian INGARCH model and the Poisson generalized hyperbolic secant INGARCH model. This paper applies the local influence analysis method to count time series data within the framework of the mixed Poisson INGARCH model. For parameter estimation, the Expectation-Maximization algorithm is utilized. In the context of local influence analysis, two global influence methods (generalized Cook distance and Q-distance) and four perturbations (case weights perturbation, data perturbation, additive perturbation, and scale perturbation) are considered to identify influential points. Finally, the feasibility and effectiveness of the proposed methods are demonstrated through simulations and analysis of a real data set.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2495-2523"},"PeriodicalIF":1.1,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490395/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the within-node estimation of survival trees while retaining interpretability.","authors":"Haolin Li, Yiyang Fan, Jianwen Cai","doi":"10.1080/02664763.2025.2473535","DOIUrl":"https://doi.org/10.1080/02664763.2025.2473535","url":null,"abstract":"<p><p>In statistical learning for survival data, survival trees are favored for their capacity to detect complex relationships beyond parametric and semiparametric models. Despite this, their prediction accuracy is often suboptimal. In this paper, we propose a new method based on super learning to improve the within-node estimation and overall survival prediction accuracy, while preserving the interpretability of the survival tree. Simulation studies reveal the proposed method's superior finite sample performance compared to conventional approaches for within-node estimation in survival trees. Furthermore, we apply this method to analyze the North Central Cancer Treatment Group Lung Cancer Data, cardiovascular medical records from the Faisalabad Institute of Cardiology, and the integrated genomic data of ovarian carcinoma with The Cancer Genome Atlas project.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2544-2558"},"PeriodicalIF":1.1,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490394/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating an executive summary of a time series: the tendency.","authors":"Caio Alves, Juan M Restrepo, Jorge M Ramirez","doi":"10.1080/02664763.2025.2475351","DOIUrl":"10.1080/02664763.2025.2475351","url":null,"abstract":"<p><p>In this paper, we revisit the problem of decomposing a signal into a tendency and a residual. The tendency describes an executive summary of a signal that encapsulates its notable characteristics while disregarding seemingly random, less interesting aspects. Building upon the Intrinsic Time Decomposition (ITD) and information-theoretical analysis, we introduce two alternative procedures for selecting the tendency from the ITD baselines. The first is based on the maximum extrema prominence, namely the maximum difference between extrema within each baseline. Specifically, this method selects the tendency as the baseline from which an ITD step would produce the largest decline of the maximum prominence. The second method uses the rotations from the ITD and selects the tendency as the last baseline for which the associated rotation is statistically stationary. We delve into a comparative analysis of the information content and interpretability of the tendencies obtained by our proposed methods and those obtained through conventional low-pass filtering schemes, particularly the Hodrick-Prescott (HP) filter. Our findings underscore a fundamental distinction in the nature and interpretability of these tendencies, highlighting their context-dependent utility with emphasis on multi-scale signals. Through a series of real-world applications, we demonstrate the computational robustness and practical utility of our proposed tendencies, emphasizing their adaptability and relevance in diverse time series contexts.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2478-2494"},"PeriodicalIF":1.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490379/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Bayesian models for small area estimation with GB2 distribution.","authors":"Binod Manandhar, Balgobin Nandram","doi":"10.1080/02664763.2025.2475349","DOIUrl":"https://doi.org/10.1080/02664763.2025.2475349","url":null,"abstract":"<p><p>We present predictive hierarchical Bayesian models to fit continuous and positively skewed size data from small areas with the generalized beta of the second kind (GB2) distribution. We discuss three different GB2 mixture models. In the models, we have implemented the technique of small area estimation. The posterior distributions of these models are complex. We have used Taylor series approximations, grid sampling and Metropolis samplers to fit the models. We have applied our models to the per-capita consumption size data from the second Nepal Living Standards Survey. We choose the best fitted model from the three GB2 mixture models. With the best fitted model, we provide small area estimation of poverty indicators by linking the survey data with the census data. A simulation study is provided.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2448-2477"},"PeriodicalIF":1.1,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490410/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayes factors for two-group comparisons in Cox regression with an application for reverse-engineering raw data from summary statistics.","authors":"Maximilian Linde, Jorge N Tendeiro, Don van Ravenzwaaij","doi":"10.1080/02664763.2025.2472150","DOIUrl":"10.1080/02664763.2025.2472150","url":null,"abstract":"<p><p>The use of Cox proportional hazards regression to analyze time-to-event data is ubiquitous in biomedical research. Typically, the frequentist framework is used to draw conclusions about whether hazards are different between patients in an experimental and a control condition. We offer a procedure to compute Bayes factors for simple Cox models, both for the scenario where the full data are available and for the scenario where only summary statistics are available. The procedure is implemented in our 'baymedr' R package. The usage of Bayes factors remedies some shortcomings of frequentist inference and has the potential to save scarce resources.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2413-2437"},"PeriodicalIF":1.1,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490364/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An integrated change point detection and online monitoring approach for the ratio of two variables using clustering-based control charts.","authors":"Adel Ahmadi Nadi, Ali Yeganeh, Sandile Charles Shongwe, Alireza Shadman","doi":"10.1080/02664763.2025.2455625","DOIUrl":"10.1080/02664763.2025.2455625","url":null,"abstract":"<p><p>Online monitoring of the ratio of two random characteristics rather than monitoring their individual behaviors has many applications. For this aim, various control charts, known as RZ charts in the literature (e.g. Shewhart, memory-type and adaptive monitoring schemes), have been designed to detect the ratio's abnormal patterns as soon as possible. Most of the existing RZ charts rely on two assumptions about the process: (<i>i</i>) both individual characteristics are normally distributed, and (<i>ii</i>) the direction (upward or downward) of the RZ's deviation from its in-control (IC) state to an out-of-control (OC) condition is known. However, these assumptions can be violated in many practical situations. In recent years, applying machine learning (ML) models in the Statistical Process Monitoring (SPM) area has provided several contributions compared to traditional statistical methods. However, ML-based control charts have not yet been discussed in the RZ monitoring literature. To this end, this study introduces a novel clustering-based control chart for monitoring RZ in Phase II. This method avoids making any assumptions about the direction of RZ's deviation and does not need to assume a specific distribution for the two random characteristics. Furthermore, it can estimate the Change Point (CP) in the process.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 11","pages":"2060-2093"},"PeriodicalIF":1.1,"publicationDate":"2025-02-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12404067/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144992745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BSTPP: a python package for Bayesian spatiotemporal point processes.","authors":"Isaac Manring, Honglang Wang, George Mohler, Xenia Miscouridou","doi":"10.1080/02664763.2025.2462969","DOIUrl":"https://doi.org/10.1080/02664763.2025.2462969","url":null,"abstract":"<p><p>Spatiotemporal point process models have a rich history of effectively modeling event data in space and time. However, they are sometimes neglected due to the difficulty of implementing them. There is a lack of packages with the ability to perform inference for these models, particularly in Python. Thus, we present BSTPP, a Python package for Bayesian inference on spatiotemporal point processes. It offers three different kinds of models: space-time separable Log Gaussian Cox, Hawkes, and Cox Hawkes. Users may employ the predefined trigger parameterizations for the Hawkes models, or they may implement their own trigger functions with the extendable Trigger module. For the Cox models, posterior inference on the Gaussian processes is sped up with a pre-trained Variational Auto Encoder (VAE). The package includes a new flexible pre-trained VAE. We validate the model through simulation studies and then explore it by applying it to shooting data in Chicago.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 13","pages":"2524-2543"},"PeriodicalIF":1.1,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490397/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145232624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the improved estimation of the normal mixture components for longitudinal data.","authors":"Tapio Nummi, Jyrki Möttönen, Pasi Väkeväinen, Janne Salonen, Timothy E O'Brien","doi":"10.1080/02664763.2025.2459293","DOIUrl":"10.1080/02664763.2025.2459293","url":null,"abstract":"<p><p>When analyzing real data sets, statisticians often face the problem that the data are heterogeneous and it may not be possible to model this heterogeneity directly. One natural option in this case is to use methods based on finite mixtures. The key question in these techniques is often what the best number of mixtures is or, depending on the focus of the analysis, the best number of sub-populations when the model is otherwise fixed. Moreover, when the distribution of the response variable deviates from the model assumptions, it is common to employ an appropriate transformation to align the distribution with the model's requirements. To solve the problem in the mixture regression context we propose a technique based on the scaled Box-Cox transformation for normal mixtures. The specific focus here is on mixture regression for longitudinal data, the so-called trajectory analysis. We present interesting practical results as well as simulation experiments to demonstrate that our method yields reasonable results. Associated R-programs are also provided.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 12","pages":"2271-2290"},"PeriodicalIF":1.1,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12416014/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145029994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}