{"title":"Deep learning for the joint analysis of item-level longitudinal and survival data.","authors":"Jeffrey Lin, Sheng Luo","doi":"10.1080/02664763.2026.2625122","DOIUrl":"https://doi.org/10.1080/02664763.2026.2625122","url":null,"abstract":"<p><p>Patients with neurodegenerative diseases are often assessed using rating scales containing a number of items, where each item is scored based on an ordinal scale of 0 to <math><mi>K</mi></math> (0 indicating normal function and <math><mi>K</mi></math> indicating severe impairment). The total score, calculated as the total sum of the items, is commonly used for subsequent analysis due to its simplicity. However, the total score is treated as a continuous value and does not respect the ordinal nature of the item-level data. In addition, the total score may lead to information loss as neurodegenerative diseases are multi-faceted, and using a single numeric value may not effectively represent the disease progression. In this article, we propose a convolutional neural network (CNN) designed to take longitudinal ordinal items as input and predict patients' future survival trajectories. We demonstrate that using the item-level data improves the predictive performance in comparison to traditional joint models using the total score. 
These advantages are shown through both a simulation study and real data application to a Parkinson's disease study.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13128146/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating Baseline Cutoffs for DHA Dosage in Preterm Birth Prevention: A Bayesian Personalized Change-Point Analysis.","authors":"Jianzheng Wu, Danielle N Christifano, Susan E Carlson, Byron J Gajewski","doi":"10.1080/02664763.2026.2625118","DOIUrl":"10.1080/02664763.2026.2625118","url":null,"abstract":"<p><p>Preterm birth (PTB, <37 weeks gestation) is the leading cause of infant mortality and of significant health and socioeconomic burdens that affect millions of newborns and families. While docosahexaenoic acid (DHA) supplementation has shown promise in reducing PTB risk, its effectiveness at reducing the most consequential early PTB (ePTB, <34 weeks gestation) depends on baseline DHA levels, with lower DHA levels and intake linked to a higher risk of PTB and ePTB that can be reduced by high-dose DHA supplementation. Given the higher costs of high-dose DHA, personalized treatment strategies based on baseline DHA levels are needed. We proposed a novel Bayesian personalized change-point model to optimize DHA supplementation strategies based on individual baseline DHA intake. By incorporating Bayesian change-point, dynamic linear, and normal mixture models, our approach estimates optimal DHA baseline thresholds and distribution. We applied this model to real-world data and simulated trials to demonstrate its ability to improve secondary analysis and trial design by adjusting for baseline DHA heterogeneity. 
This personalized approach can help clinicians identify optimal DHA supplementation doses for individual patients, and it can be applied to other trial studies where the heterogeneous characteristics of patients can be quantified.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12959820/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147365525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Controlling FDR in selecting group-level simultaneous signals from multiple data sources with application to the National COVID Collaborative Cohort data.","authors":"Runqiu Wang, Ran Dai, Hongying Dai, Evan French, Cheng Zheng","doi":"10.1080/02664763.2025.2606238","DOIUrl":"https://doi.org/10.1080/02664763.2025.2606238","url":null,"abstract":"<p><p>One challenge in exploratory association studies using observational data is that the associations between the predictors and the outcome are potentially weak and rare, and the candidate predictors have complex correlation structures. False discovery rate (FDR) controlling procedures can provide important statistical guarantees for replicability in predictor identification in exploratory research. In the recently established National COVID Collaborative Cohort (N3C), electronic health record (EHR) data on the same set of grouped candidate predictors are independently collected in multiple different data contributing sites, offering opportunities to identify true associations by combining information from different sources. One challenge is to handle the heterogeneous data types for the same clinical endpoint from the multiple sites. This paper addresses this challenge by presenting a general knockoff-based variable selection algorithm to identify associations from unions of group-level conditional independence tests (simultaneous signals) with exact FDR control guarantees under finite sample settings. This algorithm can work with general regression settings, allowing heterogeneity of both the predictors and the outcomes across multiple data sources. 
We demonstrate the performance of this method with extensive numerical studies and an application to the N3C data.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2026-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13120768/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147772600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A review and comparison of methods of testing for heteroskedasticity in the linear regression model.","authors":"Thomas Farrar, Renette Blignaut, Retha Luus, Sarel Steel","doi":"10.1080/02664763.2025.2575038","DOIUrl":"https://doi.org/10.1080/02664763.2025.2575038","url":null,"abstract":"<p><p>This study reviews inferential methods for diagnosing heteroskedasticity in the linear regression model, classifying the methods into four types: deflator tests, auxiliary design tests, omnibus tests, and portmanteau tests. A Monte Carlo simulation experiment is used to compare the performance of deflator tests and the performance of auxiliary design and omnibus tests, using the metric of average excess power over size. Certain lesser-known tests (that are not included with some standard statistical software) are found to outperform better-known tests. For instance, the best-performing deflator test was the Evans-King test, and the best-performing auxiliary design and omnibus tests were Verbyla's test and the Cook-Weisberg test, and not standard methods such as White's test and the Breusch-Pagan-Koenker test.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 16","pages":"3121-3150"},"PeriodicalIF":1.1,"publicationDate":"2025-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12683758/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145714436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Empirical determination of baseball eras: multivariate change point analysis in major league baseball.","authors":"Mena C R Whalen, Gregory J Matthews, Brian M Mills","doi":"10.1080/02664763.2025.2552723","DOIUrl":"https://doi.org/10.1080/02664763.2025.2552723","url":null,"abstract":"<p><p>We use multivariate change point analysis methods to identify not only mean shifts but also changes in variance across a wide array of statistical time series. Our primary objective is to empirically discern distinct eras in the evolution of baseball, shedding light on significant transformations in team performance and management strategies. We employ baseball statistics from the late 1800s to 2021, spanning over a century of the sport's history. Results confirm previous historical research, pinpointing well-known baseball eras, such as the Dead Ball Era, Integration Era, Steroid Era, and Post-Steroid Era. Moreover, the study investigates changes in team performance, effectively identifying periods of both dynasties and collapses within a team's history. The multivariate change point analysis proves to be a valuable tool for understanding the dynamics of baseball's evolution. The method offers a data-driven approach to unveil structural shifts in the sport's historical landscape, providing fresh insights into the impact of rule changes, player strategies, and external factors on baseball's evolution. 
This not only enhances our comprehension of baseball, showing more robust identification of eras than past univariate time series work, but also showcases the broader applicability of multivariate change point analysis in the domain of sports research and beyond.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"53 6","pages":"1158-1179"},"PeriodicalIF":1.1,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134745/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian semi-parametric approaches to normal/independent and elliptical distributions.","authors":"M Remedios Sillero-Denamiel, J Miguel Marín, Pepa Ramírez-Cobo, Fabrizio Ruggeri, Michael P Wiper","doi":"10.1080/02664763.2025.2552728","DOIUrl":"https://doi.org/10.1080/02664763.2025.2552728","url":null,"abstract":"<p><p>This article introduces a novel, Bayesian, semi-parametric approach to inference for both elliptical and normal/independent distributions. The location and scale parameters are modelled parametrically and a suitable transformation of the modular variable is modelled using Dirichlet process mixtures. A feature of our approach is that the partial lack of identifiability inherent in both elliptical and normal/independent distributions can be accounted for by incorporating a restriction on the diagonal elements of the scale matrix. Posterior computation is carried out using a Markov chain Monte Carlo algorithm. A novel technique for model selection, based on an approximation of the deviation information criterion, is introduced. As shown by a numerical study based on simulation, the approach can be used to discriminate between elliptical and normal/independent distributions. Finally, our methodology is illustrated with both simulated and real data.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"53 6","pages":"1130-1157"},"PeriodicalIF":1.1,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134756/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"<i>APCanalysis</i>: an <i>R</i> package for identifying active factors using the APC method.","authors":"Abu Zar Md Shafiullah, Arden Miller","doi":"10.1080/02664763.2025.2540378","DOIUrl":"https://doi.org/10.1080/02664763.2025.2540378","url":null,"abstract":"<p><p>Unreplicated two-level designs play an important role in screening experiments. Notably, Plackett-Burman designs (PBDs) and regular fractional factorial designs ( <math><msup><mn>2</mn> <mrow><mi>k</mi> <mo>-</mo> <mi>p</mi></mrow> </msup> </math> ) are commonly used for their flexible run sizes and orthogonal contrasts. However, classical methods such as <i>t</i>-tests are not applicable for identifying significant effects in saturated models due to insufficient degrees of freedom. The All Possible Comparisons (APC) method addresses this limitation by providing an objective framework to control false positive rates, specifically the individual error rate (IER) and the experiment-wise error rate (EER). In this article, we introduce APCanalysis, a user-friendly R package that implements the APC method using a tailored AIC-type model selection criterion, the APC-criterion. The package features an advanced penalty algorithm extending error control to the false discovery rate (FDR). It supports main effects screening in PBDs, active main effects and interactions in full factorial ( <math><msup><mn>2</mn> <mi>k</mi></msup> </math> ) and resolution-V ( <math><msubsup><mn>2</mn> <mi>V</mi> <mrow><mi>k</mi> <mo>-</mo> <mi>p</mi></mrow> </msubsup> </math> ) designs, and detects active alias strings in resolution-IV ( <math><msubsup><mn>2</mn> <mrow><mi>IV</mi></mrow> <mrow><mi>k</mi> <mo>-</mo> <mi>p</mi></mrow> </msubsup> </math> ) and resolution-III ( <math><msubsup><mn>2</mn> <mrow><mi>III</mi></mrow> <mrow><mi>k</mi> <mo>-</mo> <mi>p</mi></mrow> </msubsup> </math> ) designs. 
Through examples, simulations, and real-world data, we demonstrate that the APC-criterion reliably identifies active factors while maintaining user-specified error thresholds for IER, EER, or FDR. Benchmarking against Lenth's method in a validation study further confirms strong agreement in screening power and accuracy. This article provides practical guidance on applying APCanalysis, highlighting its advantages and limitations. The package is available via the Comprehensive R Archive Network (CRAN).</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"53 5","pages":"937-957"},"PeriodicalIF":1.1,"publicationDate":"2025-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13047717/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147623023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian analysis on single server Markovian queueing model with impatient customers.","authors":"Gulab Singh Bura, Himanshi Sharma","doi":"10.1080/02664763.2025.2552722","DOIUrl":"https://doi.org/10.1080/02664763.2025.2552722","url":null,"abstract":"<p><p>This paper focuses on Bayesian inference for an M/M/1 queuing model with balking, a phenomenon in which customers choose not to join a queue because of a long waiting line. The balking probability is considered as a function of the number of customers and their impatience level. The degree of impatience of customers plays a crucial role in affecting the balking probability. A higher threshold of impatience implies that customers are more sensitive to queue length, i.e. they are less willing to join a queue even when it is only slightly longer. Conversely, a lower threshold of impatience indicates that customers are less likely to balk. In this scenario, there is a higher probability that customers will opt to join the queue, even when it extends to a considerable length. This paper provides the Bayesian estimates for traffic intensity (<i>ρ</i>), employing various prior distributions such as beta, truncated gamma, and uniform prior distributions under a squared error loss function. Through the sampling importance re-sampling (SIR) technique, we obtained the posterior estimates, risk, and credible intervals that showcase the effectiveness of our methodology. 
Furthermore, simulation studies demonstrate the convergence of estimators, and our findings are further validated through analysis of real-life data.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"53 6","pages":"1098-1129"},"PeriodicalIF":1.1,"publicationDate":"2025-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13134755/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147815605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal ridge estimation in the restricted logistic semiparametric regression models using generalized cross-validation.","authors":"Mahdi Roozbeh","doi":"10.1080/02664763.2025.2541252","DOIUrl":"https://doi.org/10.1080/02664763.2025.2541252","url":null,"abstract":"<p><p>Binary logistic semiparametric regression analysis is a commonly used statistical technique when the dependent variable is dichotomous or binary. In this analysis, the relationship between the success probability and certain explanatory variables is assumed to have a linear form, while the relationship to other variables is unknown. Multicollinearity is a serious problem that arises when explanatory variables in logistic semiparametric regression are highly correlated. It is well known that the variance of the maximum likelihood estimator is inflated due to multicollinearity in the semiparametric logistic regression model. Therefore, a novel stochastic restricted iterative weighted ridge estimator for logistic semiparametric regression is introduced, and its statistical properties are extracted asymptotically. Moreover, an extension of the generalized cross validation (GCV) function is introduced and applied for choosing the best values of the ridge parameter and the bandwidth of the kernel smoother. Additionally, some theorems are developed to illustrate the convergence of the GCV mean. 
Ultimately, Monte Carlo simulation studies and an analysis of a real-life data set are conducted to support our theoretical discussion, and the findings indicate that the new estimator outperforms the other estimators under consideration.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"53 5","pages":"874-893"},"PeriodicalIF":1.1,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13045183/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147623074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}