{"title":"Semiparametric regression analysis of panel binary data with a dependent failure time.","authors":"Lei Ge, Yang Li, Jianguo Sun","doi":"10.1080/02664763.2024.2428266","DOIUrl":"10.1080/02664763.2024.2428266","url":null,"abstract":"<p><p>In health and clinical research, panel binary data from recurrent events arise when subjects are surveyed to report occurrence statuses of recurrent events over fixed observation windows. In practice, such data can be cut short by a dependent failure event such as death. For the analysis of panel binary data, tools from generalized linear models overlook the recurrence nature of panel binary data, and other relevant literature does not accommodate the failure time. Motivated by the hospitalization data surveyed from the Health and Retirement Study, we propose a semiparametric joint-modeling-based procedure for analyzing panel binary data with a dependent failure time. For model fitting, we develop a computationally efficient EM algorithm and show the resulting estimates are consistent and asymptotically normal. Theoretical results are provided to enable valid inferences. Simulation studies have confirmed the performance of the proposed method in practical settings. The method is applied to assess important risk factors associated with incidences of hospitalization among the working elderly.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1423-1445"},"PeriodicalIF":1.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117875/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144179896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bootstrap procedure to estimate the causal effect of a public policy, considering overlap and imperfect compliance.","authors":"Stefano Cabras","doi":"10.1080/02664763.2024.2428994","DOIUrl":"10.1080/02664763.2024.2428994","url":null,"abstract":"<p><p>This paper introduces a nonparametric bootstrap method for estimating the causal effects of public policy under the circumstances of imperfect compliance and overlap. It focuses on business investment subsidies in Sardinia by comparing firms eligible for the 1999 subsidies to those not, amid issues of imperfect compliance and overlapping programs. Bootstrap confidence intervals (CI) are proposed for the average effect of treatment on the sub-population of compliers. The obtained CIs are consistent across nominal levels and robust against data nonnormality; they show coverages of credible intervals close to nominal, suggesting effectiveness for assessing causal effects. Compared to other methods, the results of the new combination of a specific estimator for incompliance and the bootstrap align with those of more modern approaches such as Bayesian Additive Regression Trees and Causal forest.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1470-1484"},"PeriodicalIF":1.2,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117864/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144179895","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A defective cure rate quantile regression model for male breast cancer data.","authors":"Agatha Rodrigues, Patrick Borges, Bruno Santos","doi":"10.1080/02664763.2024.2428272","DOIUrl":"10.1080/02664763.2024.2428272","url":null,"abstract":"<p><p>In this article, we particularly address the problem of assessing the impact of different prognostic factors, such as clinical stage and age, on the specific survival times of men with breast cancer when cure is a possibility. To this end, we developed a quantile regression model for survival data in the presence of long-term survivors based on the generalized Gompertz distribution in a defective version, which is conveniently reparametrized in terms of the <i>q</i>-th quantile and then linked to covariates via a logarithm link function. This proposal allows us to obtain how each variable affects the survival times in different quantiles. In addition, we are able to study the effects of covariates on the cure rate as well. We consider Markov Chain Monte Carlo methods to develop a Bayesian analysis in the proposed model and we evaluate its performance through Monte Carlo simulation studies. Finally, we illustrate the application of our model in a data set about male breast cancer from Brazil analyzed for the very first time.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1485-1512"},"PeriodicalIF":1.2,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147491/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient non-parametric estimation of variable productivity Hawkes processes.","authors":"Sophie Phillips, Frederic Schoenberg","doi":"10.1080/02664763.2024.2426019","DOIUrl":"10.1080/02664763.2024.2426019","url":null,"abstract":"<p><p>Several approaches to estimating the productivity function for a Hawkes point process with variable productivity are discussed, improved upon, and compared in terms of their root-mean-squared error and computational efficiency for various data sizes, and for binned as well as unbinned data. We find that for unbinned data, a regularized version of the analytic maximum likelihood estimator proposed by Schoenberg is the most accurate but is computationally burdensome. The unregularized version of the estimator is faster to compute but has lower accuracy, though both estimators outperform empirical or binned least squares estimators in terms of root-mean-squared error, especially when the mean productivity is 0.2 or greater. For binned data, binned least squares estimates are highly efficient both in terms of computation time and root-mean-squared error. An extension to estimating transmission time density is discussed, and an application to estimating the productivity of Covid-19 in the United States as a function of time from January 2020 to July 2022 is provided.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1405-1422"},"PeriodicalIF":1.2,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117874/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144181444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting economic growth by combining local linear and standard approaches.","authors":"Marlon Fritz, Sarah Forstinger, Yuanhua Feng, Thomas Gries","doi":"10.1080/02664763.2024.2424920","DOIUrl":"https://doi.org/10.1080/02664763.2024.2424920","url":null,"abstract":"<p><p>Today, developing economies are of major importance for global macroeconomic development. However, the empirical analysis and especially the forecasting of macroeconomic time series remain difficult due to a lack of sufficient data, data frequency, high volatility, and non-linear developments. These difficulties require more sophisticated approaches to obtain reliable forecasts. Therefore, we propose an improved forecasting method especially for growth data based on a data-driven local linear trend estimation with an extended iterative plug-in algorithm for determining the bandwidth endogenously. This approach allows a smooth trend estimation that takes care of temporary changes in trend processes. Further, the naïve random walk model is extended for forecasting by including a local linear, time-varying drift. We apply this method to GDP development for six developing and two advanced economies and compare different forecast combinations. The combinations that include the local linear approach and the random walk with a local linear trend improve forecasting accuracy and reduce variance.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1342-1360"},"PeriodicalIF":1.2,"publicationDate":"2024-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117868/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144182001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robust distance-based approach for detecting multidimensional outliers.","authors":"R Lakshmi, T A Sajesh","doi":"10.1080/02664763.2024.2422403","DOIUrl":"10.1080/02664763.2024.2422403","url":null,"abstract":"<p><p>Identifying outliers in data analysis is a critical task, as outliers can significantly influence the results and conclusions drawn from a dataset. This study explores the use of the Mahalanobis distance metric for detecting outliers in multivariate data, focusing on a novel approach inspired by the work of M. Falk, [<i>On mad and comedians</i>, Ann. Inst. Stat. Math. 49 (1997), pp. 615-644]. The proposed method is rigorously tested through extensive simulation analysis, where it demonstrates high True Positive Rates (TPR) and low False Positive Rates (FPR) when compared to other existing outlier detection techniques. Through extensive simulation analysis, we empirically evaluate the affine equivariance and breakdown properties of our proposed distance measure and it is evident from the outputs that our robust distance measure demonstrates effective results with respect to the measures FPR and TPR. The proposed method was applied to seven different datasets, showing promising true positive rates (TPR) and false positive rates (FPR), and it outperformed several well-known outlier identification approaches. We can effectively use our proposed distance measure in fields demanding outlier detection.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 6","pages":"1278-1298"},"PeriodicalIF":1.2,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12035934/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144016593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering in point processes on linear networks using nearest neighbour volumes.","authors":"Juan F Díaz-Sepúlveda, Nicoletta D'Angelo, Giada Adelfio, Jonatan A González, Francisco J Rodríguez-Cortés","doi":"10.1080/02664763.2024.2411214","DOIUrl":"10.1080/02664763.2024.2411214","url":null,"abstract":"<p><p>This study introduces a novel method specifically designed to detect clusters of points within linear networks. This method extends a classification approach used for point processes in spatial contexts. Unlike traditional methods that operate on planar spaces, our approach adapts to the unique geometric challenges of linear networks, where classical properties of point processes are altered, and intuitive data visualisation becomes more complex. Our method utilises the distribution of the <i>K</i>th nearest neighbour volumes, extending planar-based clustering techniques to identify regions of increased point density within a network. This approach is particularly effective for distinguishing overlapping Poisson processes within the same linear network. We demonstrate the practical utility of our method through applications to road traffic accident data from two Colombian cities, Bogota and Medellin. Our results reveal distinct clusters of high-density points in road segments where severe traffic accidents (resulting in injuries or fatalities) are most likely to occur, highlighting areas of increased risk. These clusters were primarily located on major arterial roads with high traffic volumes. In contrast, low-density points corresponded to areas with fewer accidents, likely due to lower traffic flow or other mitigating factors. Our findings provide valuable insights for urban planning and road safety management.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 5","pages":"993-1016"},"PeriodicalIF":1.2,"publicationDate":"2024-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951330/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143752900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust estimation of the incubation period and the time of exposure using <i>γ</i>-divergence.","authors":"Daisuke Yoneoka, Takayuki Kawashima, Yuta Tanoue, Shuhei Nomura, Akifumi Eguchi","doi":"10.1080/02664763.2024.2420221","DOIUrl":"https://doi.org/10.1080/02664763.2024.2420221","url":null,"abstract":"<p><p>Estimating the exposure time to single infectious pathogens and the associated incubation period, based on symptom onset data, is crucial for identifying infection sources and implementing public health interventions. However, data from rapid surveillance systems designed for early outbreak warning often come with outliers originated from individuals who were not directly exposed to the initial source of infection (i.e. tertiary and subsequent infection cases), making the estimation of exposure time challenging. To address this issue, this study uses a three-parameter lognormal distribution and proposes a new <i>γ</i>-divergence-based robust approach for estimating the parameter corresponding to exposure time with a tailored optimization procedure using the majorization-minimization algorithm, which ensures the monotonic decreasing property of the objective function. Comprehensive numerical experiments and real data analyses suggest that our method is superior to conventional methods in terms of bias, mean squared error, and coverage probability of 95% confidence intervals.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 6","pages":"1239-1257"},"PeriodicalIF":1.2,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12035932/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143992898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An optimal subsampling design for large-scale Cox model with censored data.","authors":"Shiqi Liu, Zilong Xie, Ming Zheng, Wen Yu","doi":"10.1080/02664763.2024.2423234","DOIUrl":"10.1080/02664763.2024.2423234","url":null,"abstract":"<p><p>Subsampling designs are useful for reducing computational load and storage cost for large-scale data analysis. For massive survival data with right censoring, we propose a class of optimal subsampling designs under the widely-used Cox model. The proposed designs utilize information from both the outcome and the covariates. Different forms of the design can be derived adaptively to meet various targets, such as optimizing the overall estimation accuracy or minimizing the variation of specific linear combination of the estimators. Given the subsampled data, the inverse probability weighting approach is employed to estimate the model parameters. The resultant estimators are shown to be consistent and asymptotically normally distributed. Simulation results indicate that the proposed subsampling design yields more efficient estimators than the uniform subsampling by using subsampled data of comparable sample sizes. Additionally, the subsampling estimation significantly reduces the computational load and storage cost relative to the full data estimation. An analysis of a real data example is provided for illustration.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1315-1341"},"PeriodicalIF":1.2,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12123965/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144199240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient fully Bayesian approach to brain activity mapping with complex-valued fMRI data.","authors":"Zhengxin Wang, Daniel B Rowe, Xinyi Li, D Andrew Brown","doi":"10.1080/02664763.2024.2422392","DOIUrl":"10.1080/02664763.2024.2422392","url":null,"abstract":"<p><p>Functional magnetic resonance imaging (fMRI) enables indirect detection of brain activity changes via the blood-oxygen-level-dependent (BOLD) signal. Conventional analysis methods mainly rely on the real-valued magnitude of these signals. In contrast, research suggests that analyzing both real and imaginary components of the complex-valued fMRI (cv-fMRI) signal provides a more holistic approach that can increase power to detect neuronal activation. We propose a fully Bayesian model for brain activity mapping with cv-fMRI data. Our model accommodates temporal and spatial dynamics. Additionally, we propose a computationally efficient sampling algorithm, which enhances processing speed through image partitioning. Our approach is shown to be computationally efficient via image partitioning and parallel computation while being competitive with state-of-the-art methods. We support these claims with both simulated numerical studies and an application to real cv-fMRI data obtained from a finger-tapping experiment.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 6","pages":"1299-1314"},"PeriodicalIF":1.2,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12035935/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143998676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}