{"title":"Latent class profile model with time-dependent covariates: a study on symptom patterning of patients for head and neck cancer.","authors":"Jung Wun Lee, Hayley Dunnack Yackel","doi":"10.1080/02664763.2024.2435997","DOIUrl":"10.1080/02664763.2024.2435997","url":null,"abstract":"<p><p>The latent class profile model (LCPM) is a widely used technique for identifying distinct subgroups within a sample based on observations' longitudinal responses to categorical items. This paper proposes an expanded version of LCPM by embedding time-specific structures. Such development allows analysts to investigate associations between latent class memberships and time-dependent predictors at specific time points. We suggest a simultaneous estimation of latent class measurement parameters via the expectation-maximization (EM) algorithm, which yields valid point and interval estimators of associations between latent class memberships and covariates. We illustrate the validity of our estimation strategy via numerical studies. In addition, we demonstrate the novelty of the proposed model by analyzing the head and neck cancer data set.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1628-1648"},"PeriodicalIF":1.2,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147489/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Delaying bud-break on pecan trees: a Bayesian longitudinal multinomial regression approach.","authors":"Dayna P Saldaña Zepeda, Richard Heerema, Ciro Velasco Cruz, William Giese, Joshua Sherman","doi":"10.1080/02664763.2024.2436007","DOIUrl":"10.1080/02664763.2024.2436007","url":null,"abstract":"<p><p>A multivariate Bayesian Probit model is adapted to analyze a longitudinal multiclass-ordinal response, with a linear plateau as the longitudinal model. Measurements on pecan bud growth were collected on irregular time intervals, about a week apart from late March to mid April, using a six-level ordinal scale. The data are from two randomized complete block designs with four blocks each. The experiments were setup and initiated in 2018 in a pecan orchard, at two different locations, to evaluate the effect of two sets of four treatments on delaying growth of recently broken pecan buds to minimize bud loss due to low temperatures. A simulation study was successfully carried out to validate the model implementation. Treatment 3 of Experiment 1 was associated with the greatest reduction in bud growth rate. In Experiment 2, Treatments 2 and 3 had some effect on delaying bud growth. 
Although treatment effects were not statistically different in either experiment, this paper presents a practical and efficient modeling technique for longitudinal multinomial ordinal data, a common data type in applied agricultural research studies.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1649-1669"},"PeriodicalIF":1.2,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147487/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A clustering approach to integrative analyses of multiomic cancer data.","authors":"Dongyan Yan, Subharup Guha","doi":"10.1080/02664763.2024.2431742","DOIUrl":"10.1080/02664763.2024.2431742","url":null,"abstract":"<p><p>Rapid technological advances have allowed for molecular profiling across multiple omics domains for clinical decision-making in many diseases, especially cancer. However, as tumor development and progression are biological processes involving composite genomic aberrations, key challenges are to effectively assimilate information from these domains to identify genomic signatures and druggable biological entities, develop accurate risk prediction profiles for future patients, and identify novel patient subgroups for tailored therapy and monitoring. We propose integrative frameworks for high-dimensional multiple-domain cancer data. These Bayesian mixture model-based approaches coherently incorporate dependence within and between domains to accurately detect tumor subtypes, thus providing a catalog of genomic aberrations associated with cancer taxonomy. The flexible and scalable Bayesian nonparametric strategy performs simultaneous bidirectional clustering of the tumor samples and genomic probes to achieve dimension reduction. We describe an efficient variable selection procedure that can identify relevant genomic aberrations and potentially reveal underlying drivers of disease. Although the work is motivated by lung cancer datasets, the proposed methods are broadly applicable in a variety of contexts involving high-dimensional data. 
The success of the methodology is demonstrated using artificial data and lung cancer omics profiles publicly available from The Cancer Genome Atlas.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1539-1560"},"PeriodicalIF":1.2,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147493/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robust Bayesian latent position approach for community detection in networks with continuous attributes.","authors":"Zhumengmeng Jin, Juan Sosa, Shangchen Song, Brenda Betancourt","doi":"10.1080/02664763.2024.2431736","DOIUrl":"10.1080/02664763.2024.2431736","url":null,"abstract":"<p><p>The increasing prevalence of multiplex networks has spurred a critical need to take into account potential dependencies across different layers, especially when the goal is community detection, which is a fundamental learning task in network analysis. We propose a full Bayesian mixture model for community detection in both single-layer and multi-layer networks. A key feature of our model is the joint modeling of the nodal attributes that often come with the network data as a spatial process over the latent space. In addition, our model for multi-layer networks allows layers to have different strengths of dependency in the unique latent position structure and assumes that the probability of a relation between two actors (in a layer) depends on the distances between their latent positions (multiplied by a layer-specific factor) and the difference between their nodal attributes. Under our prior specifications, the actors' positions in the latent space arise from a finite mixture of Gaussian distributions, each corresponding to a cluster. Simulated examples show that our model outperforms existing benchmark models and exhibits significantly greater robustness when handling datasets with missing values. 
The model is also applied to a real-world three-layer network of employees in a law firm.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1513-1538"},"PeriodicalIF":1.2,"publicationDate":"2024-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147515/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parameter estimation for stable distributions and their mixture.","authors":"Omar Hajjaji, Solym Mawaki Manou-Abi, Yousri Slaoui","doi":"10.1080/02664763.2024.2434627","DOIUrl":"10.1080/02664763.2024.2434627","url":null,"abstract":"<p><p>In this paper, we consider estimating the parameters of univariate <i>α</i>-stable distributions and their mixtures. First, using a Gaussian kernel density distribution estimator, we propose an estimation method based on the characteristic function. The optimal bandwidth parameter was selected using a plug-in method. We highlight another estimation procedure for the Maximum Likelihood framework based on the False position algorithm to find a numerical root of the log-likelihood through the score functions. For mixtures of <i>α</i>-stable distributions, the EM algorithm and the Bayesian estimation method have been modified to propose an efficient and valuable tool for parameter estimation. The proposed methods can be generalised to multiple mixtures, although we have limited the mixture study to two components. A simulation study is carried out to evaluate the performance of our methods, which are then applied to real data. Our results appear to accurately estimate mixtures of <i>α</i>-stable distributions. Applications concern the estimation of the number of replicates in the Mayotte COVID-19 dataset and the distribution of the N-acetyltransferase activity of the Bechtel et al. data for a urinary caffeine metabolite implicated in carcinogens. We compare the proposed methods, together with a detailed discussion. 
We conclude with the limitations of this study, together with other forthcoming work and a future implementation of an R package or Python library for the proposed methods in data modelling.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1594-1627"},"PeriodicalIF":1.2,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147516/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A bivariate load-sharing model.","authors":"Debasis Kundu","doi":"10.1080/02664763.2024.2428267","DOIUrl":"10.1080/02664763.2024.2428267","url":null,"abstract":"<p><p>The motivation of this work came from a data set obtained from an experiment performed on diabetic patients, with diabetic retinopathy disorder. The aim of this experiment is to test whether there is any significant difference between two different treatments which are being used for this disease. The two eyes can be considered as a two-component load-sharing system. In a two-component load-sharing system after the failure of one component, the surviving component has to shoulder extra load. Hence, it is prone to failure at an earlier time than what is expected under the original model. It may also happen sometimes that the failure of one component may release extra resources to the survivor, thus delaying the failure. In most of the existing literature, it has been assumed that at the beginning the lifetime distributions of the two components are independently distributed, which may not be very reasonable in this case. In this paper, we have introduced a new bivariate load-sharing model where the independence assumptions of the lifetime distributions of the two components at the beginning have been relaxed. In this present model, they may be dependent. Further, there is a positive probability that the two components may fail simultaneously. If the two components do not fail simultaneously, it is assumed that the lifetime of the surviving component changes based on the tampered failure rate assumption. The proposed bivariate distribution has a singular component. The likelihood inference of the unknown parameters has been provided. 
Simulation results and the analysis of the data set have been presented to show the effectiveness of the proposed model.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1446-1469"},"PeriodicalIF":1.2,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12123949/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144199239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unit-level one-inflated beta model for small area prediction of seat-belt use rates.","authors":"Zirou Zhou, Emily Berg","doi":"10.1080/02664763.2024.2426016","DOIUrl":"10.1080/02664763.2024.2426016","url":null,"abstract":"<p><p>We develop a unit-level one-inflated beta model for the purpose of small area estimation. Our specific interest is in estimation of seat-belt use rates for Iowa counties using data from the Iowa Seat-Belt Use Survey. As a result of small county sample sizes, small area estimation methods are needed. We propose frequentist and Bayesian implementations of a unit-level one-inflated beta model. We compare the Bayesian and frequentist predictors to simpler alternatives through simulation. We apply the proposed Bayesian and frequentist procedures to data from the Iowa Seat-Belt Use Survey.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1381-1404"},"PeriodicalIF":1.2,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117865/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144180828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tests of covariate effects under finite Gaussian mixture regression models.","authors":"Chong Gan, Jiahua Chen, Zeny Feng","doi":"10.1080/02664763.2024.2433567","DOIUrl":"10.1080/02664763.2024.2433567","url":null,"abstract":"<p><p>Mixture of regression model is widely used to cluster subjects from a suspected heterogeneous population due to differential relationships between response and covariates over unobserved subpopulations. In such applications, statistical evidence pertaining to the significance of a hypothesis is important yet missing to substantiate the findings. In this case, one may wish to test hypotheses regarding the effect of a covariate such as its overall significance. If confirmed, a further test of whether its effects are different in different subpopulations might be performed. This paper is motivated by the analysis of Chiroptera dataset, in which, we are interested in knowing how forearm length development of bat species is influenced by precipitation within their habitats and living regions using finite Gaussian mixture regression (GMR) model. Since precipitation may have different effects on the evolutionary development of the forearm across the underlying subpopulations among bat species worldwide, we propose several testing procedures for hypotheses regarding the effect of precipitation on forearm length under finite GMR models. 
In addition to the real analysis of Chiroptera data, through simulation studies, we examine the performances of these testing procedures on their type I error rate, power, and consequently, the accuracy of clustering analysis.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1571-1593"},"PeriodicalIF":1.2,"publicationDate":"2024-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147513/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mixture mean residual life model for competing risks data with mismeasured covariates.","authors":"Chyong-Mei Chen, Chih-Ching Lin, Chih-Cheng Wu, Jia-Ren Tsai","doi":"10.1080/02664763.2024.2426015","DOIUrl":"10.1080/02664763.2024.2426015","url":null,"abstract":"<p><p>This paper proposes a mixture regression model for competing risks data, where the logistic regression model is specified for the marginal probabilities of the failure types and the mean residual lifetime (MRL) model is assumed for the failure time given the failure of interest. The estimating equations (EEs) are derived to infer the logistic regression and MRL model separately. We further consider the situation where the covariates are subject to measurement error. The presence of measurement error imposes extra challenges for the analysis of complex time-to-event data. By using the above EEs as the correction-amenable original estimating functions, we propose a corrected score estimation, which does not require specifying the distributions for unobserved error-prone covariates. The proposed estimators are shown to be consistent and asymptotically normally distributed. The performance of the method is investigated by intensive simulation studies and two real examples are presented to illustrate the proposed methods.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 7","pages":"1361-1380"},"PeriodicalIF":1.2,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117869/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144180957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Change-point detection of the Kumaraswamy skew-t distribution based on modified information criterion.","authors":"Jun Wang, Wei Ning","doi":"10.1080/02664763.2024.2431743","DOIUrl":"https://doi.org/10.1080/02664763.2024.2431743","url":null,"abstract":"<p><p>In this paper, we study the change-point problem of the Kumaraswamy skew-t distribution. An approach based on the modified information criterion is proposed to detect the changes of the parameters of this distribution. Simulations have been conducted to investigate the performance of the proposed method. The proposed method is applied to real data to illustrate the detection procedure.</p>","PeriodicalId":15239,"journal":{"name":"Journal of Applied Statistics","volume":"52 8","pages":"1561-1570"},"PeriodicalIF":1.2,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12147483/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144266338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}