{"title":"The role of pairwise matching in experimental design for an incidence outcome","authors":"Adam Kapelner, Abba M. Krieger, David Azriel","doi":"10.1111/anzs.12403","DOIUrl":"https://doi.org/10.1111/anzs.12403","url":null,"abstract":"We consider the problem of evaluating designs for a two-arm randomised experiment with an incidence (binary) outcome under a non-parametric general response model. Our two main results are that the a priori pair-matching design is (1) optimal, as measured by mean squared error, among all block designs, a class that includes complete randomisation, and (2) minimax, that is, it provides the lowest mean squared error under an adversarial response model. Theoretical results are supported by simulations and clinical trial data, where we demonstrate the superior performance of pairwise matching designs under realistic conditions.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138517672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measurement errors in semi‐parametric generalised regression models","authors":"Mohammad W. Hattab, David Ruppert","doi":"10.1111/anzs.12400","DOIUrl":"https://doi.org/10.1111/anzs.12400","url":null,"abstract":"Regression models that ignore measurement error in predictors may produce highly biased estimates leading to erroneous inferences. It is well known that it is extremely difficult to take measurement error into account in Gaussian non‐parametric regression. This problem becomes even more difficult when considering other families such as binary, Poisson and negative binomial regression. We present a novel method aiming to correct for measurement error when estimating regression functions. Our approach is sufficiently flexible to cover virtually all distributions and link functions regularly considered in generalised linear models. This approach depends on approximating the first and the second moment of the response after integrating out the true unobserved predictors in any semi‐parametric generalised regression model. By the latter is meant a model with both linear and non‐parametric effects that are connected to the mean response by a link function and with a response distribution in an exponential family or quasi‐likelihood model. Unlike previous methods, the method we now propose is not restricted to truncated splines and can utilise various basis functions. Moreover, it can operate without making any distributional assumption about the unobserved predictor. Through extensive simulation studies, we study the performance of our method under many scenarios.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136212682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparisons of distributions of Australian mental health scores","authors":"D. Gunawan, William E. Griffiths, D. Chotikapanich","doi":"10.1111/anzs.12399","DOIUrl":"https://doi.org/10.1111/anzs.12399","url":null,"abstract":"Bayesian non‐parametric estimates of Australian distributions of mental health scores are obtained to assess how the mental health status of the population has changed over time, and to compare the mental health status of female/male and Aboriginal/non‐Aboriginal population subgroups. First‐order and second‐order stochastic dominance are used to compare distributions, with results presented in terms of the posterior probability of dominance and the posterior probability of no dominance. If a criterion for dominance is satisfied, then, in terms of that criterion, the mental health status of the dominant population is superior to that of the dominated population. If neither distribution is dominant, then the mental health status of neither population is superior in the same sense. Our results suggest mental health has deteriorated in recent years, that males' mental health status is better than that of females, and that non‐Aboriginal health status is better than that of the Aboriginal population.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136212528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedding latent class regression and latent class distal outcome models into cluster-weighted latent class analysis: a detailed simulation experiment","authors":"Roberto Di Mari, Antonio Punzo, Zsuzsa Bakk","doi":"10.1111/anzs.12396","DOIUrl":"https://doi.org/10.1111/anzs.12396","url":null,"abstract":"Usually in latent class (LC) analysis, external predictors are taken to be cluster conditional probability predictors (LC models with external predictors), and/or score conditional probability predictors (LC regression models). In such cases, their distribution is not of interest. Class-specific distribution is of interest in the distal outcome model, when the distribution of the external variables is assumed to depend on LC membership. In this paper, we consider a more general formulation that embeds both the LC regression and the distal outcome models, as is typically done in cluster-weighted modelling. This allows us to investigate (1) whether the distribution of the external variables differs across classes, and (2) whether there are significant direct effects of the external variables on the indicators, by modelling jointly the relationship between the external and the latent variables. We show the advantages of the proposed modelling approach through a set of artificial examples, an extensive simulation study and an empirical application about psychological contracts among employees and employers in Belgium and the Netherlands.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/anzs.12396","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50141418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian modelling of effects of prenatal alcohol exposure on child cognition based on data from multiple cohorts","authors":"Khue-Dung Dang, Louise M. Ryan, Tugba Akkaya Hocagil, Richard J. Cook, Gale A. Richardson, Nancy L. Day, Claire D. Coles, Heather Carmichael Olson, Sandra W. Jacobson, Joseph L. Jacobson","doi":"10.1111/anzs.12397","DOIUrl":"https://doi.org/10.1111/anzs.12397","url":null,"abstract":"High levels of prenatal alcohol exposure (PAE) result in significant cognitive deficits in children, but the exact nature of the dose-response relationship is less well understood. To investigate this relationship, data were assembled from six longitudinal birth cohort studies examining the effects of PAE on cognitive outcomes from early school age through adolescence. Structural equation models (SEMs) are a natural approach to consider, because of the way they conceptualise multiple observed outcomes as relating to an underlying latent variable of interest, which can then be modelled as a function of exposure and other predictors of interest. However, conventional SEMs could not be fitted in this context because slightly different outcome measures were used in the six studies. In this paper we propose a multi-group Bayesian SEM that maps the unobserved cognition variable to a broad range of observed outcomes. The relation between these variables and PAE is then examined while controlling for potential confounders via propensity score adjustment. By examining different possible dose-response functions, the proposed framework is used to investigate whether there is a threshold PAE level that results in minimal cognitive deficit.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50125566","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The multivariate component zero-inflated Poisson model for correlated count data analysis","authors":"Qin Wu, Guo-Liang Tian, Tao Li, Man-Lai Tang, Chi Zhang","doi":"10.1111/anzs.12395","DOIUrl":"https://doi.org/10.1111/anzs.12395","url":null,"abstract":"Multivariate zero-inflated Poisson (ZIP) distributions are important tools for modelling and analysing correlated count data with extra zeros. Unfortunately, existing multivariate ZIP distributions consider only the overall zero-inflation while the component zero-inflation is not well addressed. This paper proposes a flexible multivariate ZIP distribution, called the multivariate component ZIP distribution, in which both the overall and component zero-inflations are taken into account. Likelihood-based inference procedures including the calculation of maximum likelihood estimates of parameters in the model without and with covariates are provided. Simulation studies indicate that the performance of the proposed methods on the multivariate component ZIP model is satisfactory. The Australia health care utilisation data set is analysed to demonstrate that the new distribution is more appropriate than the existing multivariate ZIP distributions.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50145271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Short-term forecasting with a computationally efficient nonparametric transfer function model","authors":"Jun. M. Liu","doi":"10.1111/anzs.12394","DOIUrl":"https://doi.org/10.1111/anzs.12394","url":null,"abstract":"In this paper a semi-parametric approach is developed to model non-linear relationships in time series data using polynomial splines. Polynomial splines require very few assumptions about the functional form of the underlying relationship, so they are very flexible and can be used to model highly non-linear relationships. Polynomial splines are also computationally very efficient. The serial correlation in the data is accounted for by modelling the noise as an autoregressive integrated moving average (ARIMA) process; by doing so, the efficiency of nonparametric estimation is improved and correct inferences can be obtained. The explicit structure of the ARIMA model allows the correlation information to be used to improve forecasting performance. An algorithm is developed to automatically select and estimate the polynomial spline model and the ARIMA model through backfitting. This method is applied to a real-life data set to forecast hourly electricity usage. The non-linear effect of temperature on hourly electricity usage is allowed to differ across hours of the day and days of the week. The forecasting performance of the developed method is evaluated in post-sample forecasting and compared with several well-accepted models. The results show the performance of the proposed model is comparable with a long short-term memory deep learning model.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50114984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asymptotics of M-estimator in multivariate linear regression models for a class of random errors","authors":"Yi Wu, Wei Yu, Xuejun Wang","doi":"10.1111/anzs.12393","DOIUrl":"https://doi.org/10.1111/anzs.12393","url":null,"abstract":"It is known that linear regression models have immense applications in various areas such as engineering technology, economics and social sciences. In this paper, we investigate the asymptotic properties of the M-estimator in multivariate linear regression models based on a class of random errors satisfying a generalised Bernstein-type inequality. By using the generalised Bernstein-type inequality, we obtain a general result on almost sure convergence for a class of random variables and then obtain the strong consistency of the M-estimator in multivariate linear regression models under some mild conditions. The result extends or improves some existing ones in the literature. Moreover, we also consider the case when the dimension $p$ tends to infinity by establishing the rate of almost sure convergence for a class of random variables satisfying the generalised Bernstein-type inequality. Some numerical simulations are also provided to verify the validity of the theoretical results.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50148711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the selection of predictors by using greedy algorithms and information theoretic criteria","authors":"Fangyao Li, Christopher M. Triggs, Ciprian Doru Giurcăneanu","doi":"10.1111/anzs.12387","DOIUrl":"https://doi.org/10.1111/anzs.12387","url":null,"abstract":"We discuss the use of the following greedy algorithms in the prediction of multivariate time series: Matching Pursuit Algorithm (MPA), Orthogonal Matching Pursuit (OMP), Relaxed Matching Pursuit (RMP), Frank–Wolfe Algorithm (FWA) and Constrained Matching Pursuit (CMP). The last two are known to be solvers for the lasso problem. Some of the algorithms are well-known (e.g. OMP), while others are less popular (e.g. RMP). We provide a unified presentation of all the algorithms, and evaluate their computational complexity for the high-dimensional case and for the big data case. We show how 12 information theoretic (IT) criteria can be used jointly with the greedy algorithms. As part of this effort, we derive new theoretical results that allow the IT criteria to be modified so that they are compatible with RMP. The prediction capabilities are tested in experiments with two data sets. The first one involves air pollution data measured in Auckland (New Zealand) and the second one concerns the House Price Index in England (the United Kingdom).","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/anzs.12387","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50155532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual assessment of matrix-variate normality","authors":"Nikola Počuča, Michael P.B. Gallaugher, Katharine M. Clark, Paul D. McNicholas","doi":"10.1111/anzs.12388","DOIUrl":"https://doi.org/10.1111/anzs.12388","url":null,"abstract":"In recent years, the analysis of three-way data has become ever more prevalent in the literature. It is becoming increasingly common to analyse such data by means of matrix-variate distributions, the most prevalent of which is the matrix-variate normal distribution. Although many methods exist for assessing multivariate normality, there is a relative paucity of approaches for assessing matrix-variate normality. Herein, a new visual method is proposed for assessing matrix-variate normality by means of a distance–distance plot. In addition, a testing procedure is discussed to be used in tandem with the proposed visual method. The proposed approach is illustrated via simulated data as well as an application to the analysis of handwritten digits.","PeriodicalId":55428,"journal":{"name":"Australian & New Zealand Journal of Statistics","volume":null,"pages":null},"PeriodicalIF":1.1,"publicationDate":"2023-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50151748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}