{"title":"Table inference for combinatorial origin‐destination choices in agent‐based population synthesis","authors":"Ioannis Zachos, Theodoros Damoulas, Mark Girolami","doi":"10.1002/sta4.656","DOIUrl":"https://doi.org/10.1002/sta4.656","url":null,"abstract":"A key challenge in agent‐based mobility simulations is the synthesis of individual agent socioeconomic profiles. Such profiles include locations of agent activities, which dictate the quality of the simulated travel patterns. These locations are typically represented in origin‐destination matrices that are sampled using coarse travel surveys. This is because fine‐grained trip profiles are scarce and fragmented due to privacy and cost reasons. The discrepancy between data and sampling resolutions renders agent traits nonidentifiable due to the combinatorial space of data‐consistent individual attributes. This problem is pertinent to any agent‐based inference setting where the latent state is discrete. Existing approaches have used continuous relaxations of the underlying location assignments and subsequent ad hoc discretisation thereof. We propose a framework to efficiently navigate this space offering improved reconstruction and coverage as well as linear‐time sampling of the ground truth origin‐destination table. This allows us to avoid factorially growing rejection rates and poor summary statistic consistency inherent in discrete choice modelling. We achieve this by introducing joint sampling schemes for the continuous intensity and discrete table of agent trips, as well as Markov bases that can efficiently traverse this combinatorial space subject to summary statistic constraints. 
Our framework's benefits are demonstrated in multiple controlled experiments and a large‐scale application to agent work trip reconstruction in Cambridge, UK.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"105 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140056692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image registration for zooming: A statistically consistent local feature mapping approach","authors":"Sujay Das, Anik Roy, Partha Sarathi Mukherjee","doi":"10.1002/sta4.664","DOIUrl":"https://doi.org/10.1002/sta4.664","url":null,"abstract":"Image registration is a widely used tool for matching two images of the same scene with one another. In the literature, several image registration techniques are available to register rigid-body and non-rigid-body transformations. One such important transformation is zooming. There are very few feature-based methods that address this particular problem. These methods fail miserably when there are only a limited number of point features available in the image. This paper proposes a feature-based approach that works with a feature that is readily available in almost all images, for registering two images of the same image object where one is a zoomed-in version of the other. In the proposed method, we first detect the possible edge points which we consider as features in both the reference and the zoomed image. Then, we map these features of the reference and the zoomed image with one another and find the relationship between them using a mathematical model. Finally, we use the relationship to register the zoomed-in image. This method outperforms some of the state-of-the-art methods on many occasions. 
Several numerical examples and some statistical properties justify that this method works well in many applications.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"63 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140044477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"D-optimal designs for multi-response linear models with two groups","authors":"Xin Liu, Lei He, Rong-Xian Yue","doi":"10.1002/sta4.665","DOIUrl":"https://doi.org/10.1002/sta4.665","url":null,"abstract":"In recent years, multi-response linear models have gained significant popularity in various statistical applications. However, the design aspects of multi-response linear models with group-wise considerations have received limited attention in the literature. This paper aims to thoroughly investigate $D$-optimal designs for such models. Specifically, we focus on scenarios involving two groups, where the proportions of observations for each group can be arbitrarily selected or not. 
Two equivalence theorems are presented to elaborate the characterization of $D$-optimal designs. Additionally, we delve into the admissibility of approximate designs and establish necessary conditions for a design to be deemed admissible. Several illustrative examples are addressed to demonstrate the application of the derived theoretical results.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"9 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140044474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Asymptotic behaviour of a non‐autonomous multispecies Holling type II model with a complex type of noises","authors":"Libai Xu, Xintong Ma, Yanyan Zhao","doi":"10.1002/sta4.667","DOIUrl":"https://doi.org/10.1002/sta4.667","url":null,"abstract":"The deterministic non‐autonomous multispecies Holling type II model and its stochastic version with a simple type of noise have been proposed to infer multispecies community structure. However, these models fail to account for complex types of noises, which may render the model overly simplistic. In this paper, a non‐autonomous multispecies Holling type II model with a complex type of noise has been proposed. We establish sufficient conditions for various mathematical properties of the solutions, including existence and uniqueness, stochastic permanence and extinction. Additionally, numerical simulation studies are provided to illustrate our theoretical findings.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"42 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140044633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What matters to graduate students? Experiences at a statistical consulting center from pre‐ to post‐COVID‐19 pandemic","authors":"Marianne Huebner, Steven J. Pierce, Andrew J. Dennhardt, Hope Akaeze, Nicole Jess, Wenjuan Ma","doi":"10.1002/sta4.659","DOIUrl":"https://doi.org/10.1002/sta4.659","url":null,"abstract":"The COVID‐19 pandemic led to unprecedented changes in all levels of society, including the statistical consulting field. This paper focuses on the experiences of graduate student consultants and clients at our statistical consulting center (SCC) that operates all year independent of semesters. During the lockdown period, work continued without interruption and was conducted remotely, but there was a temporary reduction in utilization. Advice on statistical methods, help with data analysis and educational offerings are the main appeals to utilize SCC services. We describe our mentoring approach for graduate student research assistants (RAs) and how pandemic changes affected RAs and clients. Based on experiences during the pandemic, we offer practical suggestions for SCCs' approaches to research support, work characteristics and collaborations to improve the experiences of graduate students, both as consultants and clients. Most collaboration meetings are now virtual by request from clients. Telecommuting supports flexible personal schedules and needs. Online educational offerings provide easier access for participants and more opportunities for a wider range of topics and presenters. 
However, mentoring sessions for RAs are best conducted in‐person, and every effort should be made to encourage in‐person interactions and collaborations between staff members to advance the effectiveness of post‐pandemic SCCs.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"3 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140036512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Highly private large‐sample tests for contingency tables","authors":"Sungkyu Jung, Seung Woo Kwak","doi":"10.1002/sta4.658","DOIUrl":"https://doi.org/10.1002/sta4.658","url":null,"abstract":"Differential privacy is a foundational concept for safeguarding sensitive individual information when releasing data or statistical analysis results. In this study, we concentrate on the protection of privacy in the context of goodness‐of‐fit (GOF) and independence tests, utilizing perturbed contingency tables that adhere to Gaussian differential privacy within the high‐privacy regime, where the degrees of privacy protection increase as the sample size increases. We introduce private test procedures for GOF, independence of two variables and the equality of proportions in paired samples, similar to McNemar's test. For each of these hypothesis testing situations, we propose private test statistics based on the statistics and establish their asymptotic null distributions. We numerically confirm that Type I error rates of the proposed private test procedures are well controlled and have adequate power for larger sample sizes and effect sizes. The proposal is demonstrated in private inferences based on the American Time Use Survey data.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"109 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140009506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine collaboration","authors":"Qingfeng Liu, Yang Feng","doi":"10.1002/sta4.661","DOIUrl":"https://doi.org/10.1002/sta4.661","url":null,"abstract":"We propose a new ensemble framework for supervised learning, called <i>machine collaboration</i> (MaC), using a collection of possibly heterogeneous base learning methods (hereafter, base machines) for prediction tasks. Unlike bagging/stacking (a parallel and independent framework) and boosting (a sequential and top-down framework), MaC is a type of <i>circular</i> and <i>recursive</i> learning framework. The <i>circular</i> and <i>recursive</i> nature helps the base machines to transfer information circularly and update their structures and parameters accordingly. The theoretical result on the risk bound of the estimator from MaC reveals that the <i>circular</i> and <i>recursive</i> feature can help MaC reduce risk via a parsimonious ensemble. We conduct extensive experiments on MaC using both simulated data and 119 benchmark real datasets. The results demonstrate that in most cases, MaC performs significantly better than several other state-of-the-art methods, including classification and regression trees, neural networks, stacking, and boosting.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"166 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140008952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linear mixed models for complex survey data: Implementing and evaluating pairwise likelihood","authors":"Thomas Lumley, Xudong Huang","doi":"10.1002/sta4.657","DOIUrl":"https://doi.org/10.1002/sta4.657","url":null,"abstract":"As complex-survey data become more widely used in health and social science research, there is increasing interest in fitting a wider range of regression models. We describe an implementation of two-level linear mixed models in R using the pairwise composite likelihood approach of Rao and co-workers. We discuss the computational efficiency of pairwise composite likelihood and compare the estimator to the existing sequential pseudolikelihood estimator in simulations and in data from the Programme for International Student Assessment (PISA) educational survey.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"35 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139981455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A note about why deep learning is deep: A discontinuous approximation perspective","authors":"Yongxin Li, Haobo Qi, Hansheng Wang","doi":"10.1002/sta4.654","DOIUrl":"https://doi.org/10.1002/sta4.654","url":null,"abstract":"Deep learning has achieved unprecedented success in recent years. This approach essentially uses the composition of nonlinear functions to model the complex relationship between input features and output labels. However, a comprehensive theoretical understanding of why the hierarchical layered structure can exhibit superior expressive power is still lacking. In this paper, we provide an explanation for this phenomenon by measuring the approximation efficiency of neural networks with respect to discontinuous target functions. We focus on deep neural networks with rectified linear unit (ReLU) activation functions. We find that to achieve the same degree of approximation accuracy, the number of neurons required by a single‐hidden‐layer (SHL) network is exponentially greater than that required by a multi‐hidden‐layer (MHL) network. In practice, discontinuous points tend to contain highly valuable information (i.e., edges in image classification). We argue that this may be a very important reason accounting for the impressive performance of deep neural networks. We validate our theory in extensive experiments.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"30 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139948722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reproducible research practices: A tool for effective and efficient leadership in collaborative statistics","authors":"Camille J. Hochheimer, Grace N. Bosma, Lauren Gunn-Sandell, Mary D. Sammel","doi":"10.1002/sta4.653","DOIUrl":"https://doi.org/10.1002/sta4.653","url":null,"abstract":"With data and code sharing policies more common and version control more widely used in statistics, standards for reproducible research are higher than ever. Reproducible research practices must keep up with the fast pace of research. To do so, we propose combining modern practices of leadership with best practices for reproducible research in collaborative statistics as an effective tool for ensuring quality and accuracy while developing stewardship and autonomy in the people we lead. First, we establish a framework for expectations of reproducible statistical research. Then, we introduce Stephen M.R. Covey's theory of trusting and inspiring leadership. These two are combined as we show how stewardship agreements can be used to make reproducible coding a team norm. We provide an illustrative code example and highlight how this method creates a more collaborative rather than evaluative culture where team members hold themselves accountable. The goal of this manuscript is for statisticians to find this application of leadership theory useful and to inspire them to intentionally develop their personal approach to leadership.","PeriodicalId":56159,"journal":{"name":"Stat","volume":"3 1","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139756629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}