Owen G. Ward, Jing Wu, Tian Zheng, Anna L. Smith, James P. Curley
{"title":"Network Hawkes process models for exploring latent hierarchy in social animal interactions","authors":"Owen G. Ward, Jing Wu, Tian Zheng, Anna L. Smith, James P. Curley","doi":"10.1111/rssc.12581","DOIUrl":"10.1111/rssc.12581","url":null,"abstract":"<p>Group-based social dominance hierarchies are of essential interest in understanding social structure (DeDeo & Hobson, Proceedings of the National Academy of Sciences 118(21), 2021). Recent animal behaviour research studies can record aggressive interactions observed over time. Models that can explore the underlying hierarchy from the observed temporal dynamics in behaviours are therefore crucial. Traditional ranking methods aggregate interactions across time into win/loss counts, equating dynamic interactions with the underlying hierarchy. Although these models have gleaned important behavioural insights from such data, they are limited in addressing many important questions that remain unresolved. In this paper, we take advantage of the observed interactions' timestamps, proposing a series of network point process models with latent ranks. We carefully design these models to incorporate important theories on animal behaviour that account for dynamic patterns observed in the interaction data, including the winner effect, bursting and pair-flip phenomena. Through iteratively constructing and evaluating these models we arrive at the final cohort Markov-modulated Hawkes process (C-MMHP), which best characterizes all aforementioned patterns observed in interaction data. As such, inference on our model components can be readily interpreted in terms of theories on animal behaviours. The probabilistic nature of our model allows us to estimate the uncertainty in our ranking. In particular, our model is able to provide insights into the distribution of power within the hierarchy which forms and the strength of the established hierarchy. We compare all models using simulated and real data. 
Using statistically developed diagnostic perspectives, we demonstrate that the C-MMHP model outperforms other methods, capturing relevant latent ranking structures that lead to meaningful predictions for real data.</p>","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82071302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marco Riani, Anthony C. Atkinson, Francesca Torti, Aldo Corbellini
{"title":"Robust correspondence analysis","authors":"Marco Riani, Anthony C. Atkinson, Francesca Torti, Aldo Corbellini","doi":"10.1111/rssc.12580","DOIUrl":"10.1111/rssc.12580","url":null,"abstract":"<p>Correspondence analysis is a method for the visual display of information from two-way contingency tables. We introduce a robust form of correspondence analysis based on minimum covariance determinant estimation. This leads to the systematic deletion of outlying rows of the table and to plots of greatly increased informativeness. Our examples are trade flows of clothes and consumer evaluations of the perceived properties of cars. The robust method requires that a specified proportion of the data be used in fitting. To accommodate this requirement we provide an algorithm that uses a subset of complete rows and one row partially, both sets of rows being chosen robustly. We prove the convergence of this algorithm.</p>","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/rssc.12580","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82808130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatiotemporal ETAS model with a renewal main-shock arrival process","authors":"Tom Stindl, Feng Chen","doi":"10.1111/rssc.12579","DOIUrl":"10.1111/rssc.12579","url":null,"abstract":"<p>We propose a spatiotemporal point process model that enhances the classical Epidemic-Type Aftershock Sequence (ETAS) model. This is achieved with the introduction of a renewal main-shock arrival process and we call this extension the renewal ETAS (RETAS) model. This modification is similar in spirit to the renewal Hawkes (RHawkes) process but the conditional intensity process supports a spatial component. It empowers the main-shock intensity to reset upon the arrival of main-shocks. This allows for heavier clustering of main-shocks than the classical spatiotemporal ETAS model. We introduce a likelihood evaluation algorithm for parameter estimation and provide a novel procedure to evaluate the fitted model's goodness-of-fit (GOF) based on a sequential application of the Rosenblatt transformation. A simulation algorithm for the RETAS model is outlined and used to validate the numerical performance of the likelihood evaluation algorithm and GOF test procedure. We illustrate the proposed model and methods on various earthquake catalogues around the world each with distinctly different seismic activity. 
These catalogues demonstrate the RETAS model's additional flexibility in comparison to the classical spatiotemporal ETAS model and emphasize the potential for superior modelling and forecasting of seismicity.</p>","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/rssc.12579","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79644802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Specification analysis for technology use and teenager well-being: Statistical validity and a Bayesian proposal","authors":"Christoph Semken, David Rossell","doi":"10.1111/rssc.12578","DOIUrl":"10.1111/rssc.12578","url":null,"abstract":"A key issue in science is assessing robustness to data analysis choices, while avoiding selective reporting and providing valid inference. Specification Curve Analysis is a tool intended to prevent selective reporting. Alas, when used for inference it can create severe biases and false positives, due to wrongly adjusting for covariates, and mask important treatment effect heterogeneity. As our motivating application, it led an influential study to conclude there is no relevant association between technology use and teenager mental well‐being. We discuss these issues and propose a strategy for valid inference. Bayesian Specification Curve Analysis (BSCA) uses Bayesian Model Averaging to incorporate covariates and heterogeneous effects across treatments, outcomes and subpopulations. BSCA gives significantly different insights into teenager well‐being, revealing that the association with technology differs by device, gender and who assesses well‐being (teenagers or their parents).","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/rssc.12578","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83476610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Stival, M. Bernardi, Manuela Cattelan, P. Dellaportas
{"title":"Missing data patterns in runners’ careers: do they matter?","authors":"M. Stival, M. Bernardi, Manuela Cattelan, P. Dellaportas","doi":"10.1093/jrsssc/qlad009","DOIUrl":"https://doi.org/10.1093/jrsssc/qlad009","url":null,"abstract":"Predicting the future performance of young runners is an important research issue in experimental sports science and performance analysis. We analyse a dataset with annual seasonal best performances of male middle distance runners for a period of 14 years and provide a modelling framework that accounts for both the fact that each runner has typically run in 3 distance events (800, 1,500, and 5,000 m) and the presence of periods of no running activities. We propose a latent class matrix-variate state space model and we empirically demonstrate that accounting for missing data patterns in runners’ careers improves the out of sample prediction of their performances over time. In particular, we demonstrate that for this analysis, the missing data patterns provide valuable information for the prediction of runners’ performance.","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80455942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heterogeneous graphical model for non-negative and non-Gaussian PM2.5 data","authors":"Jiaqi Zhang, Xinyan Fan, Yang Li, Shuangge Ma","doi":"10.1111/rssc.12575","DOIUrl":"10.1111/rssc.12575","url":null,"abstract":"<p>Studies on the conditional relationships between <math><mrow><msub><mtext>PM</mtext><mn>2.5</mn></msub></mrow></math> concentrations among different regions are of great interest for the joint prevention and control of air pollution. Because of seasonal changes in atmospheric conditions, spatial patterns of <math><mrow><msub><mtext>PM</mtext><mn>2.5</mn></msub></mrow></math> may differ throughout the year. Additionally, concentration data are both non-negative and non-Gaussian. These data features pose significant challenges to existing methods. This study proposes a heterogeneous graphical model for non-negative and non-Gaussian data via the score matching loss. The proposed method simultaneously clusters multiple datasets and estimates a graph for variables with complex properties in each cluster. Furthermore, our model involves a network that indicates similarity among datasets, and this network can have additional applications. In simulation studies, the proposed method outperforms competing alternatives in both clustering and edge identification. We also analyse the spatial correlations of <math><mrow><msub><mtext>PM</mtext><mn>2.5</mn></msub></mrow></math> concentrations in Taiwan's regions using data obtained in 2019 from 67 air-quality monitoring stations. The 12 months are clustered into four groups: January–March, April, May–September and October–December, and the corresponding graphs have 153, 57, 86 and 167 edges respectively. The results show obvious seasonality, which is consistent with the meteorological literature. 
Geographically, the <math><mrow><msub><mtext>PM</mtext><mn>2.5</mn></msub></mrow></math> concentrations are more strongly correlated within northern Taiwan and within southern Taiwan. These results can provide valuable information for developing joint air-quality control strategies.</p>","PeriodicalId":49981,"journal":{"name":"Journal of the Royal Statistical Society Series C-Applied Statistics","volume":null,"pages":null},"PeriodicalIF":1.6,"publicationDate":"2022-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82229817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}