A Dirichlet process model for directional-linear data with application to bloodstain pattern analysis
Authors: Tong Zou, Hal S. Stern
Journal: Computational Statistics & Data Analysis, volume 204, Article 108093
Published: 2024-11-12
DOI: 10.1016/j.csda.2024.108093
URL: https://www.sciencedirect.com/science/article/pii/S0167947324001774
Journal metrics: impact factor 1.5; JCR Q3 (Computer Science, Interdisciplinary Applications)
Directional data require specialized models because of the non-Euclidean nature of their domain. When a directional variable is observed jointly with linear variables, modeling their dependence adds a further layer of complexity. A Bayesian nonparametric approach is introduced to analyze directional-linear data. First, the projected normal distribution is extended to model the joint distribution of linear variables and a directional variable with arbitrary dimension projected from a higher-dimensional augmented multivariate normal distribution. The new distribution, called the semi-projected normal (SPN) distribution, can be used as the mixture distribution in a Dirichlet process model to obtain a more flexible class of models for directional-linear data. Second, a conditional inverse-Wishart distribution is proposed as part of the prior distribution to address an identifiability issue inherited from the projected normal and to preserve conjugacy with the SPN. On synthetic data, the SPN mixture model shows superior clustering performance compared to the semi-wrapped Gaussian model. Experiments also show the ability of the SPN mixture model to characterize bloodstain patterns. Finally, a hierarchical Dirichlet process model with the SPN distribution is built to estimate the likelihood of bloodstain patterns under a posited causal mechanism, for use in a likelihood ratio approach to the analysis of forensic bloodstain pattern evidence.
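The projection construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a semi-projected normal draw is obtained by sampling an augmented multivariate normal vector, normalizing the first block of coordinates to unit length to give the directional part, and keeping the remaining coordinates as the linear part. The function name, the argument names, and the ordering of the blocks are hypothetical choices made for this illustration.

```python
import numpy as np


def sample_semi_projected_normal(mean, cov, n_dir, size, rng=None):
    """Illustrative sampler (assumed construction, not the paper's code).

    The first `n_dir` coordinates of the augmented normal vector are
    projected onto the unit sphere; the remaining coordinates are
    returned unchanged as the linear variables.
    """
    rng = np.random.default_rng(rng)
    # Augmented draws, shape (size, n_dir + n_lin).
    z = rng.multivariate_normal(mean, cov, size=size)
    latent_dir, linear = z[:, :n_dir], z[:, n_dir:]
    # Radial projection: divide each latent vector by its Euclidean norm.
    radius = np.linalg.norm(latent_dir, axis=1, keepdims=True)
    direction = latent_dir / radius
    return direction, linear


# Hypothetical parameters: one circular variable (2 latent coordinates)
# observed jointly with one linear variable.
mean = np.array([1.0, 0.5, 2.0])
cov = np.array([
    [1.0, 0.2, 0.3],
    [0.2, 1.0, 0.1],
    [0.3, 0.1, 0.5],
])
direction, linear = sample_semi_projected_normal(mean, cov, n_dir=2, size=1000, rng=0)
angles = np.arctan2(direction[:, 1], direction[:, 0])  # angles in (-pi, pi]
```

Rescaling the latent directional coordinates by any positive constant leaves `direction` unchanged, which is the usual source of non-identifiability in the projected normal and presumably the issue the conditional inverse-Wishart prior is meant to address.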
About the journal:
Computational Statistics and Data Analysis (CSDA), an Official Publication of the network Computational and Methodological Statistics (CMStatistics) and of the International Association for Statistical Computing (IASC), is an international journal dedicated to the dissemination of methodological research and applications in the areas of computational statistics and data analysis. The journal consists of four refereed sections which are divided into the following subject areas:
I) Computational Statistics - Manuscripts dealing with: 1) the explicit impact of computers on statistical methodology (e.g., Bayesian computing, bioinformatics, computer graphics, computer intensive inferential methods, data exploration, data mining, expert systems, heuristics, knowledge based systems, machine learning, neural networks, numerical and optimization methods, parallel computing, statistical databases, statistical systems), and 2) the development, evaluation and validation of statistical software and algorithms. Software and algorithms can be submitted with manuscripts and will be stored together with the online article.
II) Statistical Methodology for Data Analysis - Manuscripts dealing with novel and original data analytical strategies and methodologies applied in biostatistics (design and analytic methods for clinical trials, epidemiological studies, statistical genetics, or genetic/environmental interactions), chemometrics, classification, data exploration, density estimation, design of experiments, environmetrics, education, image analysis, marketing, model free data exploration, pattern recognition, psychometrics, statistical physics, image processing, robust procedures.
[...]
III) Special Applications - [...]
IV) Annals of Statistical Data Science [...]