Title: Data depth functions for non-standard data by use of formal concept analysis
Authors: Hannah Blocher, Georg Schollmeyer
Journal: Journal of Multivariate Analysis (impact factor 1.4; JCR Q2, Statistics & Probability)
Publication type: Journal Article
Publication date: 2024-09-19
DOI: 10.1016/j.jmva.2024.105372
URL: https://www.sciencedirect.com/science/article/pii/S0047259X24000794
Citation count: 0
Abstract
In this article we introduce a notion of depth functions for data types that are not given in standard statistical data formats. We focus on data that cannot be represented by one specific data structure, such as normed vector spaces. This covers a wide range of different data types, which we refer to as non-standard data. Depth functions have been studied intensively for normed vector spaces. However, a discussion of depth functions for non-standard data is lacking. In this article, we address this gap by using formal concept analysis to obtain a unified data representation. Building on this representation, we then define depth functions for non-standard data. Furthermore, we provide a systematic basis by introducing structural properties using the data representation provided by formal concept analysis. Finally, we embed the generalised Tukey depth into our concept of data depth and analyse it using the introduced structural properties. Thus, this article presents the mathematical formalisation of centrality and outlyingness for non-standard data and increases the number of spaces in which centrality can be discussed. In particular, we provide a basis for defining further depth functions and statistical inference methods for non-standard data.
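To make the idea of a depth function on a formal context concrete, here is a minimal sketch. The abstract does not state a formula, so the code assumes one common formulation of the generalised Tukey depth: the depth of an object is one minus the largest empirical share of objects having any single attribute that the object itself lacks. The toy context and the names `context`, `attribute_extent_share`, and `generalised_tukey_depth` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical toy formal context: each object is mapped to the set of
# attributes it has (the incidence relation of the context).
context = {
    "a": {"m1", "m2"},
    "b": {"m1", "m2", "m3"},
    "c": {"m1"},
    "d": {"m3"},
}


def attribute_extent_share(context, attribute):
    """Empirical share of objects that have the given attribute."""
    count = sum(1 for attrs in context.values() if attribute in attrs)
    return Fraction(count, len(context))


def generalised_tukey_depth(context, obj):
    """Assumed formulation: 1 minus the largest share among the
    attributes that obj does NOT have (depth 1 if it has them all)."""
    all_attributes = set().union(*context.values())
    missing = all_attributes - context[obj]
    if not missing:
        return Fraction(1)
    return 1 - max(attribute_extent_share(context, m) for m in missing)


# Depth of every object with respect to the empirical distribution.
depths = {g: generalised_tukey_depth(context, g) for g in context}
```

In this toy context, object `b` has every attribute and therefore receives the maximal depth, while `d` lacks the most widespread attribute `m1` and ends up the most outlying. Exact fractions are used so that depth values can be compared without floating-point noise.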
About the journal:
Founded in 1971, the Journal of Multivariate Analysis (JMVA) is the central venue for the publication of new, relevant methodology and particularly innovative applications pertaining to the analysis and interpretation of multidimensional data.
The journal welcomes contributions to all aspects of multivariate data analysis and modeling, including cluster analysis, discriminant analysis, factor analysis, and multidimensional continuous or discrete distribution theory. Topics of current interest include, but are not limited to, inferential aspects of:
Copula modeling
Functional data analysis
Graphical modeling
High-dimensional data analysis
Image analysis
Multivariate extreme-value theory
Sparse modeling
Spatial statistics.