Advances in Data Analysis and Classification: Latest Articles

A new model for counterfactual analysis for functional data
Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-25 DOI: 10.1007/s11634-023-00563-5
Emilio Carrizosa, Jasone Ramírez-Ayerbe, Dolores Romero Morales
Abstract: Counterfactual explanations have become a very popular interpretability tool to understand and explain how complex machine learning models make decisions for individual instances. Most of the research on counterfactual explainability focuses on tabular and image data and much less on models dealing with functional data. In this paper, a counterfactual analysis for functional data is addressed, in which the goal is to identify the samples of the dataset of which the counterfactual explanation is made, as well as how they are combined so that the individual instance and its counterfactual are as close as possible. Our methodology can be used with different distance measures for multivariate functional data and is applicable to any score-based classifier. We illustrate our methodology using two different real-world datasets, one univariate and another multivariate.
Citations: 1
Profile-based latent class distance association analyses for sparse tables: application to the attitude of European citizens towards sustainable tourism
Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-18 DOI: 10.1007/s11634-023-00559-1
Francesca Bassi, José Fernando Vera, Juan Antonio Marmolejo Martín
Abstract: Social and behavioural sciences often deal with the analysis of associations for cross-classified data. This paper focuses on the study of the patterns observed in European citizens regarding their attitude towards sustainable tourism, specifically their willingness to change travel and tourism habits to be more sustainable. The data record the intention to comply with nine sustainable actions; answers to these questions generated individual profiles; moreover, the European country to which each respondent belongs is reported. Therefore, unlike a variable-oriented approach, here we are interested in a person-oriented approach through profiles. Some traditional methods are limited in their performance when using profiles, for example, by sparseness of the contingency table. We removed many of these limitations by using a latent class distance association model, clustering the row profiles into classes and representing these together with the categories of the response variable in a low-dimensional space. We showed, furthermore, that an easy interpretation of associations between clusters' centres and categories of a response variable can be incorporated in this framework in an intuitive way using unfolding. Results of the analyses showed that the citizens most committed to environmentally friendly behaviour live in Sweden and Romania; citizens less willing to change their habits towards more sustainable behaviour live in Belgium, Cyprus, France, Lithuania and the Netherlands. Citizens' preparedness to change habits, however, also depends on their socio-demographic characteristics such as gender, age, occupation, type of community where they live, household size, and the frequency of travelling before the Covid-19 pandemic.
Citations: 0
Editorial for ADAC issue 4 of volume 17 (2023)
IF 1.6, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-14 DOI: 10.1007/s11634-023-00564-4
Maurizio Vichi, Andrea Cerioli, Hans A. Kestler, Akinori Okada, Claus Weihs
Citations: 0
Discovering interpretable structure in longitudinal predictors via coefficient trees
Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-11 DOI: 10.1007/s11634-023-00562-6
Özge Sürer, Daniel W. Apley, Edward C. Malthouse
Citations: 0
Generalized Cramér’s coefficient via f-divergence for contingency tables
Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-05 DOI: 10.1007/s11634-023-00560-8
Wataru Urasaki, Tomoyuki Nakagawa, Tomotaka Momozaki, Sadao Tomizawa
Abstract: Various measures in two-way contingency table analysis have been proposed to express the strength of association between row and column variables in contingency tables. Tomizawa et al. (2004) proposed more general measures, including Cramér’s coefficient, using the power-divergence. In this paper, we propose measures using the f-divergence, which forms a wider class than the power-divergence. Unlike statistical hypothesis tests, these measures provide quantification of the association structure in contingency tables. The contribution of our study is proving that a measure applying a function that satisfies the condition of the f-divergence has desirable properties for measuring the strength of association in contingency tables. With this contribution, we can easily construct a new measure using a divergence that has essential properties for the analyst. For example, we conducted numerical experiments with a measure applying the $\theta$-divergence. Furthermore, we can give further interpretation of the association between the row and column variables in the contingency table, which could not be obtained with the conventional one. We also show a relationship between our proposed measures and the correlation coefficient in a bivariate normal distribution of latent variables in the contingency tables.
Citations: 0
Mixture modeling with normalizing flows for spherical density estimation
IF 1.4, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-10-04 DOI: 10.1007/s11634-023-00561-7
Tin Lok James Ng, Andrew Zammit-Mangion
Abstract: Normalizing flows are objects used for modeling complicated probability density functions, and have attracted considerable interest in recent years. Many flexible families of normalizing flows have been developed. However, the focus to date has largely been on normalizing flows on Euclidean domains; while normalizing flows have been developed for spherical and other non-Euclidean domains, these are generally less flexible than their Euclidean counterparts. To address this shortcoming, in this work we introduce a mixture-of-normalizing-flows model to construct complicated probability density functions on the sphere. This model provides a flexible alternative to existing parametric, semiparametric, and nonparametric, finite mixture models. Model estimation is performed using the expectation maximization algorithm and a variant thereof. The model is applied to simulated data, where the benefit over the conventional (single component) normalizing flow is verified. The model is then applied to two real-world data sets of events occurring on the surface of Earth; the first relating to earthquakes, and the second to terrorist activity. In both cases, we see that the mixture-of-normalizing-flows model yields a good representation of the density of event occurrence.
Citations: 0
Parsimony and parameter estimation for mixtures of multivariate leptokurtic-normal distributions
IF 1.4, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-09-27 DOI: 10.1007/s11634-023-00558-2
Ryan P. Browne, Luca Bagnato, Antonio Punzo
Abstract: Mixtures of multivariate leptokurtic-normal distributions have been recently introduced in the clustering literature based on mixtures of elliptical heavy-tailed distributions. They have the advantage of having parameters directly related to the moments of practical interest. We derive two estimation procedures for these mixtures. The first one is based on the majorization-minimization algorithm, while the second is based on a fixed point approximation. Moreover, we introduce parsimonious forms of the considered mixtures and we use the illustrated estimation procedures to fit them. We use simulated and real data sets to investigate various aspects of the proposed models and algorithms.
Open access PDF: https://link.springer.com/content/pdf/10.1007/s11634-023-00558-2.pdf
Citations: 0
Theory of angular depth for classification of directional data
IF 1.4, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-09-23 DOI: 10.1007/s11634-023-00557-3
Stanislav Nagy, Houyem Demni, Davide Buttarazzi, Giovanni C. Porzio
Abstract: Depth functions offer an array of tools that enable the introduction of quantile- and ranking-like approaches to multivariate and non-Euclidean datasets. We investigate the potential of using depths in the problem of nonparametric supervised classification of directional data, that is, classification of data that naturally live on the unit sphere of a Euclidean space. In this paper, we address the problem mainly from a theoretical side, with the final goal of offering guidelines on which angular depth function should be adopted in classifying directional data. A set of desirable properties of an angular depth is put forward. With respect to these properties, we compare and contrast the most widely used angular depth functions. Simulated and real data are eventually exploited to showcase the main implications of the discussed theoretical results, with an emphasis on potentials and limits of the often disregarded angular halfspace depth.
Citations: 0
Co-clustering contaminated data: a robust model-based approach
IF 1.4, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-09-22 DOI: 10.1007/s11634-023-00549-3
Edoardo Fibbi, Domenico Perrotta, Francesca Torti, Stefan Van Aelst, Tim Verdonck
Abstract: The exploration and analysis of large high-dimensional data sets calls for well-thought-out techniques to extract the salient information from the data, such as co-clustering. Latent block models cast co-clustering in a probabilistic framework that extends finite mixture models to the two-way setting. Real-world data sets often contain anomalies which could be of interest per se and may make the results provided by standard, non-robust procedures unreliable. Estimation of latent block models can also be heavily affected by contaminated data. We propose an algorithm to compute robust estimates for latent block models. Experiments on both simulated and real data show that our method is able to resist high levels of contamination and can provide additional insight into the data by highlighting possible anomalies.
Open access PDF: https://link.springer.com/content/pdf/10.1007/s11634-023-00549-3.pdf
Citations: 0
Contamination transformation matrix mixture modeling for skewed data groups with heavy tails and scatter
IF 1.4, Zone 4, Computer Science
Advances in Data Analysis and Classification Pub Date : 2023-09-13 DOI: 10.1007/s11634-023-00550-w
Xuwen Zhu, Yana Melnykov, Angelina S. Kolomoytseva
Abstract: Model-based clustering is a popular application of the rapidly developing area of finite mixture modeling. While there is ample work focusing on clustering multivariate data, an increasing number of advancements have been aiming at the expansion of existing theory to the matrix-variate framework. Matrix-variate Gaussian mixtures are most popular in this setting despite the potential misfit for skewed and heavy-tailed data. To overcome this lack of flexibility, a new contaminated transformation matrix mixture model is proposed. We illustrate its utility in a series of experiments on simulated data and apply it to a real-life data set containing COVID-related information. The performance of the developed model is promising in all considered settings.
Citations: 0