Advances in Data Analysis and Classification: Latest Articles, Page 7

Contamination transformation matrix mixture modeling for skewed data groups with heavy tails and scatter

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-09-13 DOI: 10.1007/s11634-023-00550-w

Xuwen Zhu, Yana Melnykov, Angelina S. Kolomoytseva

Citations: 0

An analytic strategy for data processing of multimode networks

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-08-29 DOI: 10.1007/s11634-023-00556-4

Vincenzo Giuseppe Genova, Giuseppe Giordano, Giancarlo Ragozini, Maria Prosperina Vitale

Abstract: Complex network data structures are considered to capture the richness of social phenomena and real-life data settings. Multipartite networks are an example in which various scenarios are represented by different types of relations, actors, or modes. Within this context, the present contribution aims at discussing an analytic strategy for simplifying multipartite networks in which different sets of nodes are linked. By considering the connection of multimode networks and hypergraphs as theoretical concepts, a three-step procedure is introduced to simplify, normalize, and filter network data structures. Thus, a model-based approach is introduced for derived bipartite weighted networks in order to extract statistically significant links. The usefulness of the strategy is demonstrated in handling two application fields, that is, intranational student mobility in higher education and research collaboration in European framework programs. Finally, both examples are explored using community detection algorithms to determine the presence of groups by mixing up different modes.

Volume 18, issue 3, pages 745-767. PDF: https://link.springer.com/content/pdf/10.1007/s11634-023-00556-4.pdf

Citations: 0

Robust gradient boosting for generalized additive models for location, scale and shape

IF 1.6, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-08-26 DOI: 10.1007/s11634-023-00555-5

Jan Speller, C. Staerk, Francisco Gude, A. Mayr

Citations: 0

Editorial for ADAC issue 3 of volume 17 (2023)

IF 1.6, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-08-03 DOI: 10.1007/s11634-023-00554-6

Maurizio Vichi, Andrea Cerioli, Hans A. Kestler, Akinori Okada, Claus Weihs

Citations: 0

On the efficient implementation of classification rule learning

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-07-27 DOI: 10.1007/s11634-023-00553-7

Michael Rapp, Johannes Fürnkranz, Eyke Hüllermeier

Citations: 0

Model-based clustering using a new multivariate skew distribution

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-07-22 DOI: 10.1007/s11634-023-00552-8

Salvatore D. Tomarchio, Luca Bagnato, Antonio Punzo

Citations: 0

A topological data analysis based classifier

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-07-01 DOI: 10.1007/s11634-023-00548-4

Rolando Kindelan, José Frías, Mauricio Cerda, Nancy Hitschfeld

Abstract: Topological Data Analysis (TDA) is an emerging field that aims to discover a dataset’s underlying topological information. TDA tools have been commonly used to create filters and topological descriptors to improve Machine Learning (ML) methods. This paper proposes a different TDA pipeline to classify balanced and imbalanced multi-class datasets without additional ML methods. Our proposed method was designed to solve multi-class and imbalanced classification problems with no data resampling preprocessing stage. The proposed TDA-based classifier (TDABC) builds a filtered simplicial complex on the dataset representing high-order data relationships. Following the assumption that a meaningful sub-complex exists in the filtration that approximates the data topology, we apply Persistent Homology (PH) to guide the selection of that sub-complex by considering detected topological features. We use each unlabeled point’s link and star operators to provide different-sized and multi-dimensional neighborhoods to propagate labels from labeled to unlabeled points. The labeling function depends on the filtration’s entire history of the filtered simplicial complex and it is encoded within the persistence diagrams at various dimensions. We select eight datasets with different dimensions, degrees of class overlap, and imbalanced samples per class to validate our method. The TDABC outperforms all baseline methods classifying multi-class imbalanced data with high imbalanced ratios and data with overlapped classes. Also, on average, the proposed method was better than K Nearest Neighbors (KNN) and weighted KNN and behaved competitively with Support Vector Machine and Random Forest baseline classifiers in balanced datasets.

Volume 18, issue 2, pages 493-538.

Citations: 0

A link function specification test in the single functional index model

IF 1.6, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-06-22 DOI: 10.1007/s11634-023-00545-7

Lax Chan, L. Delsol, A. Goia

Citations: 1

MLE for the parameters of bivariate interval-valued model

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-06-18 DOI: 10.1007/s11634-023-00546-6

S. Yaser Samadi, L. Billard, Jiin-Huarng Guo, Wei Xu

Citations: 0

Multivariate count time series segmentation with “sums and shares” and Poisson lognormal mixture models: a comparative study using pedestrian flows within a multimodal transport hub

IF 1.4, Zone 4 (Computer Science)

Advances in Data Analysis and Classification Pub Date : 2023-05-29 DOI: 10.1007/s11634-023-00543-9

Paul de Nailly, Etienne Côme, Latifa Oukhellou, Allou Samé, Jacques Ferriere, Yasmine Merad-Boudia

Abstract: This paper deals with a clustering approach based on mixture models to analyze multidimensional mobility count time-series data within a multimodal transport hub. These time series are very likely to evolve depending on various periods characterized by strikes, maintenance works, or health measures against the Covid19 pandemic. In addition, exogenous one-off factors, such as concerts and transport disruptions, can also impact mobility. Our approach flexibly detects time segments within which the very noisy count data is synthesized into regular spatio-temporal mobility profiles. At the upper level of the modeling, evolving mixing weights are designed to detect segments properly. At the lower level, segment-specific count regression models take into account correlations between series and overdispersion as well as the impact of exogenous factors. For this purpose, we set up and compare two promising strategies that can address this issue, namely the “sums and shares” and “Poisson log-normal” models. The proposed methodologies are applied to actual data collected within a multimodal transport hub in the Paris region. Ticketing logs and pedestrian counts provided by stereo cameras are considered here. Experiments are carried out to show the ability of the statistical models to highlight mobility patterns within the transport hub. One model is chosen based on its ability to detect the most continuous segments possible while fitting the count time series well. An in-depth analysis of the time segmentation, mobility patterns, and impact of exogenous factors obtained with the chosen model is finally performed.

Volume 18, issue 2, pages 455-491.

Citations: 0