Zero-Inflated Time Series Clustering Via Ensemble Thick-Pen Transform
Authors: Minji Kim, Hee-Seok Oh, Yaeji Lim
Journal: Journal of Classification, pp. 1-25 (published 2023-06-12)
DOI: https://doi.org/10.1007/s00357-023-09437-z
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10258486/pdf/
Citations: 0
Abstract
This study develops a new clustering method for high-dimensional zero-inflated time series data. The proposed method is based on the thick-pen transform (TPT), whose basic idea is to draw along the data with a pen of a given thickness. Because TPT is a multi-scale visualization technique, it provides information on the temporal tendency of neighboring values. We introduce a modified TPT, termed 'ensemble TPT (e-TPT)', to enhance the temporal resolution of zero-inflated time series data, which is crucial for clustering such series efficiently. Furthermore, this study defines a modified similarity measure for zero-inflated time series data based on e-TPT and proposes an efficient iterative clustering algorithm suited to that measure. Finally, the effectiveness of the proposed method is demonstrated through simulation experiments and two real datasets: step count data and newly confirmed COVID-19 case data.
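To make the "pen of a given thickness" idea concrete, the following is a minimal sketch of a plain thick-pen transform with a square pen. It assumes one common formulation, in which the pen at time t covers the values x[t], ..., x[t + thickness] and extends half a thickness above the maximum and below the minimum; this formulation, the function name `thick_pen_bounds`, and the toy series are assumptions for illustration and are not taken from the abstract. The e-TPT modification, the similarity measure, and the clustering algorithm are not described here and are not shown.

```python
from typing import List, Sequence, Tuple


def thick_pen_bounds(
    x: Sequence[float], thickness: int
) -> Tuple[List[float], List[float]]:
    """Lower and upper boundaries traced by a square pen.

    Assumed formulation (not from the abstract): at time t the pen covers
    x[t], ..., x[t + thickness], truncated at the end of the series.
    """
    if thickness < 0:
        raise ValueError("thickness must be non-negative")
    n = len(x)
    half = thickness / 2.0
    lower: List[float] = []
    upper: List[float] = []
    for t in range(n):
        # Window of values the pen touches at time t.
        window = x[t : min(t + thickness, n - 1) + 1]
        lower.append(min(window) - half)
        upper.append(max(window) + half)
    return lower, upper


# Hypothetical zero-inflated series: mostly zeros with a few bursts.
series = [0, 0, 3, 0, 0, 0, 5, 1, 0, 0]
for tau in (1, 3):
    low, up = thick_pen_bounds(series, tau)
    print(f"thickness={tau}")
    print("  lower:", low)
    print("  upper:", up)
```

Running the sketch with several thickness values shows the multi-scale behavior: a thicker pen spreads each isolated burst over more neighboring time points, so the band between the two boundaries summarizes the local tendency of the series rather than individual values.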
About the journal:
To publish original and valuable papers in the field of classification, numerical taxonomy, multidimensional scaling and other ordination techniques, clustering, tree structures and other network models (with somewhat less emphasis on principal components analysis, factor analysis, and discriminant analysis), as well as associated models and algorithms for fitting them. Articles will support advances in methodology while demonstrating compelling substantive applications. Comprehensive review articles are also acceptable. Contributions will represent disciplines such as statistics, psychology, biology, information retrieval, anthropology, archeology, astronomy, business, chemistry, computer science, economics, engineering, geography, geology, linguistics, marketing, mathematics, medicine, political science, psychiatry, sociology, and soil science.