Dimension reduction and visualization of multiple time series data: a symbolic data analysis approach

Impact factor: 1.0 | Zone 4 (Mathematics) | JCR Q3, Statistics & Probability
Emily Chia-Yu Su, Han-Ming Wu
Journal: Computational Statistics. Published: 2023-12-06. DOI: 10.1007/s00180-023-01440-7 (https://doi.org/10.1007/s00180-023-01440-7)

Abstract

Exploratory analysis and visualization of multiple time series data are essential for discovering the underlying dynamics of a series before attempting modeling and forecasting. This study extends two dimension reduction methods - principal component analysis (PCA) and sliced inverse regression (SIR) - to multiple time series data. This is achieved through the innovative path point approach, a new addition to the symbolic data analysis framework. By transforming multiple time series data into time-dependent intervals marked by starting and ending values, each series is geometrically represented as successive directed segments with unique path points. These path points serve as the foundation of our novel representation approach. PCA and SIR are then applied to the data table formed by the coordinates of these path points, enabling visualization of temporal trajectories of objects within a reduced-dimensional subspace. Empirical studies encompassing simulations, microarray time series data from a yeast cell cycle, and financial data confirm the effectiveness of our path point approach in revealing the structure and behavior of objects within a 2D factorial plane. Comparative analyses with existing methods, such as the applied vector approach for PCA and SIR on time-dependent interval data, further underscore the strength and versatility of our path point representation in the realm of time series data.
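
The pipeline described in the abstract (series to time-dependent intervals, intervals to path points, PCA on the path-point table) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define path points precisely, so the code assumes that each series is cut into consecutive fixed-width windows, that each window is summarized by its starting and ending values, and that the path points are simply the endpoints of the resulting directed segments. The function names `to_intervals`, `path_points`, and `pca_2d` and the `width` parameter are hypothetical, and SIR is omitted.

```python
import numpy as np


def to_intervals(series, width):
    """Summarize consecutive windows by their (start, end) values.

    series: array of shape (n_times, n_vars).
    width:  window length (an assumed tuning parameter, not from the paper).
    Returns an array of shape (n_windows, n_vars, 2).
    """
    series = np.asarray(series, dtype=float)
    n_windows = series.shape[0] // width
    # Drop any incomplete trailing window, then group rows by window.
    trimmed = series[: n_windows * width].reshape(n_windows, width, -1)
    return np.stack([trimmed[:, 0, :], trimmed[:, -1, :]], axis=-1)


def path_points(intervals):
    """Interleave segment endpoints in time order: start_1, end_1, start_2, ...

    Assumption: path points are the endpoints of each directed segment.
    Returns an array of shape (2 * n_windows, n_vars).
    """
    n_windows, n_vars, _ = intervals.shape
    points = np.empty((2 * n_windows, n_vars))
    points[0::2] = intervals[:, :, 0]  # starting values
    points[1::2] = intervals[:, :, 1]  # ending values
    return points


def pca_2d(points):
    """Project the path-point table onto its first two principal components."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```

Plotting the rows of `pca_2d(path_points(to_intervals(x, width)))` in order, joined by arrows, would give one trajectory in a 2D plane under these assumptions; the construction in the paper may differ in how segments and path points are defined.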


Source journal: Computational Statistics (Mathematics: Statistics & Probability)
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 122
Review time: >12 weeks
About the journal: Computational Statistics (CompStat) is an international journal which promotes the publication of applications and methodological research in the field of Computational Statistics. The focus of papers in CompStat is on the contribution to and influence of computing on statistics and vice versa. The journal provides a forum for computer scientists, mathematicians, and statisticians in a variety of fields of statistics such as biometrics, econometrics, data analysis, graphics, simulation, algorithms, knowledge based systems, and Bayesian computing. CompStat publishes hardware, software plus package reports.