Exploring multiprocessor approaches to time series analysis

IF 3.4 3区计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS

Journal of Parallel and Distributed Computing Pub Date : 2024-02-08 DOI:10.1016/j.jpdc.2024.104855

Ricardo Quislant, Eladio Gutierrez, Oscar Plata

{"title":"Exploring multiprocessor approaches to time series analysis","authors":"Ricardo Quislant, Eladio Gutierrez, Oscar Plata","doi":"10.1016/j.jpdc.2024.104855","DOIUrl":null,"url":null,"abstract":"<div><p>Time series analysis is a key technique for extracting and predicting events in domains as diverse as epidemiology, genomics, neuroscience, environmental sciences, economics, etc. <em>Matrix Profile</em>, a state-of-the-art algorithm to perform time series analysis, finds out the most similar and dissimilar subsequences in a time series in deterministic time and it is exact. Matrix Profile has low arithmetic intensity and it operates on large amounts of time series data, which can be an issue in terms of memory requirements. On the other hand, Hardware Transactional Memory (HTM) is an alternative optimistic synchronization method that executes transactions speculatively in parallel while keeping track of memory accesses to detect and resolve conflicts.</p><p>This work evaluates one of the best implementations of Matrix Profile exploring multiple multiprocessor variants and proposing new implementations that consider a variety of synchronization methods (HTM, locks, barriers), as well as algorithm organizations. We analyze these variants using real datasets, both short and large, in terms of speedup and memory requirements, the latter being a major issue when dealing with very large time series. The experimental evaluation shows that our proposals can achieve up to 100× speedup over the sequential algorithm for 128 threads, and up to 3× over the baseline, while keeping memory requirements low and even independent of the number of threads.</p></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"188 ","pages":"Article 104855"},"PeriodicalIF":3.4000,"publicationDate":"2024-02-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0743731524000194/pdfft?md5=a25b14cc13a327c9c4b6c5f9abde8126&pid=1-s2.0-S0743731524000194-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Parallel and Distributed Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0743731524000194","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

Abstract

Time series analysis is a key technique for extracting and predicting events in domains as diverse as epidemiology, genomics, neuroscience, environmental sciences, economics, etc. Matrix Profile, a state-of-the-art algorithm to perform time series analysis, finds out the most similar and dissimilar subsequences in a time series in deterministic time and it is exact. Matrix Profile has low arithmetic intensity and it operates on large amounts of time series data, which can be an issue in terms of memory requirements. On the other hand, Hardware Transactional Memory (HTM) is an alternative optimistic synchronization method that executes transactions speculatively in parallel while keeping track of memory accesses to detect and resolve conflicts.

This work evaluates one of the best implementations of Matrix Profile exploring multiple multiprocessor variants and proposing new implementations that consider a variety of synchronization methods (HTM, locks, barriers), as well as algorithm organizations. We analyze these variants using real datasets, both short and large, in terms of speedup and memory requirements, the latter being a major issue when dealing with very large time series. The experimental evaluation shows that our proposals can achieve up to 100× speedup over the sequential algorithm for 128 threads, and up to 3× over the baseline, while keeping memory requirements low and even independent of the number of threads.

查看原文本刊更多论文

探索时间序列分析的多处理器方法

时间序列分析是提取和预测流行病学、基因组学、神经科学、环境科学、经济学等不同领域事件的关键技术。Matrix Profile 是一种最先进的时间序列分析算法，它能在确定的时间内找出时间序列中最相似和最不相似的子序列，而且是精确的。Matrix Profile 的运算强度较低，可处理大量的时间序列数据，这可能是内存需求方面的一个问题。另一方面，硬件事务内存（HTM）是另一种乐观的同步方法，它以并行方式推测性地执行事务，同时跟踪内存访问以检测和解决冲突。这项工作评估了 Matrix Profile 的最佳实现之一，探索了多个多处理器变体，并提出了考虑各种同步方法（HTM、锁、障碍）以及算法组织的新实现。我们使用真实的短期和长期数据集分析了这些变体的速度提升和内存需求，后者是处理超大时间序列时的一个主要问题。实验评估表明，在 128 个线程的情况下，我们的建议比顺序算法的速度提高了 100 倍，比基准算法的速度提高了 3 倍，同时保持了较低的内存需求，甚至与线程数无关。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Journal of Parallel and Distributed Computing 工程技术-计算机：理论方法

CiteScore

10.30

自引率

2.60%

发文量

172

审稿时长

12 months

期刊介绍： This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics; again covering the full range from the design to the use of our targeted systems.