Evaluation is key: a survey on evaluation measures for synthetic time series

IF 6.4 · Zone 2 (Computer Science) · Q1 COMPUTER SCIENCE, THEORY & METHODS

Journal of Big Data Pub Date : 2024-05-07 DOI:10.1186/s40537-024-00924-7

Michael Stenger, Robert Leppich, Ian Foster, Samuel Kounev, André Bauer


Citations: 0

Abstract

Synthetic data generation describes the process of learning the underlying distribution of a given real dataset in a model, which is then sampled to produce new data objects that still adhere to the original distribution. This approach is often applied where circumstances limit the availability or usability of real-world datasets, for instance, in health care due to privacy concerns. While image synthesis has received much attention in the past, time series are key for many practical (e.g., industrial) applications. To date, numerous generative models and measures to evaluate time series syntheses have been proposed. However, researchers have not yet reached a consensus on the defining features of high-quality synthetic time series or on how to quantify quality. Hence, we present a comprehensive survey on evaluation measures for time series generation to assist users in evaluating synthetic time series. First, we provide brief descriptions or, where applicable, precise definitions. Further, we order the measures in a taxonomy and examine applicability and usage. To assist in the selection of the most appropriate measures, we provide a concise guide for fast lookup. Notably, our findings reveal the lack of a universally accepted evaluation procedure, including the selection of appropriate measures. We believe this situation hinders progress and may even erode evaluation standards to a “do as you like” approach to synthetic data evaluation. Therefore, this survey is a preliminary step to advance the field of synthetic data evaluation.




Source journal

Journal of Big Data Computer Science-Information Systems

CiteScore

17.80

Self-citation rate

3.70%

Articles published

105

Review time

13 weeks

About the journal: The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.