Data segmentation algorithms: Univariate mean change and beyond

Impact factor: 2.0 · JCR: Q2 (Economics)
Haeran Cho, Claudia Kirch
Econometrics and Statistics, volume 30, pages 76-95. Journal article, published 2024-04-01.
DOI: 10.1016/j.ecosta.2021.10.008
https://www.sciencedirect.com/science/article/pii/S2452306221001234
Citations: 0

Abstract


Data segmentation, also known as multiple change point analysis, has received considerable attention due to its importance in time series analysis and signal processing, with applications in a variety of fields including the natural and social sciences, medicine, engineering and finance. The first part reviews the existing literature on the canonical data segmentation problem, which aims at detecting and localising multiple change points in the mean of a univariate time series. An overview of popular methodologies is provided, covering their computational complexity and theoretical properties. In particular, the theoretical discussion focuses on the separation rate, which characterises the change points detectable by a given procedure, and the localisation rate, which quantifies the precision of the corresponding change point estimators; a distinction is made as to whether a homogeneous or multiscale viewpoint has been adopted in their derivation. It is further highlighted that the latter viewpoint provides the most general setting for investigating the optimality of data segmentation algorithms.

Arguably, the canonical segmentation problem has been the most popular framework for proposing new data segmentation algorithms and studying their efficiency in recent decades. The second part of this survey argues for the importance of attaining an in-depth understanding of the strengths and weaknesses of methodologies for the change point problem in a simpler, univariate setting, as a stepping stone for the development of methodologies for more complex problems. This point is illustrated with a range of examples showcasing the connections between complex distributional changes and those in the mean. Extensions towards high-dimensional change point problems are also discussed, where it is demonstrated that the challenges arising from high dimensionality are orthogonal to those in dealing with multiple change points.
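
To make the canonical data segmentation problem concrete, the following is a minimal Python sketch of one classical approach: a CUSUM statistic combined with recursive binary segmentation. It is an illustration under stated assumptions and is not taken from the survey. The function names `cusum_statistics` and `binary_segmentation`, the `threshold` parameter and the convention of reporting a change point as the 0-based index of the first observation after the change are all hypothetical choices made here.

```python
import numpy as np


def cusum_statistics(x):
    """Absolute CUSUM statistic for every candidate split k = 1, ..., n-1.

    Split k compares the mean of x[:k] with the mean of x[k:].
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(1, n)                 # candidate split positions
    left_sums = np.cumsum(x)[:-1]       # sum of x[:k] for each k
    total = x.sum()
    weights = np.sqrt(k * (n - k) / n)  # standard CUSUM weighting
    left_means = left_sums / k
    right_means = (total - left_sums) / (n - k)
    return weights * np.abs(left_means - right_means)


def binary_segmentation(x, threshold, start=0):
    """Recursively split x wherever the largest CUSUM exceeds threshold.

    Returns a sorted list of estimated change points, each given as the
    0-based index of the first observation after the change.
    `threshold` is an assumed tuning parameter supplied by the user.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        return []
    stats = cusum_statistics(x)
    k = int(np.argmax(stats)) + 1       # best split within this segment
    if stats[k - 1] <= threshold:
        return []                       # no detectable change in segment
    left = binary_segmentation(x[:k], threshold, start)
    right = binary_segmentation(x[k:], threshold, start + k)
    return left + [start + k] + right
```

For example, `binary_segmentation(x, threshold=1.0)` could be called on a series `x` whose mean is piecewise constant. With noisy data the threshold would have to be calibrated to the noise level, a choice that bears on which changes are detectable and on how precisely they are located.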

Source journal: Econometrics and Statistics
CiteScore: 3.10
Self-citation rate: 10.50%
Articles published: 84
About the journal: Econometrics and Statistics is the official journal of the networks Computational and Financial Econometrics and Computational and Methodological Statistics. It publishes research papers in all aspects of econometrics and statistics and comprises two sections, Part A: Econometrics and Part B: Statistics.