Data segmentation algorithms: Univariate mean change and beyond
Authors: Haeran Cho, Claudia Kirch
Journal: Econometrics and Statistics, Volume 30, Pages 76-95
Publication date: 2024-04-01
DOI: 10.1016/j.ecosta.2021.10.008
URL: https://www.sciencedirect.com/science/article/pii/S2452306221001234
Impact factor: 2.0; JCR: Q2 (Economics)
Data segmentation, a.k.a. multiple change point analysis, has received considerable attention due to its importance in time series analysis and signal processing, with applications in a variety of fields including natural and social sciences, medicine, engineering and finance. The first part reviews the existing literature on the canonical data segmentation problem, which aims at detecting and localising multiple change points in the mean of univariate time series. An overview of popular methodologies is provided, covering their computational complexity and theoretical properties. In particular, the theoretical discussion focuses on the separation rate, relating to which change points are detectable by a given procedure, and the localisation rate, quantifying the precision of the corresponding change point estimators, and a distinction is made as to whether a homogeneous or multiscale viewpoint has been adopted in their derivation. It is further highlighted that the latter viewpoint provides the most general setting for investigating the optimality of data segmentation algorithms.
Arguably, the canonical segmentation problem has been the most popular framework to propose new data segmentation algorithms and study their efficiency in the last decades. The second part of this survey motivates the importance of attaining an in-depth understanding of strengths and weaknesses of methodologies for the change point problem in a simpler, univariate setting, as a stepping stone for the development of methodologies for more complex problems. This point is illustrated with a range of examples showcasing the connections between complex distributional changes and those in the mean. Extensions towards high-dimensional change point problems are also discussed where it is demonstrated that the challenges arising from high dimensionality are orthogonal to those in dealing with multiple change points.
About the journal:
Econometrics and Statistics is the official journal of the networks Computational and Financial Econometrics and Computational and Methodological Statistics. It publishes research papers in all aspects of econometrics and statistics and comprises two sections, Part A: Econometrics and Part B: Statistics.