Frank Madrid, Shima Imani, Ryan Mercer, Zachary Schall-Zimmerman, N. S. Senobari, Eamonn J. Keogh
2019 IEEE International Conference on Big Knowledge (ICBK), published 2019-11-01. DOI: 10.1109/ICBK.2019.00031 (https://doi.org/10.1109/ICBK.2019.00031)
Matrix Profile XX: Finding and Visualizing Time Series Motifs of All Lengths using the Matrix Profile
Many time series analytic tasks can be reduced to discovering and then reasoning about conserved structures, or time series motifs. Recently, the Matrix Profile has emerged as the state of the art for finding time series motifs, allowing the community to find them efficiently in large datasets. The Matrix Profile reduces time series motif discovery to a process requiring a single parameter: the length of the motifs we expect (or wish) to find. In many cases this is a reasonable limitation, as the user may use out-of-band information or domain knowledge to set this parameter. However, in truly exploratory data mining, a poor choice of this parameter can result in failing to find unexpected and exploitable regularities in the data. In this work, we introduce the Pan Matrix Profile, a new data structure that contains the nearest neighbor information for all subsequences of all lengths. This data structure enables the first truly parameter-free motif discovery algorithm in the literature. The sheer volume of information produced by our representation may be overwhelming; thus, we also introduce a novel visualization tool called the motif-heatmap, which allows users to discover and reason about repeated structures at a glance. We demonstrate our ideas on a diverse set of domains including seismology, bioinformatics, transportation and biology.
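To make the data structure concrete, the following is a minimal brute-force sketch of the idea, not the algorithm from the paper. It computes one matrix profile per candidate length and stacks them into a 2-D array. The function names (`znorm`, `matrix_profile`, `pan_matrix_profile`), the exclusion zone of half the subsequence length, and the division by 2·sqrt(m) (the largest possible z-normalized Euclidean distance between two length-m sequences) are all assumptions made for illustration. The cost is quadratic in the series length for every length tried, so it is only suitable for small inputs.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array; a constant subsequence maps to all zeros."""
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)

def matrix_profile(ts, m):
    """Brute-force matrix profile for subsequence length m.

    Entry i is the z-normalized Euclidean distance from the subsequence
    starting at i to its nearest neighbor outside an exclusion zone.
    """
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)  # assumed exclusion-zone width to skip trivial matches
    profile = np.full(n, np.inf)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        lo, hi = max(0, i - excl), min(n, i + excl + 1)
        d[lo:hi] = np.inf  # ignore the subsequence itself and its close overlaps
        profile[i] = d.min()
    return profile

def pan_matrix_profile(ts, lengths):
    """Stack one matrix profile per length into a (len(lengths), len(ts)) array.

    Each row is divided by 2*sqrt(m) so rows for different lengths share a
    common [0, 1] scale (an assumed normalization). Positions where no
    subsequence of that length fits are left as NaN.
    """
    ts = np.asarray(ts, dtype=float)
    pmp = np.full((len(lengths), len(ts)), np.nan)
    for row, m in enumerate(lengths):
        mp = matrix_profile(ts, m)
        pmp[row, :len(mp)] = mp / (2 * np.sqrt(m))
    return pmp
```

Under these assumptions, low values in a row mark subsequences of that length that repeat elsewhere in the series, and `np.nanargmin` over a row gives the start of the best motif candidate for that length. Rendering the resulting array as an image, with length on one axis and position on the other, gives a rough analogue of the heatmap view described above.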