Authors: Dawei Sun, Vincent C. S. Lee, Ye Lu
Venue: 2017 12th IEEE Conference on Industrial Electronics and Applications (ICIEA)
Publication date: 2017-06-01
DOI: https://doi.org/10.1109/ICIEA.2017.8283162
Citations: 0
A gradient-based algorithm for trend and outlier prediction in dynamic data streams
Trends and outliers are frequently used to derive early-warning predictive signals for decision makers, with the aim of improving decision quality in domain-specific applications (e.g. commercial, scientific, biomedical, and engineering). We develop a gradient-based algorithm that uses the sample entropy gradient (SEG) for trend and outlier prediction in high-frequency time series data streams. An L2 similarity measure (the Euclidean distance between two linearized gradient curves) is then computed to quantify the degree of similarity, and it is compared with a threshold L2 value to judge the extent of dissimilarity that would be classified as an outlier. The SEG algorithm circumvents the need to pre-specify the tolerance parameter required by cross sample entropy (CSE)-based algorithms, which invariably rely on a domain expert to set the tolerance threshold. We conduct real-data experiments with the SEG algorithm in two application areas: a dynamic wind speed data stream and financial time series data. Our experiments demonstrate that the SEG algorithm can feasibly be used in an online implementation to derive predictive early-warning signals for domain-specific decision makers.
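The thresholding step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the sample entropy gradient or the linearized gradient curves are computed, so this sketch substitutes a plain finite-difference gradient (`numpy.gradient`) as a stand-in. The function names, the window contents, and the threshold value are all hypothetical and are not taken from the paper.

```python
import numpy as np


def gradient_curve(window):
    """Finite-difference gradient of a 1-D window.

    Assumption: a stand-in for the paper's linearized gradient curve,
    whose exact construction the abstract does not give.
    """
    return np.gradient(np.asarray(window, dtype=float))


def l2_distance(curve_a, curve_b):
    """Euclidean (L2) distance between two equal-length curves."""
    return float(np.linalg.norm(np.asarray(curve_a) - np.asarray(curve_b)))


def is_outlier(reference_window, current_window, threshold):
    """Flag the current window when its gradient curve is farther than
    `threshold` (a hypothetical value) from the reference gradient curve."""
    distance = l2_distance(
        gradient_curve(reference_window), gradient_curve(current_window)
    )
    return distance > threshold, distance


# Illustrative data only: a steady ramp, a shifted ramp, and a ramp with a spike.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
steady = [2.0, 3.0, 4.0, 5.0, 6.0]
spiky = [2.0, 3.0, 9.0, 5.0, 6.0]

print(is_outlier(reference, steady, threshold=1.0))
print(is_outlier(reference, spiky, threshold=1.0))
```

In this sketch the shifted ramp has the same gradient curve as the reference, so its distance is zero, while the spike changes the local slope and pushes the distance above the threshold. In a streaming setting the same comparison would be repeated over successive windows.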