ε-Matching: event processing over noisy sequences in real time
Zheng Li, Tingjian Ge, Cindy X. Chen
Proceedings. ACM-SIGMOD International Conference on Management of Data, 2013-06-22
DOI: 10.1145/2463676.2463715 (https://doi.org/10.1145/2463676.2463715)
Regular expression matching over sequences in real time is a crucial task in complex event processing on data streams. Given that such data sequences are often noisy and errors have temporal and spatial correlations, performing regular expression matching effectively and efficiently is a challenging task. Instead of the traditional approach of learning a distribution of the stream first and then processing queries, we propose a new approach that efficiently does the matching based on an error model. In particular, our algorithms are based on the realistic Markov chain error model, and report all matching paths to trace relevant basic events that trigger the matching. This is much more informative than a single matching path. We also devise algorithms to efficiently return only top-k matching paths, and to handle negations in an extended regular expression. Finally, we conduct a comprehensive experimental study to evaluate our algorithms using real datasets.
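To make the idea of matching against an error model concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a toy setting: the pattern is given as a DFA, each observed symbol is either correct or wrong, whether a reading is wrong depends only on whether the previous reading was wrong (a two-state Markov chain), and a wrong reading hides a true symbol drawn uniformly from the other symbols of the alphabet. The function name `match_probability`, its parameters, and the example DFA are all hypothetical. The sketch only computes the probability that the true sequence matches; it does not report matching paths, top-k paths, or negations.

```python
from collections import defaultdict


def match_probability(observed, alphabet, dfa, start, accepting,
                      p_err_init, p_err_after_ok, p_err_after_err):
    """Probability that the true (noise-free) sequence is accepted by the DFA.

    Assumed error model (illustrative only): a two-state Markov chain over
    "reading correct" / "reading wrong"; a wrong reading replaces the true
    symbol, which is taken to be uniform over the other alphabet symbols.
    """
    # belief maps (dfa_state, previous_reading_was_wrong) to probability mass.
    # None marks "no previous reading yet".
    belief = {(start, None): 1.0}
    for sym in observed:
        nxt = defaultdict(float)
        others = [a for a in alphabet if a != sym]
        for (state, prev_wrong), mass in belief.items():
            if prev_wrong is None:
                p_err = p_err_init
            elif prev_wrong:
                p_err = p_err_after_err
            else:
                p_err = p_err_after_ok
            # Case 1: the reading is correct, so the true symbol is sym.
            nxt[(dfa[state][sym], False)] += mass * (1.0 - p_err)
            # Case 2: the reading is wrong; spread the mass over the others.
            for a in others:
                nxt[(dfa[state][a], True)] += mass * p_err / len(others)
        belief = nxt
    return sum(m for (state, _), m in belief.items() if state in accepting)


# Hypothetical DFA for the full-match pattern a b* over the alphabet {a, b}:
# state 0 = start, state 1 = accepting, state 2 = dead state.
AB_STAR_DFA = {
    0: {"a": 1, "b": 2},
    1: {"a": 2, "b": 1},
    2: {"a": 2, "b": 2},
}

if __name__ == "__main__":
    p = match_probability("ab", "ab", AB_STAR_DFA, start=0, accepting={1},
                          p_err_init=0.1, p_err_after_ok=0.1,
                          p_err_after_err=0.5)
    print(p)
```

For the observed sequence "ab" with these illustrative parameters, the only accepted true sequence is "ab" itself, so the result is about 0.9 × 0.9 = 0.81. Making `p_err_after_err` larger than `p_err_after_ok` is one simple way to model temporally correlated errors; spatial correlation is not modeled here.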