SCSL: Optimizing Matching Algorithms to Improve Real-time for Content-based Pub/Sub Systems
Tianchen Ding, Shiyou Qian, Jian Cao, Guangtao Xue, Minglu Li
2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 148-157, published 2020-05-01
DOI: https://doi.org/10.1109/IPDPS47924.2020.00025
Citations: 5
Abstract
Although many matching algorithms have been proposed to improve the matching efficiency of content-based publish/subscribe systems, existing work seldom considers the real-time performance of event dissemination from the perspective of event matching. In this paper, building on two existing matching algorithms, we propose subscription-classifying and structure-layering (SCSL), an optimization method for matching algorithms that aims to improve real-time performance by shortening the determining time of matching subscriptions. The basic idea of SCSL is that subscriptions with high matching probabilities should be processed first during event matching, and that their storage positions in the data structure should be adjusted as those probabilities change. One challenge of SCSL is the trade-off between the gain in real-time performance from identifying matching subscriptions earlier and the cost of the extra matching time spent on subscription classification and adjustment. We design a concise scheme to classify subscriptions, establish a lightweight adjustment mechanism to deal with dynamics, and propose an efficient greedy algorithm to compute the adjustment solution, which together alleviate the impact of SCSL on matching performance. The experimental results show that the 95th percentile of the determining time of matching subscriptions is improved by about 70%. Furthermore, we integrate SCSL into Apache Kafka to augment it as a content-based publish/subscribe system and evaluate SCSL on real-world stock trace data. The evaluation shows an improvement of about 40% in average event transfer latency and confirms that SCSL can effectively improve the real-time performance of content-based publish/subscribe systems.
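The basic idea stated in the abstract, checking likely-to-match subscriptions first and moving subscriptions between layers as their matching probabilities change, can be illustrated with a toy sketch. This is not the authors' SCSL implementation. The two-layer structure, the interval predicates, the hit-rate estimate of matching probability, the fixed threshold used in place of the paper's greedy adjustment algorithm, and all names (`Subscription`, `LayeredMatcher`, `adjust`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    sub_id: str
    # Hypothetical predicate form: attribute -> (low, high), inclusive.
    predicates: dict
    hits: int = 0   # events this subscription matched
    seen: int = 0   # events this subscription was checked against

    def matches(self, event):
        return all(
            attr in event and low <= event[attr] <= high
            for attr, (low, high) in self.predicates.items()
        )

    def match_rate(self):
        # Observed hit rate, used here as a stand-in for matching probability.
        return self.hits / self.seen if self.seen else 0.0


class LayeredMatcher:
    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed cut-off between the two layers
        self.hot = []    # layer checked first
        self.cold = []   # layer checked second

    def subscribe(self, sub):
        # New subscriptions have no history, so they start in the cold layer.
        self.cold.append(sub)

    def match(self, event):
        """Return matching ids in the order in which they were determined."""
        matched = []
        for layer in (self.hot, self.cold):
            for sub in layer:
                sub.seen += 1
                if sub.matches(event):
                    sub.hits += 1
                    matched.append(sub.sub_id)
        return matched

    def adjust(self):
        # Simple threshold reclassification; the paper's greedy algorithm
        # is not reproduced here.
        everyone = self.hot + self.cold
        self.hot = [s for s in everyone if s.match_rate() >= self.threshold]
        self.cold = [s for s in everyone if s.match_rate() < self.threshold]


matcher = LayeredMatcher(threshold=0.5)
matcher.subscribe(Subscription("narrow", {"price": (100, 101)}))
matcher.subscribe(Subscription("wide", {"price": (0, 1000)}))

# Warm-up events build the hit-rate statistics.
for price in (10, 50, 100, 500):
    matcher.match({"price": price})

matcher.adjust()
order = matcher.match({"price": 100})
print(order)
```

In this sketch the frequently matching subscription is promoted to the first layer, so it is determined before the rarely matching one even though it was registered later. The total work per event is unchanged; only the order in which matches are found shifts, which is the real-time effect the abstract describes, while the bookkeeping in `match` and `adjust` stands in for the classification and adjustment cost.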