Navendu Jain, Lisa Amini, H. Andrade, Richard P. King, Yoonho Park, P. Selo, C. Venkatramani
{"title":"流处理核心上线性道路基准的设计、实现和评估","authors":"Navendu Jain, Lisa Amini, H. Andrade, Richard P. King, Yoonho Park, P. Selo, C. Venkatramani","doi":"10.1145/1142473.1142522","DOIUrl":null,"url":null,"abstract":"Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.","PeriodicalId":416090,"journal":{"name":"Proceedings of the 2006 ACM SIGMOD international conference on Management of data","volume":"474 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"219","resultStr":"{\"title\":\"Design, implementation, and evaluation of the linear road bnchmark on the stream processing core\",\"authors\":\"Navendu Jain, Lisa Amini, H. Andrade, Richard P. King, Yoonho Park, P. Selo, C. Venkatramani\",\"doi\":\"10.1145/1142473.1142522\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.\",\"PeriodicalId\":416090,\"journal\":{\"name\":\"Proceedings of the 2006 ACM SIGMOD international conference on Management of data\",\"volume\":\"474 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"219\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2006 ACM SIGMOD international conference on Management of data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1142473.1142522\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2006 ACM SIGMOD international conference on Management of data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1142473.1142522","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Design, implementation, and evaluation of the linear road bnchmark on the stream processing core
Stream processing applications have recently gained significant attention in the networking and database community. At the core of these applications is a stream processing engine that performs resource allocation and management to support continuous tracking of queries over collections of physically-distributed and rapidly-updating data streams. While numerous stream processing systems exist, there has been little work on understanding the performance characteristics of these applications in a distributed setup. In this paper, we examine the performance bottlenecks of streaming data applications, in particular the Linear Road stream data management benchmark, in achieving good performance in large-scale distributed environments, using the Stream Processing Core (SPC), a stream processing middleware we have developed. First, we present the design and implementation of the Linear Road benchmark on the SPC middleware. SPC has been designed to scale to tens of thousands of processing nodes, while supporting concurrent applications and multiple simultaneous queries. Second, we identify the main performance bottlenecks in the Linear Road application in achieving scalability and low query response latency. Our results show that data locality, buffer capacity, physical allocation of processing elements to infrastructure nodes, and packaging for transporting streamed data are important factors in achieving good application performance. Though we evaluate our system primarily for the Linear Road application, we believe it also provides useful insights into the overall system behavior for supporting other distributed and large-scale continuous streaming data applications. Finally, we examine how SPC can be used and tuned to enable a very efficient implementation of the Linear Road application in a distributed environment.