Jiwon Bang, Siwoon Son, Hajin Kim, Yang-Sae Moon, Mi-Jung Choi
NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium. Published 2018-04-23. DOI: 10.1109/NOMS.2018.8406306 (https://doi.org/10.1109/NOMS.2018.8406306)
Design and implementation of a load shedding engine for solving starvation problems in Apache Kafka
Real-time data stream processing technologies such as Apache Storm and Apache Spark are being actively studied to deal with large-capacity data streams that are generated rapidly in real time. Because most real-time processing technologies are difficult to use on their own, they are commonly paired with a messaging system that supports the input and output of data streams. Apache Kafka is a representative distributed messaging system, specialized in delivering large amounts of real-time log data. However, if data is produced into Kafka faster than it is consumed, a data starvation problem may arise. To solve the starvation problem, a load shedding technique is needed that limits the incoming data and maintains system performance when the system is under load. In this paper, we confirm the starvation problem that can occur in Kafka, design and implement a load shedding engine to solve it, and propose a solution to the starvation problem in Kafka based on performance experiments.
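The general idea of load shedding described in the abstract can be illustrated with a minimal Python sketch. This is not the engine from the paper and does not use Kafka itself: the `LoadShedder` class, the `simulate` function, the backlog threshold, and the drop probability are all hypothetical names and parameters chosen for illustration. The sketch assumes a simple policy in which incoming records are dropped once the unconsumed backlog reaches a threshold.

```python
import random
from collections import deque


class LoadShedder:
    """Hypothetical shedder: drops incoming records once the backlog is too large."""

    def __init__(self, threshold, drop_probability, seed=None):
        self.threshold = threshold                # backlog size that counts as overload (assumed policy)
        self.drop_probability = drop_probability  # chance of dropping a record while overloaded
        self.backlog = deque()                    # stands in for unconsumed messages
        self.dropped = 0
        self._rng = random.Random(seed)

    def offer(self, record):
        """Accept or shed one incoming record; returns True if it was accepted."""
        overloaded = len(self.backlog) >= self.threshold
        if overloaded and self._rng.random() < self.drop_probability:
            self.dropped += 1
            return False
        self.backlog.append(record)
        return True

    def poll(self):
        """Consume the oldest record, or return None if the backlog is empty."""
        return self.backlog.popleft() if self.backlog else None


def simulate(produce_per_tick, consume_per_tick, ticks, shedder):
    """Run a producer that may outpace the consumer; return (backlog size, dropped count)."""
    for tick in range(ticks):
        for i in range(produce_per_tick):
            shedder.offer((tick, i))
        for _ in range(consume_per_tick):
            shedder.poll()
    return len(shedder.backlog), shedder.dropped


if __name__ == "__main__":
    # Producer is faster than consumer (5 vs. 2 records per tick).
    print("no shedding:  ", simulate(5, 2, 100, LoadShedder(10, 0.0)))
    print("with shedding:", simulate(5, 2, 100, LoadShedder(10, 1.0)))
```

With shedding disabled (drop probability 0), the backlog grows for as long as production outpaces consumption. With shedding enabled, the backlog stays bounded by the threshold at the cost of dropped records, which is the trade-off a load shedding engine has to manage.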