{"title":"Spark Load Balancing Strategy Optimization Based on Internet of Things","authors":"Suzhen Wang, Lu Zhang, Yanpiao Zhang, Ning Cao","doi":"10.1109/CYBERC.2018.00025","DOIUrl":null,"url":null,"abstract":"The data collected by the Internet of Things (IOT) technology is becoming larger and larger, and the traditional data processing methods have encountered tremendous challenges. Spark, as a memory-based distributed computing framework, provides support for the data processing of the IOT. Load balancing is an important indicator to measure the performance of Spark computing. The load balancing strategy of Spark cluster only takes into account the locality of data, and neglects the computing capability and resource utilization of each node, which is prone to load unbalance and affecting the IOT data processing efficiency. Aiming at this issue, this paper optimizes and improves the current load balancing strategy of Spark based on the computing performance of each node in the Spark cluster, and proposes a task execution node assignment algorithm based on genetic algorithm and particle swarm optimization (TENAA). 
Experiments show that, compared with the Spark load balancing strategy, the load balancing strategy proposed in this paper has a significant increase both in load deviation and task completion time.","PeriodicalId":282903,"journal":{"name":"2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CYBERC.2018.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
The volume of data collected by Internet of Things (IoT) technology keeps growing, and traditional data processing methods face tremendous challenges. Spark, a memory-based distributed computing framework, supports IoT data processing. Load balancing is an important indicator of Spark computing performance. The current load balancing strategy of a Spark cluster considers only data locality and neglects the computing capability and resource utilization of each node, which makes the cluster prone to load imbalance and reduces IoT data processing efficiency. To address this issue, this paper improves Spark's current load balancing strategy based on the computing performance of each node in the cluster, and proposes a task execution node assignment algorithm (TENAA) based on a genetic algorithm and particle swarm optimization. Experiments show that, compared with Spark's own load balancing strategy, the proposed strategy significantly improves both load deviation and task completion time.
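To make the two metrics named in the abstract concrete, the sketch below compares a locality-only assignment with a capability-aware one on a toy cluster. It is not the paper's TENAA: a simple greedy heuristic stands in for the genetic algorithm and particle swarm search. The abstract does not define load deviation, so the definition used here (standard deviation of each node's assigned work divided by its capability) is an assumption, as are all function names and numbers.

```python
from statistics import pstdev


def relative_loads(assignment, task_cost, capability):
    """Per-node work divided by node capability (assumed load measure)."""
    load = [0.0] * len(capability)
    for task, node in enumerate(assignment):
        load[node] += task_cost[task]
    return [l / c for l, c in zip(load, capability)]


def load_deviation(assignment, task_cost, capability):
    """Population standard deviation of the relative loads (assumed definition)."""
    return pstdev(relative_loads(assignment, task_cost, capability))


def completion_time(assignment, task_cost, capability):
    """The job finishes when the most heavily loaded node finishes."""
    return max(relative_loads(assignment, task_cost, capability))


def capability_aware_assign(task_cost, capability):
    """Greedy stand-in for TENAA: largest task first, onto the node
    that would finish it earliest given its capability."""
    load = [0.0] * len(capability)
    assignment = [0] * len(task_cost)
    for task in sorted(range(len(task_cost)), key=lambda t: -task_cost[t]):
        node = min(
            range(len(capability)),
            key=lambda n: (load[n] + task_cost[task]) / capability[n],
        )
        assignment[task] = node
        load[node] += task_cost[task]
    return assignment


# Hypothetical cluster: six equal tasks, three nodes of unequal speed.
task_cost = [4.0] * 6
capability = [1.0, 2.0, 3.0]

# Locality-only placement: tasks follow evenly spread data blocks.
locality_only = [0, 0, 1, 1, 2, 2]
balanced = capability_aware_assign(task_cost, capability)

print(load_deviation(locality_only, task_cost, capability),
      completion_time(locality_only, task_cost, capability))
print(load_deviation(balanced, task_cost, capability),
      completion_time(balanced, task_cost, capability))
```

In this toy case the locality-only placement leaves the slowest node with as much work as the fastest, while the capability-aware placement gives each node work in proportion to its speed, lowering both metrics. A metaheuristic such as the one the paper proposes would search the same assignment space with a fitness function built from quantities like these.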