Research on Parallel LSTM Algorithm Based on Spark
Zhao Yangyang, Niu Wei, W. Meinan
2021 IEEE/ACIS 20th International Fall Conference on Computer and Information Science (ICIS Fall), 2021-10-13
https://doi.org/10.1109/icisfall51598.2021.9627382
To address the large volume of data collected by airborne sensors, the lack of association among those data, and low processing efficiency, this paper proposes a parallel LSTM algorithm model suited to the Spark platform. First, Spark is used to perform the traversal scan over the in-memory RDDs on all nodes of the distributed cluster, and a pipeline is built from the directed acyclic graph (DAG) to implement a parallel computing framework. An algorithm model for optimizing the parameters of the LSTM neural network is then proposed, and a load-balancing method is introduced so that all nodes of the distributed system share the computing tasks evenly. The experimental results show that the parallelized LSTM algorithm is more efficient than the stand-alone version. Prediction efficiency is higher still after load balancing is applied, which indicates that traversal tasks are distributed more evenly across nodes and that the degree of parallelization is higher.
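The abstract does not say which load-balancing method the authors use. As a rough illustration of why balancing the per-node work can shorten a parallel stage, the sketch below compares a cost-blind contiguous split with a generic greedy heuristic (largest task first, assigned to the least-loaded node). The function names, the task costs, and the four-node setup are hypothetical and are not taken from the paper; plain Python stands in for a Spark cluster.

```python
import heapq


def naive_split(costs, n_nodes):
    """Assign tasks to nodes in contiguous chunks, ignoring task cost."""
    chunk = -(-len(costs) // n_nodes)  # ceiling division
    return [costs[i * chunk:(i + 1) * chunk] for i in range(n_nodes)]


def balanced_split(costs, n_nodes):
    """Greedy heuristic: give the next-largest task to the least-loaded node."""
    heap = [(0, node) for node in range(n_nodes)]  # (current load, node index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(n_nodes)]
    for cost in sorted(costs, reverse=True):
        load, node = heapq.heappop(heap)
        assignment[node].append(cost)
        heapq.heappush(heap, (load + cost, node))
    return assignment


def makespan(assignment):
    """A parallel stage finishes only when its most loaded node finishes."""
    return max(sum(tasks) for tasks in assignment)


# Hypothetical per-partition costs (e.g. number of sensor records to scan).
costs = [9, 8, 7, 6, 2, 2, 1, 1]
naive = naive_split(costs, 4)
balanced = balanced_split(costs, 4)

print("naive makespan:   ", makespan(naive))     # 17
print("balanced makespan:", makespan(balanced))  # 9
```

With these made-up costs the contiguous split leaves one node with 17 units of work while another has only 2, whereas the greedy assignment gives every node 9 units, so the stage finishes in roughly half the time. This matches the qualitative claim in the abstract only; it is not a reproduction of the paper's experiment.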