An Algorithm for Optimizing Small-Large Outer Join in Cloud Computing Environment
Farshad Delavarpour, A. Ahmadi
2021 26th International Computer Conference, Computer Society of Iran (CSICC), published 2021-03-03
DOI: https://doi.org/10.1109/CSICC52343.2021.9420579
The join operation is used in most applications and has long been a topic of research interest. Given the massive amount of information generated daily, the main bottlenecks in join operations are execution time and the complexity of parallelization. Among the various join types, the left outer join is the most common, yet little work has been done to optimize it. A frequent case is the left outer join between a small table and a large table, and executing this operation efficiently can have a major impact on overall program performance. In this paper, we present an optimal algorithm that performs the left outer join on small-large tables in parallel. We also discuss the challenges of parallel joins and explain in detail how to implement the algorithm. We perform several experiments in a cloud computing environment using the Spark framework. The results show that the proposed algorithm is scalable and performs better than existing algorithms.
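To make the small-large setting concrete, the following is a minimal sketch of one generic way to compute a left outer join when the small table is on the preserved (left) side. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes the small table fits in memory on every worker, that rows are `(key, payload)` pairs, and that the large table is already split into partitions; the plain loop over partitions stands in for parallel tasks, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Optional, Set, Tuple

Row = Tuple[Any, Any]                       # (join key, payload)
Joined = Tuple[Any, Any, Optional[Any]]     # (key, small payload, large payload or None)


def build_index(small: Iterable[Row]) -> Dict[Any, List[Any]]:
    """Hash the small table by key; this index is what each worker would receive."""
    index: Dict[Any, List[Any]] = defaultdict(list)
    for key, payload in small:
        index[key].append(payload)
    return dict(index)


def join_partition(
    index: Dict[Any, List[Any]], partition: Iterable[Row]
) -> Tuple[List[Joined], Set[Any]]:
    """Inner-join one partition of the large table against the small-table index.

    Also returns the set of small-table keys that found a match in this partition.
    """
    matches: List[Joined] = []
    seen: Set[Any] = set()
    for key, large_payload in partition:
        for small_payload in index.get(key, ()):
            matches.append((key, small_payload, large_payload))
            seen.add(key)
    return matches, seen


def small_large_left_outer_join(
    small: Iterable[Row], large_partitions: Iterable[Iterable[Row]]
) -> List[Joined]:
    """Left outer join with the small table as the preserved (left) side."""
    small_rows = list(small)
    index = build_index(small_rows)

    results: List[Joined] = []
    matched: Set[Any] = set()
    # Each iteration is independent and stands in for one parallel task.
    for partition in large_partitions:
        matches, seen = join_partition(index, partition)
        results.extend(matches)
        matched |= seen

    # Small rows whose key matched nowhere are emitted once, padded with None.
    for key, payload in small_rows:
        if key not in matched:
            results.append((key, payload, None))
    return results
```

Calling `small_large_left_outer_join([(1, "a"), (2, "b")], [[(1, "x")], [(3, "y")]])` returns the matched row for key 1 plus a `None`-padded row for key 2. The design point the sketch illustrates is that no single partition can decide whether a small-table row is unmatched, so the matched keys have to be combined across partitions before the padded rows are emitted.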