Stephen Neal Joshua Eali, N. Thirupathi Rao, Swathi Kalam, D. Bhattacharyya, Hye-jin Kim
{"title":"Performance Gain in HIVE through Query Optimization using Index Joins","authors":"Stephen Neal Joshua Eali, N. Thirupathi Rao, Swathi Kalam, D. Bhattacharyya, Hye-jin Kim","doi":"10.14257/ijdta.2017.10.9.02","DOIUrl":null,"url":null,"abstract":"Index joins range unit pivotal for proficiency and quality once technique questions over colossal data. HIVE may be a cluster balanced immense data administration motor that is good for data examination applications and for OLAP for phenomenally \"specific\" inquiries whose yield sizes region unit little division from the contributing data, there the beast compel experiences poor execution because of repetitive circle I/O operations or end in starts of additional guide operations. Here all through this paper a shot is made and propose file joins procedure to rush up the inquiry strategy and incorporate it in Hive by mapping our vogue to the unique change stream to assess the execution, we've a slant to give and measure check inquiries on datasets created abuse TPC-H benchmark. Our outcomes show vital execution increase over moderately tremendous data sets and/or uncommonly specific questions having a two-way are a piece of and one be a piece of condition.","PeriodicalId":13926,"journal":{"name":"International journal of database theory and application","volume":"7 1","pages":"11-22"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International journal of database theory and application","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.14257/ijdta.2017.10.9.02","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Index joins range unit pivotal for proficiency and quality once technique questions over colossal data. HIVE may be a cluster balanced immense data administration motor that is good for data examination applications and for OLAP for phenomenally "specific" inquiries whose yield sizes region unit little division from the contributing data, there the beast compel experiences poor execution because of repetitive circle I/O operations or end in starts of additional guide operations. Here all through this paper a shot is made and propose file joins procedure to rush up the inquiry strategy and incorporate it in Hive by mapping our vogue to the unique change stream to assess the execution, we've a slant to give and measure check inquiries on datasets created abuse TPC-H benchmark. Our outcomes show vital execution increase over moderately tremendous data sets and/or uncommonly specific questions having a two-way are a piece of and one be a piece of condition.