Performance Gain in HIVE through Query Optimization using Index Joins

Stephen Neal Joshua Eali, N. Thirupathi Rao, Swathi Kalam, D. Bhattacharyya, Hye-jin Kim
DOI: 10.14257/ijdta.2017.10.9.02 (https://doi.org/10.14257/ijdta.2017.10.9.02)
Journal: International Journal of Database Theory and Application
Volume: 7 1
Pages: 11-22
Publication date: 2017-09-30
Publication type: Journal Article
Open access: no
Source: Semantic Scholar
Citations: 1

Abstract

Index joins are crucial for efficiency and scalability when processing queries over big data. Hive is a batch-oriented big data management engine that is well suited to data analysis applications and to OLAP. For highly "selective" queries, whose output is a small fraction of the input data, the brute-force approach performs poorly because it incurs redundant disk I/O or launches extra map operations. This paper proposes an index join strategy to speed up query processing and integrates it into Hive by mapping the design onto Hive's query transformation flow. To evaluate performance, the authors define and measure test queries on datasets generated with the TPC-H benchmark. The results show a significant performance gain on relatively large data sets and/or highly selective queries that have a two-way join and a single join condition.
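The contrast the abstract draws between a brute-force join and an index join can be illustrated with a minimal sketch. This is not the authors' Hive implementation: the function names (`build_index`, `index_join`, `scan_join`) and the toy rows are hypothetical, and the column names only loosely imitate TPC-H's `orders` and `lineitem` tables. It shows the general idea that, for a selective query, probing an index touches only the matching rows instead of scanning the whole table.

```python
from collections import defaultdict


def build_index(rows, key):
    """Map each join-key value to the positions of the rows holding it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[key]].append(pos)
    return index


def index_join(outer_rows, inner_rows, inner_index, outer_key):
    """For each outer row, fetch only the inner rows the index points to."""
    for outer in outer_rows:
        for pos in inner_index.get(outer[outer_key], ()):
            yield {**inner_rows[pos], **outer}


def scan_join(outer_rows, inner_rows, outer_key, inner_key):
    """Brute-force baseline: scan every inner row for every outer row."""
    for outer in outer_rows:
        for inner in inner_rows:
            if inner[inner_key] == outer[outer_key]:
                yield {**inner, **outer}


# Hypothetical toy data; real TPC-H tables have many more columns and rows.
orders = [
    {"o_orderkey": 1, "o_status": "F"},
    {"o_orderkey": 2, "o_status": "O"},
    {"o_orderkey": 3, "o_status": "F"},
]
lineitem = [
    {"l_orderkey": 1, "l_qty": 5},
    {"l_orderkey": 2, "l_qty": 7},
    {"l_orderkey": 1, "l_qty": 2},
    {"l_orderkey": 3, "l_qty": 9},
]

# A highly selective predicate keeps a single order.
selective_orders = [o for o in orders if o["o_orderkey"] == 1]

lineitem_index = build_index(lineitem, "l_orderkey")
indexed_result = list(
    index_join(selective_orders, lineitem, lineitem_index, "o_orderkey")
)
scanned_result = list(
    scan_join(selective_orders, lineitem, "o_orderkey", "l_orderkey")
)
```

Both functions return the same rows; the difference is that `index_join` reads two `lineitem` rows here while `scan_join` reads all four. In a distributed engine the analogous saving would be in disk I/O and map tasks, which this in-memory sketch does not model.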