Fault Tolerant Hash Join for Distributed Systems

Computer Tools in Education. Pub Date: 2022-12-28. DOI: 10.32603/2071-2340-2022-4-68-82

Arsen R. Nasibullin


Citations: 0

Abstract


Fault Tolerant Hash Join for Distributed Systems

Enterprises increasingly move data processing and analytical applications from well-equipped mainframes with highly available hardware components to clusters of commodity computers. Commodity machines are less reliable than expensive mainframes, so applications deployed on commodity clusters must cope with frequent failures. These applications mostly run complex client queries that involve aggregation and join operations. The longer a query runs, the more likely it is to be hit by a failure, and a failure forces the entire query to be re-executed. This paper presents a fault-tolerant hash join (FTHJ) algorithm for distributed systems, implemented in Apache Ignite. FTHJ achieves fault tolerance by replicating data and materializing intermediate computations. To evaluate FTHJ, we also implemented a baseline hash join algorithm without fault tolerance. Experimental results show that FTHJ takes at least 30% less time to recover and complete the join operation when a failure occurs during execution. The paper also describes how we reached a compromise between minimizing the time spent on recovery tasks and the additional resources this requires.
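The idea of materializing intermediate computations can be illustrated with a small single-process sketch. It is not the paper's FTHJ implementation and does not use the Apache Ignite API; the function names, the `fail_once_on` parameter, and the in-memory `materialized` dictionary are all assumptions made for illustration. The sketch joins two inputs partition by partition, saves each finished partition, and after a simulated failure recomputes only the partition that was lost.

```python
from collections import defaultdict


def partition(rows, key, n):
    """Split rows into n buckets by the hash of the join key."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[hash(row[key]) % n].append(row)
    return buckets


def hash_join(build_rows, probe_rows, key):
    """Classic in-memory hash join: build a table on one side, probe with the other."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[key]].append(b)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]


def fault_tolerant_join(left, right, key, n_parts=4, fail_once_on=None):
    """Join partition by partition, materializing each finished partition.

    `fail_once_on` (hypothetical) is a set of partition ids that crash on
    their first attempt, standing in for a node failure.
    Returns (joined_rows, number_of_partition_attempts).
    """
    # Assumption: the partitioned inputs are replicated, so they survive a failure.
    left_parts = partition(left, key, n_parts)
    right_parts = partition(right, key, n_parts)
    pending_failures = set(fail_once_on or ())
    materialized = {}  # partition id -> finished join output (the saved result)
    attempts = 0
    while len(materialized) < n_parts:
        for pid in range(n_parts):
            if pid in materialized:
                continue  # recovery skips work that was already saved
            attempts += 1
            if pid in pending_failures:
                pending_failures.discard(pid)  # simulated crash; retried on the next pass
                continue
            materialized[pid] = hash_join(left_parts[pid], right_parts[pid], key)
    rows = [r for pid in range(n_parts) for r in materialized[pid]]
    return rows, attempts


if __name__ == "__main__":
    left = [{"id": i, "a": i * 10} for i in range(8)]
    right = [{"id": i % 4, "b": i} for i in range(6)]
    rows, attempts = fault_tolerant_join(left, right, "id", fail_once_on={2})
    print(len(rows), "rows joined in", attempts, "partition attempts")
```

With four partitions and one simulated failure, the sketch makes five partition attempts, whereas a join that keeps no intermediate results would have to restart all four partitions after the failure. The trade-off is the extra storage for the saved partition outputs and replicated inputs, which mirrors the compromise between recovery time and additional resources that the abstract mentions.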

