Performance analysis of write operations in identity and UUID ordered tables

Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics Pub Date : 1900-01-01 DOI:10.7862/RE.2020.6

Maciej Penar

{"title":"Performance analysis of write operations in identity and UUID ordered tables","authors":"Maciej Penar","doi":"10.7862/RE.2020.6","DOIUrl":null,"url":null,"abstract":"Design of the database includes the decision about the physical storage. This is often overlooked as 1) this cannot be expressed in standard SQL and in result each Database Systems have their own way to specify the physical storage and 2) the decision is often made implicitly. This is dangerous situation as many of the databases use B+ trees as table implementation which stores the data physically sorted by some ordering attribute. The choice of the ordering attribute largely affects read and write operations. Commonly, IDENTITY/AUTO_INCREMENT constraint are being chosen as ordering attributes, due to their easy usage and monotonic nature. In some cases ordering tables by the attributes whose values are drawn from uniform distribution leads to better performance in terms of Transactions-Per-Second. Such cases includes situation when data does fit entirely in-memory or when we can limit the set of physical pages being accessed. In the end, however, We cannot entirely say that either monotonic or random attributes are superior. Both have their pros and cons. In this article We present (1) short description of the data structures in contemporary Database Systems, (2) the advantages and the disadvantages of the two common types which are used as the clustering attributes: GUID and IDENTITY, (3) performance analysis of write operation which compare both data types using B+ tree as primary storage and (4) evaluate the efficiency of these bulk load operation using heap files and B+ trees.","PeriodicalId":409847,"journal":{"name":"Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics","volume":"82 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.7862/RE.2020.6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

Design of the database includes the decision about the physical storage. This is often overlooked as 1) this cannot be expressed in standard SQL and in result each Database Systems have their own way to specify the physical storage and 2) the decision is often made implicitly. This is dangerous situation as many of the databases use B+ trees as table implementation which stores the data physically sorted by some ordering attribute. The choice of the ordering attribute largely affects read and write operations. Commonly, IDENTITY/AUTO_INCREMENT constraint are being chosen as ordering attributes, due to their easy usage and monotonic nature. In some cases ordering tables by the attributes whose values are drawn from uniform distribution leads to better performance in terms of Transactions-Per-Second. Such cases includes situation when data does fit entirely in-memory or when we can limit the set of physical pages being accessed. In the end, however, We cannot entirely say that either monotonic or random attributes are superior. Both have their pros and cons. In this article We present (1) short description of the data structures in contemporary Database Systems, (2) the advantages and the disadvantages of the two common types which are used as the clustering attributes: GUID and IDENTITY, (3) performance analysis of write operation which compare both data types using B+ tree as primary storage and (4) evaluate the efficiency of these bulk load operation using heap files and B+ trees.

查看原文本刊更多论文

身份和UUID顺序表中写操作的性能分析

数据库的设计包括物理存储的选择。这一点经常被忽略，因为1)这不能用标准SQL表示，因此每个数据库系统都有自己的方式来指定物理存储，2)决策通常是隐式地做出的。这是一种危险的情况，因为许多数据库使用B+树作为表实现，它存储按某些排序属性进行物理排序的数据。排序属性的选择在很大程度上影响读和写操作。通常，选择IDENTITY/AUTO_INCREMENT约束作为排序属性，因为它们易于使用且单调。在某些情况下，根据属性(这些属性的值来自统一分布)对表进行排序，可以在每秒事务数方面获得更好的性能。这种情况包括数据完全适合内存或我们可以限制访问的物理页面集的情况。然而，最后，我们不能完全说单调属性或随机属性孰优孰劣。两者都有各自的优点和缺点。在本文中，我们呈现(1)对当代数据库系统中的数据结构的简要描述，(2)用作集群属性的两种常见类型:GUID和IDENTITY的优缺点，(3)使用B+树作为主存储比较两种数据类型的写操作的性能分析，以及(4)使用堆文件和B+树评估这些大容量负载操作的效率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics

自引率

0.00%

发文量