Performance analysis of write operations in identity and UUID ordered tables

Maciej Penar
{"title":"Performance analysis of write operations in identity and UUID ordered tables","authors":"Maciej Penar","doi":"10.7862/RE.2020.6","DOIUrl":null,"url":null,"abstract":"Design of the database includes the decision about the physical storage. This is often overlooked as 1) this cannot be expressed in standard SQL and in result each Database Systems have their own way to specify the physical storage and 2) the decision is often made implicitly. This is dangerous situation as many of the databases use B+ trees as table implementation which stores the data physically sorted by some ordering attribute. The choice of the ordering attribute largely affects read and write operations. Commonly, IDENTITY/AUTO_INCREMENT constraint are being chosen as ordering attributes, due to their easy usage and monotonic nature. In some cases ordering tables by the attributes whose values are drawn from uniform distribution leads to better performance in terms of Transactions-Per-Second. Such cases includes situation when data does fit entirely in-memory or when we can limit the set of physical pages being accessed. In the end, however, We cannot entirely say that either monotonic or random attributes are superior. Both have their pros and cons. In this article We present (1) short description of the data structures in contemporary Database Systems, (2) the advantages and the disadvantages of the two common types which are used as the clustering attributes: GUID and IDENTITY, (3) performance analysis of write operation which compare both data types using B+ tree as primary storage and (4) evaluate the efficiency of these bulk load operation using heap files and B+ trees.","PeriodicalId":409847,"journal":{"name":"Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics","volume":"82 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific Journals of Rzeszów University of Technology, Series: Electrotechnics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.7862/RE.2020.6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Design of the database includes the decision about the physical storage. This is often overlooked as 1) this cannot be expressed in standard SQL and in result each Database Systems have their own way to specify the physical storage and 2) the decision is often made implicitly. This is dangerous situation as many of the databases use B+ trees as table implementation which stores the data physically sorted by some ordering attribute. The choice of the ordering attribute largely affects read and write operations. Commonly, IDENTITY/AUTO_INCREMENT constraint are being chosen as ordering attributes, due to their easy usage and monotonic nature. In some cases ordering tables by the attributes whose values are drawn from uniform distribution leads to better performance in terms of Transactions-Per-Second. Such cases includes situation when data does fit entirely in-memory or when we can limit the set of physical pages being accessed. In the end, however, We cannot entirely say that either monotonic or random attributes are superior. Both have their pros and cons. In this article We present (1) short description of the data structures in contemporary Database Systems, (2) the advantages and the disadvantages of the two common types which are used as the clustering attributes: GUID and IDENTITY, (3) performance analysis of write operation which compare both data types using B+ tree as primary storage and (4) evaluate the efficiency of these bulk load operation using heap files and B+ trees.
身份和UUID顺序表中写操作的性能分析
数据库的设计包括物理存储的选择。这一点经常被忽略,因为1)这不能用标准SQL表示,因此每个数据库系统都有自己的方式来指定物理存储,2)决策通常是隐式地做出的。这是一种危险的情况,因为许多数据库使用B+树作为表实现,它存储按某些排序属性进行物理排序的数据。排序属性的选择在很大程度上影响读和写操作。通常,选择IDENTITY/AUTO_INCREMENT约束作为排序属性,因为它们易于使用且单调。在某些情况下,根据属性(这些属性的值来自统一分布)对表进行排序,可以在每秒事务数方面获得更好的性能。这种情况包括数据完全适合内存或我们可以限制访问的物理页面集的情况。然而,最后,我们不能完全说单调属性或随机属性孰优孰劣。两者都有各自的优点和缺点。在本文中,我们呈现(1)对当代数据库系统中的数据结构的简要描述,(2)用作集群属性的两种常见类型:GUID和IDENTITY的优缺点,(3)使用B+树作为主存储比较两种数据类型的写操作的性能分析,以及(4)使用堆文件和B+树评估这些大容量负载操作的效率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信