Performance analysis of data sharing environments

A. Dan
ACM distinguished dissertations, published 1992-11-03
DOI: 10.7551/mitpress/5299.001.0001 (https://doi.org/10.7551/mitpress/5299.001.0001)
Citations: 20

Abstract

A data sharing environment consists of multiple loosely coupled transaction processing nodes sharing a common database at the disk level. Apart from the private buffers in each node, the environment may contain an additional global shared buffer in the form of a disk cache, file server cache or intermediate shared memory. In this dissertation, we develop a comprehensive analytical model for such a complex environment using a hierarchical approach, where the concurrency control, the CPU queueing discipline and the buffer hit probabilities of the private and shared buffers are first modeled separately, and then integrated through an iterative procedure. To this end, we develop two new submodels: (1) the private buffer model, which captures the effects of multi-system buffer invalidation, skewed database access, the LRU buffer replacement policy and rerun transactions, and (2) the shared buffer modeling framework, which captures the effects of dependence between the contents of the private and shared buffers, and is used to analyze various shared buffer management policies (SBMPs) proposed in this dissertation. The various policies propagate a granule into the shared buffer after one or more of the following events: database update, shared buffer miss and private buffer replacement.

The analytical model is then used to investigate various issues in the design of data sharing environments.

Scalability. The model predicts degradation in transaction response time as new nodes are added to the system.

Buffer utilization. The model predicts the effectiveness of additional buffer allocation for both the private and shared buffers.

Skewed access. Skewed access increases both data contention and buffer hit probability in the system. The resultant effect on the transaction response time is investigated. The response time is found to be more sensitive to skewed data access under two-phase locking (2PL) than under the optimistic concurrency control (OCC) protocol. Skewed access also magnifies the effect of invalidation and reduces the utilization of private buffers.

Policy selection. The modeling framework is used to select the best SBMP for a given parameter range (private and shared buffer sizes, shared buffer access overhead and delay, number of nodes, database access pattern, update probabilities, etc.). Updates should always be propagated to the shared buffer to alleviate the invalidation problem. For a smaller number of nodes, the effect of dependence between the contents of the private and shared buffers influences policy selection.

Optimal configuration. The model can be used to optimally allocate the buffer between the private and shared buffers in various system architectures, depending on the overhead and delay in accessing the shared buffer. For a larger number of nodes and under skewed database access, the shared buffer can improve the transaction response time significantly.
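
The interaction of skewed access, LRU replacement and multi-system invalidation described in the abstract can be explored with a small simulation. The sketch below is not the dissertation's analytical model. It is a Monte Carlo illustration under assumptions of our own: a two-class hot/cold access pattern, every access equally likely to come from any node, and an update by one node dropping that granule from every other node's private buffer while the updater keeps its copy. The function name and all parameter names are hypothetical.

```python
import random
from collections import OrderedDict


def simulate_private_buffer(num_granules=1000, buffer_size=100,
                            hot_fraction=0.2, hot_access_prob=0.8,
                            num_nodes=1, update_prob=0.0,
                            num_accesses=50_000, seed=0):
    """Estimate by simulation the hit ratio of node 0's private LRU buffer.

    Assumed workload (not taken from the source): a fraction `hot_fraction`
    of the granules receives a fraction `hot_access_prob` of the accesses.
    """
    if num_granules < 2 or not 0.0 < hot_fraction < 1.0:
        raise ValueError("need num_granules >= 2 and 0 < hot_fraction < 1")
    rng = random.Random(seed)
    # Keep at least one granule in each class so both ranges are non-empty.
    hot_count = min(num_granules - 1, max(1, int(num_granules * hot_fraction)))
    # One LRU buffer per node; the OrderedDict order is the recency order.
    buffers = [OrderedDict() for _ in range(num_nodes)]
    hits = 0
    total = 0
    for _ in range(num_accesses):
        node = rng.randrange(num_nodes)
        if rng.random() < hot_access_prob:
            granule = rng.randrange(hot_count)                # hot set
        else:
            granule = rng.randrange(hot_count, num_granules)  # cold set
        buf = buffers[node]
        hit = granule in buf
        if hit:
            buf.move_to_end(granule)       # mark as most recently used
        else:
            buf[granule] = True            # fetch into the private buffer
            if len(buf) > buffer_size:
                buf.popitem(last=False)    # evict the least recently used
        if node == 0:
            total += 1
            hits += hit
        # Assumed invalidation rule: an update removes stale copies from
        # every other node's private buffer.
        if rng.random() < update_prob:
            for other, other_buf in enumerate(buffers):
                if other != node:
                    other_buf.pop(granule, None)
    return hits / total


if __name__ == "__main__":
    # hot_fraction=0.5 with hot_access_prob=0.5 gives uniform access.
    print("uniform:", simulate_private_buffer(hot_fraction=0.5,
                                              hot_access_prob=0.5))
    print("skewed: ", simulate_private_buffer())
    print("skewed, 4 nodes, updates:",
          simulate_private_buffer(buffer_size=300, num_nodes=4,
                                  update_prob=0.5, num_accesses=100_000))
```

Comparing the uniform and skewed runs shows how skew raises the private buffer hit ratio. Rerunning the multi-node case with update_prob=0.0 isolates how much of that gain is lost to invalidation.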