Performance Evaluation in Grid Computing: A Modeling and Prediction Perspective

Hui Li
Published in: Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid '07), 2007-05-14
DOI: 10.1109/CCGRID.2007.84 (https://doi.org/10.1109/CCGRID.2007.84)
Citations: 18

Abstract

Experimental performance studies of computer systems, including Grids, require a deep understanding of their workload characteristics. The need arises from two important and closely related topics in performance evaluation: workload modeling and performance prediction. Both topics rely heavily on representative workload data and draw their tools from statistics and machine learning. Nevertheless, their goals and the nature of the research differ considerably. Workload modeling aims at building mathematical models that generate workloads for use in simulation-based performance evaluation studies. The generated workload should statistically resemble the original real-world data; marginal statistics and second-order properties such as autocorrelation and power spectrum are therefore important matching criteria. Performance prediction, on the other hand, aims to provide real-time forecasts of important performance metrics (such as application run time and queue wait time) that can support Grid scheduling decisions. From this perspective, both prediction accuracy and performance should be considered when evaluating candidate techniques. My PhD research focuses primarily on these two topics in space-shared, data-intensive Grid environments. Starting from a comprehensive workload analysis with emphasis on correlation structures and scaling behavior, several basic job arrival patterns, such as pseudo-periodicity and long-range dependence, are identified. Models are then proposed to capture these arrival patterns, and a complete workload model that includes run time is being investigated. The strong autocorrelations present in the run time and queue wait time series motivate research on performance prediction based on learning from historical data. Techniques based on an instance-based learning algorithm, along with several improvements, are proposed and empirically evaluated. Research plans are proposed to use the results of workload modeling and performance prediction in the evaluation of scheduling strategies in data-intensive Grid environments.
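To make two ideas from the abstract concrete, the following is a minimal sketch, not the author's method. It computes the sample autocorrelation of a series (one of the second-order matching criteria mentioned above) and predicts a run time with a plain k-nearest-neighbour rule, which is one common form of instance-based learning. The abstract does not say which instance-based algorithm, distance measure, or job attributes were used, so the function names, the squared Euclidean distance, and the feature tuples here are all illustrative assumptions.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag.

    Uses the usual estimator: lagged covariance divided by the
    total sum of squared deviations from the mean.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mu = mean(series)
    variance = sum((x - mu) ** 2 for x in series)
    if variance == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance = sum(
        (series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag)
    )
    return covariance / variance


def knn_predict_runtime(history, query, k=3):
    """Predict a run time as the mean over the k most similar past jobs.

    `history` is a list of (features, run_time) pairs. `features` and
    `query` are equal-length tuples of numbers; which job attributes they
    hold is a hypothetical choice, not something taken from the paper.
    """
    if not history:
        raise ValueError("history must not be empty")

    def squared_distance(features):
        # Assumed similarity measure: squared Euclidean distance.
        return sum((a - b) ** 2 for a, b in zip(features, query))

    nearest = sorted(history, key=lambda job: squared_distance(job[0]))[:k]
    return mean(run_time for _, run_time in nearest)
```

For example, `autocorrelation(run_times, 1)` on a recorded run time series gives the lag-1 correlation, and `knn_predict_runtime(history, (1, 6), k=2)` averages the run times of the two historical jobs closest to the query. A strongly positive autocorrelation at small lags is the kind of structure that makes learning from recent history plausible in the first place.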