Online characterization of buggy applications running on the cloud

Arnamoy Bhattacharyya, Harsh V. P. Singh, Seyed Ali Jokar Jandaghi, C. Amza
Published in: 2016 12th International Conference on Network and Service Management (CNSM), 2016-10-01
DOI: 10.1109/CNSM.2016.7818433 (https://doi.org/10.1109/CNSM.2016.7818433)
Citations: 1

Abstract

As Cloud platforms become more popular, efficient resource management helps Cloud providers deliver better quality of service to their customers. In this paper, we present an online characterization method that identifies potentially failing jobs in a Cloud platform by analyzing each job's resource usage profile as the job runs. We show that, by tracking resource consumption online, we can build a model that predicts whether a job will terminate abnormally. We further show, using both real-world and synthetic data, that our online tool can raise alarms as early as the first 1/8th of a potentially failing job's lifetime, with a false negative rate as low as 4%. These alarms can be used to implement any of the following resource-conserving Cloud management techniques: alerting clients early, de-prioritizing jobs that are likely to fail or assigning them less performant resources, and deploying or up-regulating diagnostic tools for potentially faulty jobs.
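The two figures quoted in the abstract (alarm earliness as a fraction of job lifetime, and false negative rate) can be made concrete with a small sketch. This is not the authors' tool or model: the `JobOutcome` record, the function names, and the sample data below are all hypothetical. The sketch assumes only the standard definitions, where the false negative rate is the share of failing jobs that never triggered an alarm, and earliness is the alarm time divided by the job's total lifetime.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class JobOutcome:
    """One finished job; all fields are hypothetical illustration data."""
    failed: bool                 # True if the job terminated abnormally
    lifetime_s: float            # total run time in seconds
    alarm_at_s: Optional[float]  # time of the first alarm, or None if none was raised


def false_negative_rate(jobs: Sequence[JobOutcome]) -> float:
    """Fraction of failing jobs for which no alarm was ever raised."""
    failing = [j for j in jobs if j.failed]
    if not failing:
        raise ValueError("no failing jobs to evaluate")
    missed = sum(1 for j in failing if j.alarm_at_s is None)
    return missed / len(failing)


def alarm_earliness(job: JobOutcome) -> Optional[float]:
    """Alarm time as a fraction of the job's lifetime (None if no alarm)."""
    if job.alarm_at_s is None:
        return None
    return job.alarm_at_s / job.lifetime_s


def raised_early(job: JobOutcome, fraction: float = 1 / 8) -> bool:
    """True if an alarm fired within the first `fraction` of the lifetime."""
    earliness = alarm_earliness(job)
    return earliness is not None and earliness <= fraction


# Made-up outcomes: three failing jobs (one missed) and one healthy job.
jobs = [
    JobOutcome(failed=True, lifetime_s=800.0, alarm_at_s=100.0),
    JobOutcome(failed=True, lifetime_s=400.0, alarm_at_s=None),
    JobOutcome(failed=False, lifetime_s=600.0, alarm_at_s=None),
    JobOutcome(failed=True, lifetime_s=1600.0, alarm_at_s=160.0),
]
fnr = false_negative_rate(jobs)
early_count = sum(1 for j in jobs if j.failed and raised_early(j))
print(f"false negative rate: {fnr:.2f}, early alarms: {early_count}")
```

In this made-up sample one of three failing jobs is missed, so the false negative rate is 1/3; the 4% reported in the abstract corresponds to missing roughly one failing job in twenty-five.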