Machine learning job failure analysis and prediction model for the cloud environment

IF 3.2 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
Harikrishna Bommala, Uma Maheswari V., Rajanikanth Aluvalu, Swapna Mudrakola
Journal: High-Confidence Computing, vol. 3, no. 4, Article 100165
Published: 2023-09-27
DOI: 10.1016/j.hcc.2023.100165
URL: https://www.sciencedirect.com/science/article/pii/S2667295223000636
Citations: 0

Abstract

Reliable and accessible cloud applications are essential for the future of ubiquitous computing, smart appliances, and electronic health. Owing to the vastness and diversity of the cloud, most cloud services, both physical and logical, have experienced failures. Using currently accessible traces, we assessed and characterized the behavior of successful and unsuccessful activities. We devised and implemented a method to forecast which jobs will fail. The proposed method optimizes the resource usage of cloud applications. We conducted an in-depth evaluation of the proposed model using the publicly available Google Cluster, Mustang, and Trinity traces. The traces were also fed into several different machine learning models to select the most reliable one. Our analysis shows that the model performs well in terms of accuracy, F1-score, and recall. Several measures, such as forecasting job failures, designing scheduling algorithms, modifying priority criteria, and restricting task resubmission, may increase cloud service dependability and availability.
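
The abstract reports accuracy, F1-score, and recall but gives no formulas or model details. As a minimal sketch of how those three metrics are usually computed for a binary "job failed / job succeeded" predictor, the following may help. The function name, the `Scores` type, and the toy labels are hypothetical illustrations and are not taken from the paper or its traces.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Scores:
    accuracy: float
    recall: float
    f1: float


def score_failure_predictions(actual: Sequence[bool],
                              predicted: Sequence[bool]) -> Scores:
    """Score a binary failure predictor; True means "the job failed".

    Hypothetical helper for illustration only, not the authors' code.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # Confusion-matrix counts, treating "failed" as the positive class.
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Scores(accuracy=accuracy, recall=recall, f1=f1)


if __name__ == "__main__":
    # Toy labels invented for this example (not from any trace).
    actual = [True, True, False, False, True]
    predicted = [True, False, False, True, True]
    print(score_failure_predictions(actual, predicted))
```

Recall matters in this setting because a missed failure (a false negative) means a job that will fail still consumes resources, while accuracy alone can look high when failed jobs are rare in a trace.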

Source journal metrics: CiteScore 4.70; self-citation rate 0.00%; articles published: 0