Data analysis of cyber-activity within high performance computing environments
L. Ji, S. Kolhe, A. Clark
2017 IEEE 8th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (UEMCON), 2017-10-01
DOI: 10.1109/UEMCON.2017.8249003 (https://doi.org/10.1109/UEMCON.2017.8249003)
Citations: 2
Abstract
High performance computing (HPC) environments are becoming the norm for daily use. However, the resilience of these systems is questionable because their complex infrastructure makes it extremely difficult to troubleshoot both the location and the cause of failures. The same complexity makes HPC systems prone to malicious activity. This paper presents a data analysis framework for analyzing ranges of failure observations that result from malicious activity. Taking into account the internal reliability infrastructure, data network extrapolation is performed as a preprocessing step that accurately calculates the normalized failure rates. Next, nonlinear regression is performed on the spectrum of observations, taking into account magnitude, growth rate, and midpoint behavior. Additionally, influence analysis is performed to account for outlying observations. Empirical results from a supercomputing modeling and simulation framework show improved characterization performance, with approximately 91% of the nodes properly characterized. The results of this work can be applied to develop robust task-scheduling frameworks within supercomputing architectures.
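
The abstract does not state the regression model or the influence measure, so the following is only a minimal sketch under stated assumptions. It assumes the "magnitude, growth rate, and midpoint" parameters correspond to a three-parameter logistic curve, fits that curve with a damped Gauss-Newton least-squares loop, and uses a simple leave-one-out deletion diagnostic as a stand-in for the influence analysis. All function names, the synthetic data, and the injected outlier are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(z):
    """Overflow-safe logistic sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logistic(x, L, k, x0):
    """Assumed model: magnitude L, growth rate k, midpoint x0."""
    return L * sigmoid(k * (x - x0))

def sse(xs, ys, p):
    """Sum of squared residuals for parameters p = [L, k, x0]."""
    return sum((y - logistic(x, *p)) ** 2 for x, y in zip(xs, ys))

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, 3):
            f = M[r][c] / M[c][c]
            for j in range(c, 4):
                M[r][j] -= f * M[c][j]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(M[r][j] * sol[j] for j in range(r + 1, 3))
        sol[r] = (M[r][3] - tail) / M[r][r]
    return sol

def fit_logistic(xs, ys, iters=100):
    """Damped Gauss-Newton (Levenberg-Marquardt style) least-squares fit."""
    half = max(ys) / 2.0
    x_mid = min(zip(xs, ys), key=lambda t: abs(t[1] - half))[0]
    p = [1.1 * max(ys), 1.0, x_mid]  # crude starting guess
    lam = 1e-3                       # damping factor
    for _ in range(iters):
        L, k, x0 = p
        JTJ = [[0.0] * 3 for _ in range(3)]
        JTr = [0.0] * 3
        for x, y in zip(xs, ys):
            s = sigmoid(k * (x - x0))
            slope = L * s * (1.0 - s)
            grad = [s, slope * (x - x0), -slope * k]  # d/dL, d/dk, d/dx0
            r = y - L * s
            for i in range(3):
                JTr[i] += grad[i] * r
                for j in range(3):
                    JTJ[i][j] += grad[i] * grad[j]
        for i in range(3):
            JTJ[i][i] += lam
        step = solve3(JTJ, JTr)
        cand = [pi + si for pi, si in zip(p, step)]
        if sse(xs, ys, cand) < sse(xs, ys, p):
            p, lam = cand, lam / 10.0  # accept the step, relax damping
        else:
            lam *= 10.0                # reject the step, increase damping
    return p

def deletion_influence(xs, ys):
    """Drop in fitted SSE when each observation is left out in turn."""
    full = sse(xs, ys, fit_logistic(xs, ys))
    scores = []
    for i in range(len(xs)):
        xi, yi = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        scores.append(full - sse(xi, yi, fit_logistic(xi, yi)))
    return scores

# Hypothetical normalized failure rates over 11 observation steps.
xs = [float(t) for t in range(11)]
clean = [logistic(x, 0.8, 1.2, 5.0) for x in xs]
L_hat, k_hat, x0_hat = fit_logistic(xs, clean)

# Inject one outlying observation and look for the most influential point.
noisy = clean[:]
noisy[2] = 0.5
scores = deletion_influence(xs, noisy)
suspect = scores.index(max(scores))
```

On this noise-free synthetic series the fit should recover the generating parameters, and the deletion scores should single out the injected point at index 2. Real failure data would need a more careful starting guess and a principled influence statistic.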