{"title":"谷歌数据中心数据分析","authors":"P. Minet, É. Renault, I. Khoufi, S. Boumerdassi","doi":"10.1109/CCGRID.2018.00049","DOIUrl":null,"url":null,"abstract":"Data collected from an operational Google data center during 29 days represent a very rich and very useful source of information for understanding the main features of a data center. In this paper, we highlight the strong heterogeneity of jobs. The distribution of job execution duration shows a high disparity, as well as the job waiting time before being scheduled. The resource requests in terms of CPU and memory are also analyzed. The knowledge of all these features is needed to design models of jobs, machines and resource requests that are representative of a real data center.","PeriodicalId":321027,"journal":{"name":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"14 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Data Analysis of a Google Data Center\",\"authors\":\"P. Minet, É. Renault, I. Khoufi, S. Boumerdassi\",\"doi\":\"10.1109/CCGRID.2018.00049\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Data collected from an operational Google data center during 29 days represent a very rich and very useful source of information for understanding the main features of a data center. In this paper, we highlight the strong heterogeneity of jobs. The distribution of job execution duration shows a high disparity, as well as the job waiting time before being scheduled. The resource requests in terms of CPU and memory are also analyzed. The knowledge of all these features is needed to design models of jobs, machines and resource requests that are representative of a real data center.\",\"PeriodicalId\":321027,\"journal\":{\"name\":\"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)\",\"volume\":\"14 2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CCGRID.2018.00049\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGRID.2018.00049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Data collected from an operational Google data center during 29 days represent a very rich and very useful source of information for understanding the main features of a data center. In this paper, we highlight the strong heterogeneity of jobs. The distribution of job execution duration shows a high disparity, as well as the job waiting time before being scheduled. The resource requests in terms of CPU and memory are also analyzed. The knowledge of all these features is needed to design models of jobs, machines and resource requests that are representative of a real data center.