N. Chawla, D. Thain, Ryan Lichtenwalter, David A. Cieslak
{"title":"数据挖掘是对网格的挖掘","authors":"N. Chawla, D. Thain, Ryan Lichtenwalter, David A. Cieslak","doi":"10.1109/IPDPS.2008.4536427","DOIUrl":null,"url":null,"abstract":"Both users and administrators of computing grids are presented with enormous challenges in debugging and troubleshooting. Diagnosing a problem with one application on one machine is hard enough, but diagnosing problems in workloads of millions of jobs running on thousands of machines is a problem of a new order of magnitude. Suppose that a user submits one million jobs to a grid, only to discover some time later that half of them have failed, Users of large scale systems need tools that describe the overall situation, indicating what problems are commonplace versus occasional, and which are deterministic versus random. Machine learning techniques can be used to debug these kinds of problems in large scale systems. We present a comprehensive framework from data to knowledge discovery as an important step towards achieving this vision.","PeriodicalId":162608,"journal":{"name":"2008 IEEE International Symposium on Parallel and Distributed Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2008-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Data mining on the grid for the grid\",\"authors\":\"N. Chawla, D. Thain, Ryan Lichtenwalter, David A. Cieslak\",\"doi\":\"10.1109/IPDPS.2008.4536427\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Both users and administrators of computing grids are presented with enormous challenges in debugging and troubleshooting. Diagnosing a problem with one application on one machine is hard enough, but diagnosing problems in workloads of millions of jobs running on thousands of machines is a problem of a new order of magnitude. Suppose that a user submits one million jobs to a grid, only to discover some time later that half of them have failed, Users of large scale systems need tools that describe the overall situation, indicating what problems are commonplace versus occasional, and which are deterministic versus random. Machine learning techniques can be used to debug these kinds of problems in large scale systems. We present a comprehensive framework from data to knowledge discovery as an important step towards achieving this vision.\",\"PeriodicalId\":162608,\"journal\":{\"name\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2008 IEEE International Symposium on Parallel and Distributed Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPS.2008.4536427\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2008 IEEE International Symposium on Parallel and Distributed Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS.2008.4536427","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Both users and administrators of computing grids are presented with enormous challenges in debugging and troubleshooting. Diagnosing a problem with one application on one machine is hard enough, but diagnosing problems in workloads of millions of jobs running on thousands of machines is a problem of a new order of magnitude. Suppose that a user submits one million jobs to a grid, only to discover some time later that half of them have failed, Users of large scale systems need tools that describe the overall situation, indicating what problems are commonplace versus occasional, and which are deterministic versus random. Machine learning techniques can be used to debug these kinds of problems in large scale systems. We present a comprehensive framework from data to knowledge discovery as an important step towards achieving this vision.