D. Abramson
FTS 2016 Workshop Keynote Speech
Debugging software has always been difficult, with little tool support available. Finding faults in parallel programs is even harder because the machines and problems are so large that the amount of state to be examined becomes prohibitive. Faults are often introduced when codes are modified, when the software or hardware environment changes, or when codes are scaled up to solve larger problems. All too often we hear programmers scream “It's not my fault!” Over the years we have developed a technique called “Relative Debugging”, in which a code is debugged against another, reference, version. This makes the process simpler because programmers can compare the state of computation between a faulty version and a previous version that is known to be correct, so they do not need a mental model of what the program state should be. However, relative debugging can also be expensive because it needs to compare large data structures across the machine. Parallel computers offer a way of accelerating the comparisons using parallel algorithms, making the technique practical. In this talk I will introduce relative debugging, show how it assists testing and debugging, and discuss the various techniques used to scale it up to very large problems and machines.

Bio: Professor David Abramson has been involved in computer architecture and high-performance computing research since 1979. He has held appointments at Griffith University, CSIRO, RMIT and Monash University. At CSIRO he was the program leader of the Division of Information Technology High Performance Computing Program, and was also an adjunct Associate Professor at RMIT in Melbourne. He served as a program manager and chief investigator in the Co-operative Research Centre for Intelligent Decisions Systems and the Co-operative Research Centre for Enterprise Distributed Systems. He was the Director of the Monash e-Education Centre and a Professor of Computer Science in the Faculty of Information Technology at Monash University.
Abramson is currently the Director of the Research Computing Centre at the University of Queensland. He is a fellow of the Association for Computing Machinery (ACM), the Academy of Science and Technological Engineering (ATSE) and the Australian Computer Society (ACS), and a Senior Member of the IEEE.

2016 IEEE International Conference on Cluster Computing. © 2016 IEEE. DOI 10.1109/CLUSTER.2016.98
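
The core idea of relative debugging described in the abstract, comparing the state of a suspect code against a reference version at matching points, can be illustrated with a small sketch. This is not the speaker's tool or its interface: the function names (compare_chunk, compare_state, first_divergence), the checkpoint traces, and the tolerance are all assumptions made for illustration. The chunked comparison uses a thread pool only as a stand-in for the parallel comparison algorithms the abstract mentions; in CPython, threads will not actually speed up this kind of work.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def compare_chunk(task):
    """Compare one slice of two arrays; return the global indices that differ."""
    start, ref_chunk, sus_chunk, tol = task
    return [
        start + i
        for i, (r, s) in enumerate(zip(ref_chunk, sus_chunk))
        if not math.isclose(r, s, rel_tol=tol, abs_tol=tol)
    ]


def compare_state(ref, sus, tol=1e-9, chunk=4, workers=2):
    """Split both arrays into chunks and compare the chunks concurrently."""
    if len(ref) != len(sus):
        raise ValueError("state arrays have different lengths")
    tasks = [
        (i, ref[i:i + chunk], sus[i:i + chunk], tol)
        for i in range(0, len(ref), chunk)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(compare_chunk, tasks)  # results come back in task order
    return [idx for part in parts for idx in part]


def first_divergence(ref_trace, sus_trace, tol=1e-9):
    """Return (checkpoint, differing indices) for the first mismatch, else None."""
    for (ref_name, ref), (sus_name, sus) in zip(ref_trace, sus_trace):
        if ref_name != sus_name:
            raise ValueError("checkpoint names do not line up")
        diffs = compare_state(ref, sus, tol)
        if diffs:
            return ref_name, diffs
    return None


# Hypothetical traces: state captured at the same named points in both versions.
reference_trace = [
    ("init", [1.0, 2.0, 3.0, 4.0, 5.0]),
    ("scaled", [2.0, 4.0, 6.0, 8.0, 10.0]),
]
suspect_trace = [
    ("init", [1.0, 2.0, 3.0, 4.0, 5.0]),
    ("scaled", [2.0, 4.0, 6.0, 8.0, 5.0]),  # last element was never scaled
]

if __name__ == "__main__":
    print(first_divergence(reference_trace, suspect_trace))
```

Here the comparison points the programmer to the first checkpoint and array indices where the two versions disagree, without requiring them to know in advance what the correct values should be.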