Teaching Site Reliability Engineering as a Computer Science Elective
M. Dickerson, Tzu-Yi Chen
Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1, 2023-03-02
https://doi.org/10.1145/3545945.3569809
Much of the previous work on preparing undergraduates for industry focuses on software engineering and the skills needed to design and implement new software systems. Relatively little attention has been given to the skills needed to maintain, modify, and repair systems already in use. These skills are captured in the emerging discipline of site reliability engineering, a relative of software reliability engineering. Site reliability engineers use a distinct set of skills, tools, and techniques for managing complex production systems. More importantly, they have a mindset that prioritizes high performance and reliability while attempting to minimize repetitive tasks done by human operators. In this paper we describe an upper-division elective designed to introduce students to site reliability engineering through hands-on assignments that require teams to deploy, maintain, and scale a working software system, paired with readings and discussion of high-stakes episodes from the broader history of complex systems. We discuss the design of the class and reflect on what worked well and what did not in the initial offerings.