Isopod: An Expressive DSL for Kubernetes Configuration

Charles Xu, Dmitry Ilyevskiy
Proceedings of the ACM Symposium on Cloud Computing (SoCC). Published 2019-11-20. DOI: 10.1145/3357223.3365759 (https://doi.org/10.1145/3357223.3365759)
Citations: 1

Abstract

Kubernetes is an open-source cluster orchestration system for containerized workloads that reduces idiosyncrasy across cloud vendors [2]. Using Kubernetes, Cruise has built a multi-tenant platform with thousands of cores and tens of terabytes of memory. Such scale is possible in part thanks to the declarative abstraction of Kubernetes, where desired states are described in YAML manifests [5]. However, YAML is a data serialization format and is ill-suited to workload specification. Structured data in YAML is untyped and prone to wrong indentation and missing fields. Because YAML has poor meta-programming support, composing it with control logic (loops and branches) leads to fragmented YAML and manual indentation tracking (example at bit.ly/yml-hell). Moreover, YAML manifests are often generated by filling a shared template with cluster-specific parameters; for example, the image tag and the replica count might differ between development and production environments. Existing templating tools (Helm [11], Kustomize [9], Kapitan [7], and the like) assume these parameters are statically known and use CLIs to query dynamic ones, such as secrets stored in HashiCorp Vault [10]. Such a scheme is hard to test, since side effects escape through the CLIs, and it depends heavily on the execution environment, since CLI versions vary across machines and a CLI might not be installed at all. Not least, YAML manifests describe the eventual state but not how existing workloads will be affected. Blindly applying a manifest, for example from a stale version of the code, can be disastrous and cause unexpected outages. Isopod presents an alternative configuration paradigm that treats Kubernetes objects as first-class citizens. Without intermediate YAML artifacts, Isopod renders Kubernetes objects directly in Protocol Buffers [8], so they are strongly typed and consumed directly by the Kubernetes API. With Isopod, configurations are scripted in Starlark [3], a Python dialect by Google that is also used by the Bazel [1] and Buck [4] build systems.
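To make the contrast with templated YAML concrete, here is a minimal sketch in plain Python (not Starlark, and not Isopod's actual API) of configuration written as typed objects with ordinary control flow. The `Container` and `Deployment` classes, the `render` function, and the `example.com` image names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Container:
    name: str
    image: str


@dataclass(frozen=True)
class Deployment:
    name: str
    replicas: int
    containers: List[Container]
    labels: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject bad values when the object is built, not when it is applied.
        if not isinstance(self.replicas, int) or self.replicas < 1:
            raise TypeError("replicas must be a positive integer")
        if not self.containers:
            raise ValueError("a deployment needs at least one container")


def render(cluster: str, image_tag: str) -> Deployment:
    """Build the object for one cluster; branches and loops replace template fragments."""
    replicas = 3 if cluster == "prod" else 1
    sidecars = ["metrics"] if cluster == "prod" else []
    containers = [Container("app", f"example.com/app:{image_tag}")]
    for name in sidecars:
        containers.append(Container(name, f"example.com/{name}:stable"))
    return Deployment("app", replicas, containers, {"cluster": cluster})


dev = render("dev", "v2-rc1")
prod = render("prod", "v1")
```

A missing field is rejected by the constructor, and a wrongly typed replica count by the check in `__post_init__`, which is the kind of early feedback that an untyped template cannot give.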
To replace CLI dependencies, Isopod extends Starlark with runtime built-ins to access services and utilities such as Vault, the Kubernetes apiserver, a Base64 encoder, and a UUID generator. Isopod uses a separate runtime for unit tests that mocks all built-ins, providing test coverage that was not possible before. Isopod is also hermetic and secure. The common reliance on the kubeconfig file for cluster authentication leaks secrets to disk, which is a security risk when working from a shared host such as a cluster node or a CI/CD worker. Instead, Isopod builds OAuth2 tokens [6] for the target cluster using the Identity & Access Management (IAM) service of the cloud vendor. Application secrets are stored in Vault and queried at runtime. Hence, no secrets escape to disk. In fact, Isopod prohibits disk I/O except for loading Starlark modules from other scripts. No external libraries can be loaded unless explicitly implemented as an Isopod built-in. Distributed as a single binary, Isopod is self-contained with all its dependencies. Finally, Isopod is extensible. Protobuf packages of Kubernetes API groups added in the future can be loaded in the same way. Because built-ins are modular and pluggable, users can easily implement and register new built-ins with the Isopod runtime to support any Kubernetes vendor. Isopod offers many other features, such as object life cycle management and parallel rollout to multiple clusters, which is impossible when using kubeconfig. In dry-run mode, Isopod displays the intended actions of the current code change as a YAML diff against live objects in the cluster, to avoid unexpected configuration changes. Since adopting Isopod, the PaaS team at Cruise has migrated 14 applications and added another 16 without outage or regression, totaling around 10,000 lines of Starlark. The migration resulted in up to a 60% reduction in code size and 80% faster rollouts, owing to code reuse, cluster parallelism, and the removal of YAML intermediaries.
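The idea of routing side effects through built-ins that a test runtime can mock may be easier to see in a small sketch. The following is plain Python under stated assumptions, not Isopod's built-in interface: the `Vault` protocol, the `FakeVault` test double, the `render_secret` function, and the `secret/app/db` path are all hypothetical names chosen for illustration.

```python
import base64
from typing import Dict, List, Protocol


class Vault(Protocol):
    """The only capability the configuration code is allowed to use."""

    def read(self, path: str) -> Dict[str, str]: ...


class FakeVault:
    """Test double that serves canned secrets and records every lookup."""

    def __init__(self, secrets: Dict[str, Dict[str, str]]) -> None:
        self._secrets = secrets
        self.reads: List[str] = []

    def read(self, path: str) -> Dict[str, str]:
        self.reads.append(path)
        return self._secrets[path]


def render_secret(vault: Vault, name: str, path: str) -> dict:
    """Build a Secret-shaped dict; the only side effect goes through `vault`."""
    data = vault.read(path)
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in data.items()
    }
    return {"kind": "Secret", "metadata": {"name": name}, "data": encoded}


# In a unit test, the fake stands in for the real service.
fake = FakeVault({"secret/app/db": {"password": "hunter2"}})
secret = render_secret(fake, "db-credentials", "secret/app/db")
```

Because the configuration code never shells out to a CLI, a test can assert both on the rendered object and on exactly which secret paths were read.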
All unit tests finish in less than 10 seconds. Isopod is open source at github.com/cruise-automation/isopod.
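The dry-run mode mentioned in the abstract, which shows a diff of intended objects against live ones before anything is applied, can be approximated with the Python standard library. This is only a sketch: the `dry_run_diff` function and the sample objects are hypothetical, and the objects are rendered as JSON rather than YAML so that no third-party package is needed.

```python
import difflib
import json


def dry_run_diff(live: dict, intended: dict) -> str:
    """Return a unified diff of two objects; nothing is applied to a cluster."""
    before = json.dumps(live, indent=2, sort_keys=True).splitlines()
    after = json.dumps(intended, indent=2, sort_keys=True).splitlines()
    return "\n".join(
        difflib.unified_diff(before, after, "live", "intended", lineterm="")
    )


# Hypothetical live and intended states that differ only in the image tag.
live = {"kind": "Deployment", "spec": {"replicas": 3, "image": "app:v1"}}
intended = {"kind": "Deployment", "spec": {"replicas": 3, "image": "app:v2"}}

diff = dry_run_diff(live, intended)
print(diff)
```

Sorting keys before diffing keeps the output stable, so only real changes (here, the image tag) appear as removed and added lines, and an unchanged object produces an empty diff.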