{"title":"EMS®: A Massive Computational Experiment Management System towards Data-driven Robotics","authors":"Qinjie Lin, Guo Ye, Han Liu","doi":"10.1109/ICRA48891.2023.10160948","DOIUrl":null,"url":null,"abstract":"We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches in two magnitudes of orders (in terms of experimental high throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications.","PeriodicalId":360533,"journal":{"name":"2023 IEEE International Conference on Robotics and Automation (ICRA)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48891.2023.10160948","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches in two magnitudes of orders (in terms of experimental high throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications.