{"title":"EMS®:面向数据驱动机器人的大规模计算实验管理系统","authors":"Qinjie Lin, Guo Ye, Han Liu","doi":"10.1109/ICRA48891.2023.10160948","DOIUrl":null,"url":null,"abstract":"We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches in two magnitudes of orders (in terms of experimental high throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications.","PeriodicalId":360533,"journal":{"name":"2023 IEEE International Conference on Robotics and Automation (ICRA)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"EMS®: A Massive Computational Experiment Management System towards Data-driven Robotics\",\"authors\":\"Qinjie Lin, Guo Ye, Han Liu\",\"doi\":\"10.1109/ICRA48891.2023.10160948\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches in two magnitudes of orders (in terms of experimental high throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications.\",\"PeriodicalId\":360533,\"journal\":{\"name\":\"2023 IEEE International Conference on Robotics and Automation (ICRA)\",\"volume\":\"65 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-05-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE International Conference on Robotics and Automation (ICRA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICRA48891.2023.10160948\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48891.2023.10160948","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
EMS®: A Massive Computational Experiment Management System towards Data-driven Robotics
We propose EMS®, a cloud-enabled massive computational experiment management system supporting high-throughput computational robotics research. Compared to existing systems, EMS® features a sky-based pipeline orchestrator which allows us to exploit heterogeneous computing environments painlessly (e.g., on-premise clusters, public clouds, edge devices) to optimally deploy large-scale computational jobs (e.g., with more than millions of computational hours) in an integrated fashion. Cornerstoned on this sky-based pipeline orchestrator, this paper introduces three abstraction layers of the EMS® software architecture: (i) Configuration management layer focusing on automatically enumerating experimental configurations; (ii) Dependency management layer focusing on managing the complex task dependencies within each experimental configuration; (iii) Computation management layer focusing on optimally executing the computational tasks using the given computing resource. Such an architectural design greatly increases the scalability and reproducibility of data-driven robotics research leading to much-improved productivity. To demonstrate this point, we compare EMS® with more traditional approaches on an offline reinforcement learning problem for training mobile robots. Our results show that EMS® outperforms more traditional approaches in two magnitudes of orders (in terms of experimental high throughput and cost) with only several lines of code change. We also exploit EMS® to develop mobile robot, robot arm, and bipedal applications, demonstrating its applicability to numerous robot applications.