{"title":"使用DRL减少DNN推理延迟","authors":"Suhwan Kim, Sehun Jung, Hyang-Won Lee","doi":"10.1109/ICTC55196.2022.9952987","DOIUrl":null,"url":null,"abstract":"With the development of artificial intelligence (AI) technology, many applications are providing AI services. The key part of these AI services is the Deep Neural Networks(DNNs) requiring a lot of computation. However, it is usually time-consuming to provide an inference process on end devices that lack resources. Because of these limitations, distributed computing, which can perform large amounts of calculations using the processing power of various computers connected to the Internet, is emerging. We develop how to efficiently distribute DNN inference jobs in distributed computing environments and quickly process large amounts of DNN computations. In this paper, we will introduce the learning method and the results of the Deep Reinforcement Learning(DRL) model to reduce end-to-end latency by observing the state of the distributed computing environment and scheduling the DNN job using DRL.","PeriodicalId":441404,"journal":{"name":"2022 13th International Conference on Information and Communication Technology Convergence (ICTC)","volume":"15 3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reducing DNN inference latency using DRL\",\"authors\":\"Suhwan Kim, Sehun Jung, Hyang-Won Lee\",\"doi\":\"10.1109/ICTC55196.2022.9952987\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the development of artificial intelligence (AI) technology, many applications are providing AI services. The key part of these AI services is the Deep Neural Networks(DNNs) requiring a lot of computation. However, it is usually time-consuming to provide an inference process on end devices that lack resources. Because of these limitations, distributed computing, which can perform large amounts of calculations using the processing power of various computers connected to the Internet, is emerging. We develop how to efficiently distribute DNN inference jobs in distributed computing environments and quickly process large amounts of DNN computations. 
In this paper, we will introduce the learning method and the results of the Deep Reinforcement Learning(DRL) model to reduce end-to-end latency by observing the state of the distributed computing environment and scheduling the DNN job using DRL.\",\"PeriodicalId\":441404,\"journal\":{\"name\":\"2022 13th International Conference on Information and Communication Technology Convergence (ICTC)\",\"volume\":\"15 3 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 13th International Conference on Information and Communication Technology Convergence (ICTC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICTC55196.2022.9952987\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 13th International Conference on Information and Communication Technology Convergence (ICTC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTC55196.2022.9952987","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
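The abstract does not specify the model's state, action, or reward design, so the sketch below is only an illustrative reading of the approach it describes, not the authors' implementation. It assumes the state is each node's estimated completion delay for the incoming job, the action selects which node serves that job, and the reward is the negative end-to-end latency; a linear Q-learning agent stands in for the paper's DRL model, and the node speeds, link delays, and job sizes are invented for illustration.

```python
# Minimal sketch (assumptions only, not the paper's method): DRL-style scheduling of
# DNN inference jobs across distributed nodes, trained to reduce end-to-end latency.
import numpy as np

rng = np.random.default_rng(0)

NUM_NODES = 4
NODE_SPEED = np.array([1.0, 2.0, 4.0, 8.0])   # relative compute speed per node (assumed)
LINK_DELAY = np.array([0.05, 0.1, 0.2, 0.4])  # seconds to ship a job's input to each node (assumed)
DT = 0.25                                     # assumed interval between job arrivals (seconds)

class SchedulingEnv:
    """Toy distributed-inference environment: each node keeps a backlog of queued work."""
    def __init__(self):
        self.backlog = np.zeros(NUM_NODES)  # seconds of queued work per node

    def state(self, job_size):
        # Observation: estimated completion delay of the incoming job on each node.
        return LINK_DELAY + self.backlog + job_size / NODE_SPEED

    def step(self, action, job_size):
        # End-to-end latency = input transfer + waiting in queue + compute on the chosen node.
        latency = LINK_DELAY[action] + self.backlog[action] + job_size / NODE_SPEED[action]
        self.backlog[action] += job_size / NODE_SPEED[action]
        # Time advances by one interval; every node drains part of its backlog.
        self.backlog = np.maximum(self.backlog - DT, 0.0)
        return -latency  # reward: lower latency -> higher reward

# Linear Q-function: Q(s, a) = W[a] . s, one weight vector per node (action).
W = np.zeros((NUM_NODES, NUM_NODES))
ALPHA, GAMMA, EPSILON = 0.01, 0.9, 0.1

def choose(state):
    # Epsilon-greedy action selection over the learned Q-values.
    if rng.random() < EPSILON:
        return int(rng.integers(NUM_NODES))
    return int(np.argmax(W @ state))

env = SchedulingEnv()
job = rng.uniform(0.5, 2.0)          # compute demand of the first job (seconds on a speed-1 node)
s = env.state(job)
for t in range(20000):
    a = choose(s)
    r = env.step(a, job)
    job = rng.uniform(0.5, 2.0)      # next DNN inference job arrives
    s_next = env.state(job)
    # One-step TD update of the linear Q-function for the chosen action.
    td_target = r + GAMMA * np.max(W @ s_next)
    W[a] += ALPHA * (td_target - W[a] @ s) * s
    s = s_next

print("Q-values for a unit-sized job (higher = preferred node):", np.round(W @ env.state(1.0), 2))
```

Under these assumptions the agent learns to favor nodes whose combined transfer delay, queue backlog, and compute time is smallest, which is the behavior the abstract attributes to the DRL scheduler; the paper's actual state representation, network architecture, and training algorithm may differ.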