Zichong Li, Lan Zhang, Mu Yuan, Miao-Hui Song, Qianjun Song
2023 IEEE 39th International Conference on Data Engineering (ICDE), published 2023-04-01. DOI: https://doi.org/10.1109/ICDE55515.2023.00082
Efficient Deep Ensemble Inference via Query Difficulty-dependent Task Scheduling
Deep ensemble learning has been widely adopted to boost accuracy by combining outputs from multiple deep models prepared for the same task. However, the extra computation and memory cost it entails can cause an unacceptably high deadline miss rate in latency-sensitive tasks. Conventional approaches, including ensemble selection, focus on accuracy while ignoring deadline constraints, and thus cope poorly with bursty query traffic and with queries of varying difficulty. This paper explores redundancy in deep ensemble model inference and presents Schemble, a query difficulty-dependent task scheduling framework. Schemble treats the ensemble inference process as multiple base model inference tasks and schedules tasks for queries based on their difficulty and queuing status. We evaluate Schemble on real-world datasets, with an intelligent Q&A system, video analysis, and image retrieval as the running applications. Experimental results show that Schemble achieves a 5× lower deadline miss rate and improves accuracy by 30.8% under deadline constraints.
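
To make the idea of difficulty-dependent scheduling concrete, here is a minimal sketch of how a scheduler might decide how many base models to run for a query. It is an illustration only, not Schemble's actual algorithm: the `Query` fields, the `models_to_run` function, the difficulty score in [0, 1], the equal per-model latency, and the even split of the time budget across waiting queries are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Query:
    qid: int
    difficulty: float  # hypothetical score in [0, 1]; higher means harder
    deadline_ms: float  # time remaining before this query's deadline


def models_to_run(query: Query, queue_len: int, n_models: int,
                  per_model_ms: float) -> int:
    """Pick how many base models to run for one query (illustrative policy).

    Assumes base models run one after another on a single device and that
    every base model takes the same time, per_model_ms.
    """
    # Harder queries ask for a larger share of the ensemble.
    wanted = math.ceil(query.difficulty * n_models)
    # Assumed policy: share the remaining time evenly between this query
    # and the queue_len queries waiting behind it.
    budget_ms = query.deadline_ms / (queue_len + 1)
    affordable = int(budget_ms // per_model_ms)
    # Always run at least one model so that every query gets an answer.
    return max(1, min(wanted, affordable, n_models))


easy = Query(qid=1, difficulty=0.1, deadline_ms=100.0)
hard = Query(qid=2, difficulty=0.9, deadline_ms=100.0)

print(models_to_run(easy, queue_len=0, n_models=5, per_model_ms=10.0))
print(models_to_run(hard, queue_len=0, n_models=5, per_model_ms=10.0))
print(models_to_run(hard, queue_len=4, n_models=5, per_model_ms=10.0))
```

Under these assumptions, an easy query uses a single base model even when the queue is empty, while a hard query uses the full ensemble only when the queue is short enough to leave it time. A real system would also need to estimate difficulty before running the ensemble and to choose which base models to run, neither of which this sketch attempts.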