{"title":"Segment-Level FP-Scheduling in FreeRTOS","authors":"R. Edmaier, Niklas Ueter, Jian-Jia Chen","doi":"10.1109/RTCSA55878.2022.00026","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00026","url":null,"abstract":"In the domain of embedded systems, modern SoCs (System-on-Chips) increasingly employ dedicated hardware to improve the performance of specialized tasks. The resulting performance benefits come at the cost of increased complexity in coordinating multiple tasks that access these hardware units in varying, alternating sequences. For example, a task may first execute on a processor and then continue execution on a GPU. This problem is even more complex under real-time constraints, i.e., when execution must complete within formally guaranteed time bounds. Real-time constraints may lead to severe resource under-utilization if the scheduling algorithms are not properly designed. A solution to this problem is self-suspension and segment-level fixed-priority scheduling. In this approach, tasks are divided into successive alternating segments of computation and self-suspension. The task may self-suspend if it tries to access a hardware resource that is already held by another task. In this paper, we propose and discuss different implementations of the segmented self-suspension task model in the FreeRTOS real-time operating system. Moreover, we evaluate the overhead of the different implementations on the OM40007 IoT-module from NXP.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"169 1","pages":"186-194"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85169270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Welcome Message from the RTCSA 2022 Chairs","authors":"","doi":"10.1109/rtcsa55878.2022.00005","DOIUrl":"https://doi.org/10.1109/rtcsa55878.2022.00005","url":null,"abstract":"","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"1 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83892243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Binary Equilibrium for Efficient LDPC Decoding in 3D NAND Flash","authors":"Hsiang-Sen Hsu, Li-Pin Chang","doi":"10.1109/RTCSA55878.2022.00018","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00018","url":null,"abstract":"3D NAND flash is prone to bit errors due to severe charge leakage. Modern SSDs adopt LDPC for bit error management, but LDPC can incur a high read latency through iterative adjustment to the reference voltage. Bit scrambling helps reduce inter-cell interference, and with it, ones and zeros equally contribute to raw data. We observed that as bit errors develop, the 0-bit ratio in raw data deviates from 50%. Inspired by this property, we propose a method for fast adjustment to the reference voltage, involving a placement step and a fine-tuning step. Our method uses only a few hundred bytes of RAM but improves the average read latency over existing methods by up to 24%.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"94 1","pages":"113-119"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91342144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling Real-time AI Inference on Mobile Devices via GPU-CPU Collaborative Execution","authors":"Hao Li, J. Ng, T. Abdelzaher","doi":"10.1109/RTCSA55878.2022.00027","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00027","url":null,"abstract":"AI-powered mobile applications are becoming increasingly popular due to recent advances in machine intelligence. They include, but are not limited to, mobile sensing, virtual assistants, and augmented reality. Mobile AI models, especially Deep Neural Networks (DNN), are usually executed locally, as sensory data are collected and generated by end devices. This imposes a heavy computational burden on resource-constrained mobile phones. There is usually a set of DNN jobs with deadline constraints waiting for execution. Existing AI inference frameworks process incoming DNN jobs in sequential order, which does not optimally support mobile users’ real-time interactions with AI services. In this paper, we propose a framework to achieve real-time inference by exploiting heterogeneous mobile SoCs, which contain a CPU and a GPU. Considering the characteristics of DNN models, we optimally partition the execution between the mobile GPU and CPU. We present a dynamic programming-based approach to solve the formulated real-time DNN partitioning and scheduling problem. The proposed framework has several desirable properties: 1) computational resources on mobile devices are better utilized; 2) it optimizes inference performance in terms of deadline miss rate; 3) no sacrifices in inference accuracy are made. Evaluation results on an off-the-shelf mobile phone show that our proposed framework can provide better real-time support for AI inference tasks on mobile platforms, compared to several baselines.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"29 1","pages":"195-204"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85137252","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QoS Guaranteed Resource Allocation for Coexisting eMBB and URLLC Traffic in 5G Industrial Networks","authors":"Dawei Shen, Tianyu Zhang, Jiachen Wang, Qingxu Deng, Song Han, X. Hu","doi":"10.1109/RTCSA55878.2022.00015","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00015","url":null,"abstract":"The fifth-generation (5G) cellular networks are increasingly considered for industrial applications, such as factory automation systems. In 5G networks, Enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communication (URLLC) are two essential services. eMBB services require high data rates with some lower bounds, while URLLC traffic is subject to strict latency and reliability requirements. Existing approaches to scheduling coexisting eMBB and URLLC traffic all assume that URLLC traffic preempts eMBB traffic immediately upon arrival, which can adversely impact the achievable eMBB data rates. Furthermore, none of the prior work considers guaranteeing minimum data rate requirements imposed on certain eMBB traffic. This paper proposes a new model to capture the URLLC and eMBB requirements and introduces a novel framework, QoSG-RA, to perform network resource allocation for coexisting eMBB and URLLC traffic. QoSG-RA builds on a hybrid offline/online approach, which performs offline resource allocation to ensure that the Quality of Service (QoS) requirements of eMBB and URLLC traffic are satisfied and online resource allocation to maximize fairness in the data rates among eMBB traffic based on runtime information. QoSG-RA is able to (i) meet latency and reliability requirements of URLLC traffic, and (ii) maximize the data rates for eMBB traffic in a fair way while fulfilling their minimum data rate requirements. Experimental results demonstrate the effectiveness of QoSG-RA compared to the state-of-the-art.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"46 1","pages":"81-90"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82752680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IPDeN: Real-Time deflection-based NoC with in-order flits delivery","authors":"Yilian Ribot González, Geoffrey Nelissen, E. Tovar","doi":"10.1109/RTCSA55878.2022.00023","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00023","url":null,"abstract":"In deflection-based Network-on-Chips (NoC), when several flits entering a router contend for the same output port, one of the flits is routed to the desired output and the others are deflected to alternative outputs. The approach reduces power consumption and silicon footprint in comparison to virtual-channel (VC) based solutions. However, due to the non-deterministic number of deflections that flits may suffer while traversing the network, flits may be received out of order at their destinations. In this work, we present IPDeN, a novel deflection-based NoC that ensures in-order flit delivery. To avoid the use of costly reordering mechanisms at the destination of each communication flow, we propose a solution based on a single small buffer added to each router to prevent flits from overtaking other flits belonging to the same communication flow. We also develop a worst-case traversal time (WCTT) analysis for packets transmitted over IPDeN. We implemented IPDeN in Verilog and synthesized it for an FPGA platform. We show that a router of IPDeN requires ≈3 times fewer hardware resources than routers that use VCs. Experimental results show that the worst-case and average packet communication times are reduced in comparison to the state-of-the-art.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"20 1","pages":"160-169"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81276467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Performance Acceleration of Secure Machine Learning Computations for Edge Applications","authors":"Zi-Jie Lin, Chuan-Chi Wang, Chia-Heng Tu, Shih-Hao Hung","doi":"10.1109/RTCSA55878.2022.00021","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00021","url":null,"abstract":"Edge appliances built with machine learning applications have been gradually adopted in a wide variety of application fields, such as intelligent transportation, the banking industry, and medical diagnosis. Privacy-preserving computation approaches can be used on smart appliances in order to secure the privacy of sensitive data, including application data and the parameters of machine learning models. Nevertheless, data privacy is achieved at the cost of execution time. That is, the execution speed of a secure machine learning application is several orders of magnitude slower than that of the application in plaintext. The performance gap is even larger on edge appliances. In this work, in order to improve the execution efficiency of secure applications, we target CrypTen, an open-source software framework that is widely used for building secure machine learning applications using the Secure Multi-Party Computation (SMPC) based privacy-preserving computation approach. We analyze the performance characteristics of the secure machine learning applications built with CrypTen, and the analysis reveals that the communication overhead hinders the execution of the secure applications. To tackle the issue, a communication library, OpenMPI, is added to the CrypTen framework as a new communication backend to boost the application performance by up to 50%. We further develop a hybrid communication scheme by combining the OpenMPI backend with the original communication backend of the CrypTen framework. The experimental results show that the enhanced CrypTen framework is able to provide better performance for small-size data (up to 50% speedup for LeNet5 on the MNIST dataset) and maintain similar performance for large-size data (AlexNet on CIFAR-10), compared to the original CrypTen framework.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"83 1","pages":"138-147"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83044458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Anytime-Lidar: Deadline-aware 3D Object Detection","authors":"Ahmet Soyyigit, Shuochao Yao, H. Yun","doi":"10.1109/RTCSA55878.2022.00010","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00010","url":null,"abstract":"In this work, we present a novel scheduling framework enabling anytime perception for deep neural network (DNN) based 3D object detection pipelines. We focus on the computationally expensive region proposal network (RPN) and per-category multi-head detector components, which are common in 3D object detection pipelines, and make them deadline-aware. We propose a scheduling algorithm that intelligently selects the subset of the components to make an effective time and accuracy trade-off on the fly. We minimize the accuracy loss of skipping some of the neural network sub-components by projecting previously detected objects onto the current scene through estimations. We apply our approach to a state-of-the-art 3D object detection network, PointPillars, and evaluate its performance on Jetson Xavier AGX using the nuScenes dataset. Compared to the baselines, our approach significantly improves the network’s accuracy under various deadline constraints.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"80 1","pages":"31-40"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77604891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Energy-Adaptive Real-time Sensing for Batteryless Devices","authors":"Mohsen Karimi, Yidi Wang, Hyoseung Kim","doi":"10.1109/RTCSA55878.2022.00028","DOIUrl":"https://doi.org/10.1109/RTCSA55878.2022.00028","url":null,"abstract":"Batteryless energy harvesting devices have been recognized as a promising solution because of their low maintenance requirements and ability to work in harsh environments. However, these devices have to harvest energy from ambient energy sources and execute real-time sensing tasks periodically while satisfying data freshness constraints, which is especially challenging as the energy sources are often unreliable and intermittent. In this paper, we develop an energy-adaptive real-time sensing framework for batteryless devices. This framework includes a lightweight machine learning-based energy predictor that is capable of running on microcontroller devices and predicting the energy availability and intensity based on energy traces. Using this, the framework adapts the schedule of real-time tasks by effectively taking into account the predicted energy supply and the resulting age of information of each task, in order to achieve continuous sensing operations and satisfy given data freshness requirements. We discuss various design choices for adaptive scheduling and evaluate their performance in the context of batteryless devices. Experimental results show that the proposed adaptive real-time approach outperforms recent methods based on static and reactive approaches in both energy utilization and data freshness.","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"171 1","pages":"205-211"},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84007461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RTCSA 2022 Organizers","authors":"","doi":"10.1109/rtcsa55878.2022.00006","DOIUrl":"https://doi.org/10.1109/rtcsa55878.2022.00006","url":null,"abstract":"","PeriodicalId":38446,"journal":{"name":"International Journal of Embedded and Real-Time Communication Systems (IJERTCS)","volume":"82 1","pages":""},"PeriodicalIF":0.7,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75964957","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}