{"title":"IoT-Fog-Cloud Centric Earthquake Monitoring and Prediction","authors":"Kanika Saini, S. Kalra, S. Sood","doi":"10.1145/3487942","DOIUrl":"https://doi.org/10.1145/3487942","url":null,"abstract":"Earthquakes are among the most inevitable natural catastrophes. The uncertainty about the severity of an earthquake has a profound effect on the burden of disaster and causes massive economic and societal losses. Although earthquakes cannot be predicted precisely, damage and fatalities can be ameliorated by measures such as monitoring and prediction using the Internet of Things (IoT). With the resurgence of the IoT, an emerging innovative approach is to integrate IoT technology with Fog and Cloud Computing to augment the effectiveness and accuracy of earthquake monitoring and prediction. In this study, an integrated IoT-Fog-Cloud layered framework is proposed to predict earthquakes using seismic signal information. The proposed model is composed of three layers: (i) at the sensor layer, seismic data are acquired; (ii) the fog layer incorporates pre-processing, feature extraction using the fast Walsh–Hadamard transform (FWHT), selection of relevant features by applying High Order Spectral Analysis (HOSA) to the FWHT coefficients, and seismic event classification by K-means accompanied by real-time alert generation; (iii) at the cloud layer, an artificial neural network (ANN) is employed to forecast the magnitude of an earthquake. For performance evaluation, the K-means classification algorithm is compared with other well-known classification algorithms in terms of accuracy and execution time. Implementation statistics indicate that with the chosen HOS features, we attain high accuracy, precision, specificity, and sensitivity values of 93.30%, 96.65%, 90.54%, and 92.75%, respectively. In addition, the ANN provides an average correct magnitude prediction rate of 75%. The findings confirm that the proposed framework has the potency to classify seismic signals and predict earthquakes and could therefore further enhance the detection of seismic activities. Moreover, the generation of real-time alerts further amplifies the effectiveness of the proposed model and makes it more real-time compatible.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"80 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127603495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs","authors":"J. C. R. Da Silva, Lorena Leão, V. Petrucci, A. Gamatie, Fernando Magno Quintão Pereira","doi":"10.1145/3478288","DOIUrl":"https://doi.org/10.1145/3478288","url":null,"abstract":"A hardware configuration is a set of processors and their frequency levels in a multicore heterogeneous system. This article presents a compiler-based technique to match functions with hardware configurations. Such a technique consists of using multivariate linear regression to associate function arguments with particular hardware configurations. By showing that this classification space tends to be convex in practice, this article demonstrates that linear regression is not only an efficient tool to map computations to heterogeneous hardware, but also an effective one. To demonstrate the viability of multivariate linear regression as a way to perform adaptive compilation for heterogeneous architectures, we have implemented our ideas onto the Soot Java bytecode analyzer. Code that we produce can predict the best configuration for a large class of Java and Scala benchmarks running on an Odroid XU4 big.LITTLE board; hence, outperforming prior techniques such as ARM’s GTS and CHOAMP, a recently released static program scheduler.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132970776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware Acceleration for Embedded Keyword Spotting: Tutorial and Survey","authors":"J. S. P. Giraldo, M. Verhelst","doi":"10.1145/3474365","DOIUrl":"https://doi.org/10.1145/3474365","url":null,"abstract":"In recent years, Keyword Spotting (KWS) has become a crucial human–machine interface for mobile devices, allowing users to interact more naturally with their gadgets by leveraging their own voice. Due to privacy, latency, and energy requirements, the execution of KWS tasks on the embedded device itself instead of in the cloud has attracted significant attention from the research community. However, the constraints associated with embedded systems, including limited energy, memory, and computational capacity, represent a real challenge for the embedded deployment of such interfaces. In this article, we explore and guide the reader through the design of KWS systems. To support this overview, we extensively survey the different approaches taken by the recent state-of-the-art (SotA) at the algorithmic, architectural, and circuit level to enable KWS tasks in edge devices. A quantitative and qualitative comparison between relevant SotA hardware platforms is carried out, highlighting the current design trends, as well as pointing out future research directions in the development of this technology.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116768093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Horizontal Side-Channel Vulnerabilities of Post-Quantum Key Exchange and Encapsulation Protocols","authors":"Furkan Aydin, Aydin Aysu, Mohit Tiwari, A. Gerstlauer, M. Orshansky","doi":"10.1145/3476799","DOIUrl":"https://doi.org/10.1145/3476799","url":null,"abstract":"Key exchange protocols and key encapsulation mechanisms establish secret keys to communicate digital information confidentially over public channels. Lattice-based cryptography variants of these protocols are promising alternatives given their quantum-cryptanalysis resistance and implementation efficiency. Although lattice cryptosystems can be mathematically secure, their implementations have shown side-channel vulnerabilities. But such attacks largely presume collecting multiple measurements under a fixed key, leaving the more dangerous single-trace attacks unexplored. This article demonstrates successful single-trace power side-channel attacks on lattice-based key exchange and encapsulation protocols. Our attack targets both hardware and software implementations of matrix multiplications used in lattice cryptosystems. The crux of our idea is to apply a horizontal attack that makes hypotheses on several intermediate values within a single execution, all relating to the same secret, and to combine their correlations for accurately estimating the secret key. We illustrate that the design of the protocols combined with the nature of lattice arithmetic enables our attack. Since a straightforward attack suffers from false positives, we demonstrate a novel extend-and-prune procedure to recover the key by following the sequence of intermediate updates during multiplication. We analyzed two protocols, Frodo and FrodoKEM, and reveal that they are vulnerable to our attack. We implement both stand-alone hardware and RISC-V based software realizations and test the effectiveness of the proposed attack by using concrete parameters of these protocols on physical platforms with real measurements. We show that the proposed attack can estimate secret keys from a single power measurement with an over 99% success rate.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132445097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Horizontal Auto-Scaling for Multi-Access Edge Computing Using Safe Reinforcement Learning","authors":"Kaustabha Ray, A. Banerjee","doi":"10.1145/3475991","DOIUrl":"https://doi.org/10.1145/3475991","url":null,"abstract":"Multi-Access Edge Computing (MEC) has emerged as a promising new paradigm allowing low latency access to services deployed on edge servers to avert network latencies often encountered in accessing cloud services. A key component of the MEC environment is an auto-scaling policy which is used to decide the overall management and scaling of container instances corresponding to individual services deployed on MEC servers to cater to traffic fluctuations. In this work, we propose a Safe Reinforcement Learning (RL)-based auto-scaling policy agent that can efficiently adapt to traffic variations to ensure adherence to service specific latency requirements. We model the MEC environment using a Markov Decision Process (MDP). We demonstrate how latency requirements can be formally expressed in Linear Temporal Logic (LTL). The LTL specification acts as a guide to the policy agent to automatically learn auto-scaling decisions that maximize the probability of satisfying the LTL formula. We introduce a quantitative reward mechanism based on the LTL formula to tailor service specific latency requirements. We prove that our reward mechanism ensures convergence of standard Safe-RL approaches. We present experimental results in practical scenarios on a test-bed setup with real-world benchmark applications to show the effectiveness of our approach in comparison to other state-of-the-art methods in literature. Furthermore, we perform extensive simulated experiments to demonstrate the effectiveness of our approach in large scale scenarios.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123186418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Computation Reuse for Energy-Efficient Training of Deep Neural Networks","authors":"Jason Servais, E. Atoofian","doi":"10.1145/3487025","DOIUrl":"https://doi.org/10.1145/3487025","url":null,"abstract":"In recent years, Deep Neural Networks (DNNs) have been deployed into a diverse set of applications from voice recognition to scene generation, mostly due to their high accuracy. DNNs are known to be computationally intensive applications, requiring a significant power budget. There have been a large number of investigations into the energy-efficiency of DNNs. However, most of them primarily focused on inference, while training of DNNs has received little attention. This work proposes an adaptive technique to identify and avoid redundant computations during the training of DNNs. Elements of activations exhibit a high degree of similarity, causing inputs and outputs of layers of neural networks to perform redundant computations. Based on this observation, we propose Adaptive Computation Reuse for Tensor Cores (ACRTC), where results of previous arithmetic operations are used to avoid redundant computations. ACRTC is an architectural technique, which enables accelerators to take advantage of similarity in input operands and speed up the training process while also increasing energy-efficiency. ACRTC dynamically adjusts the strength of computation reuse based on the tolerance of precision relaxation in different training phases. Over a wide range of neural network topologies, ACRTC accelerates training by 33% and saves energy by 32% with negligible impact on accuracy.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"4 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116820455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to Train CNNs on Faulty ReRAM-based Manycore Accelerators","authors":"Biresh Kumar Joardar, J. Doppa, Hai Li, K. Chakrabarty, P. Pande","doi":"10.1145/3476986","DOIUrl":"https://doi.org/10.1145/3476986","url":null,"abstract":"The growing popularity of convolutional neural networks (CNNs) has led to the search for efficient computational platforms to accelerate CNN training. Resistive random-access memory (ReRAM)-based manycore architectures offer a promising alternative to commonly used GPU-based platforms for training CNNs. However, due to the immature fabrication process and limited write endurance, ReRAMs suffer from different types of faults. This makes training of CNNs challenging, as weights are misrepresented when they are mapped to faulty ReRAM cells. This results in unstable training, leading to unacceptably low accuracy for the trained model. Due to the distributed nature of the mapping of the individual bits of a weight to different ReRAM cells, faulty weights often lead to exploding gradients. This in turn introduces a positive feedback in the training loop, resulting in extremely large and unstable weights. In this paper, we propose a lightweight and reliable CNN training methodology using weight clipping to prevent this phenomenon and enable training even in the presence of many faults. Weight clipping prevents large weights from destabilizing CNN training and provides the backpropagation algorithm with the opportunity to compensate for the weights mapped to faulty cells. The proposed methodology achieves near-GPU accuracy without introducing significant area or performance overheads. Experimental evaluation indicates that weight clipping enables the successful training of CNNs in the presence of faults, while also reducing training time by 4X on average compared to a conventional GPU platform. Moreover, we also demonstrate that weight clipping outperforms a recently proposed error correction code (ECC)-based method when training is carried out using faulty ReRAMs.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116937332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Interpretable Machine Learning Model Enhanced Integrated CPU-GPU DVFS Governor","authors":"Jurn-Gyu Park, N. Dutt, Sung-Soo Lim","doi":"10.1145/3470974","DOIUrl":"https://doi.org/10.1145/3470974","url":null,"abstract":"Modern heterogeneous CPU-GPU-based mobile architectures, which execute intensive mobile gaming/graphics applications, use software governors to achieve high performance with energy-efficiency. However, existing governors typically utilize simple statistical or heuristic models, assuming linear relationships, using a small unbalanced dataset of mobile games; these limitations result in high prediction errors for dynamic and diverse gaming workloads on heterogeneous platforms. To overcome these limitations, we propose an interpretable machine learning (ML) model enhanced integrated CPU-GPU governor: (1) it builds tree-based piecewise linear models (i.e., model trees) offline, considering both high accuracy (low error) and interpretability of ML models based on mathematical formulas, using operation counts as a quantitative metric for simulatability; then (2) it deploys the selected models for online estimation in an integrated CPU-GPU Dynamic Voltage Frequency Scaling governor. Our experiments on a test set of 20 mobile games exhibiting diverse characteristics show that our governor achieved significant energy efficiency gains of over 10% (up to 38%) improvement on average in energy-per-frame, with a surprising-but-modest 3% improvement in Frames-per-Second performance, compared to a typical state-of-the-art governor that employs simple linear regression models.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134000848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Editorial: Reimagining ACM Transactions on Embedded Computing Systems (TECS)","authors":"T. Mitra","doi":"10.1145/3450438","DOIUrl":"https://doi.org/10.1145/3450438","url":null,"abstract":"Welcome to the latest issue of the ACM Transactions on Embedded Computing Systems (TECS). TECS is the flagship journal in embedded systems, spanning the entire spectrum from software to hardware, from applications to design methodologies. The journal has evolved in tandem with the rapid transformation of the field and serves as the nexus of research and innovation. I am honored to have the opportunity to continue the tradition. I started my journey as the Editor-in-Chief of TECS in 2020 in the middle of a global pandemic. As we begin 2021 with cautious optimism about returning to some semblance of a normal life, it is also time to reimagine the vision and the future of the journal. On the one hand, it is the perfect time to be involved in embedded systems research. Now more than ever, embedded systems are firmly driving the technological revolution that blurs the line between physical, biological, and cyber entities. Embedded systems provide the foundation of almost all modern electronic systems today, from automotive, avionics, and smart grids to medical devices, wearables, and myriad consumer electronic devices. The challenge of designing complex, low-power, high-performance, safety-critical, secure, real-time, intelligent embedded computing systems that serve as the fundamental building blocks of these devices has grown exponentially. From a broader perspective, computing systems research in general is increasingly adopting a holistic, integrated, cross-layer, hardware-software codesigned approach that has been the cornerstone of embedded systems research from the onset. As a result, TECS is uniquely positioned to steer the course of these exciting developments. On the other hand, we are living in an unprecedented time where the well-established conference model needs rethinking and the distinctions between journals and conferences are getting fuzzy with the virtual events. We, the embedded systems community, are facing a predicament in both research and education with limited access to lab facilities in most parts of the world. But we are resilient, and I am confident that the journal will emerge stronger by embracing the emerging research opportunities and addressing the challenges imposed on us by the pandemic. I envision the journal to flourish at the forefront of research by continuing to engage with the core embedded systems community, and I aim to revitalize it by building bridges to the cognate research communities that contribute to the different theoretical and systems aspects of embedded systems design, including formal methods, real-time systems, machine learning, computer security, operating systems, sensor networks, compilers, computer architectures, design automation, and hardware-software codesign. Needless to say, TECS is evidently intertwined with the Internet of Things and Cyber-Physical Systems, with intersecting but synergistic, complementary foci.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"131 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116936574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Worst-Case Execution Time Calculation for Query-Based Monitors by Witness Generation","authors":"Márton Búr, Kristóf Marussy, B. Meyer, Dániel Varró","doi":"10.1145/3471904","DOIUrl":"https://doi.org/10.1145/3471904","url":null,"abstract":"Runtime monitoring plays a key role in the assurance of modern intelligent cyber-physical systems, which are frequently data-intensive and safety-critical. While graph queries can serve as an expressive yet formally precise specification language to capture the safety properties of interest, there are no timeliness guarantees for such auto-generated runtime monitoring programs, which prevents their use in a real-time setting. While worst-case execution time (WCET) bounds derived by existing static WCET estimation techniques are safe, they may not be tight, as they are unable to exploit domain-specific (semantic) information about the input models. This article presents a semantic-aware WCET analysis method for data-driven monitoring programs derived from graph queries. The method incorporates results obtained from low-level timing analysis into the objective function of a modern graph solver. This allows the systematic generation of input graph models up to a specified size (referred to as witness models) for which the monitor is expected to take the most time to complete. Hence, the estimated execution time of the monitors on these graphs can be considered as safe and tight WCET. Additionally, we perform a set of experiments with query-based programs running on a real-time platform over a set of generated models to investigate the relationship between execution times and their estimates, and we compare WCET estimates produced by our approach with results from two well-known timing analyzers, aiT and OTAWA.","PeriodicalId":183677,"journal":{"name":"ACM Trans. Embed. Comput. Syst.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126251328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}