ACM Transactions on Embedded Computing Systems: Latest Articles, Page 9

A Self-Sustained CPS Design for Reliable Wildfire Monitoring

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3608100

Yigit Tuncel, Toygun Basaklar, Dina Carpenter-Graffy, Umit Ogras

Abstract: Continuous monitoring of areas near the electric grid is critical for the prevention and early detection of devastating wildfires. Existing wildfire monitoring systems are intermittent and oblivious to local ambient risk factors, resulting in poor wildfire awareness. Ambient sensor suites deployed near the gridlines can increase the monitoring granularity and detection accuracy. However, these sensors must address two challenging and competing objectives at the same time. First, they must remain powered for years without manual maintenance due to their remote locations. Second, they must provide and transmit reliable information if and when a wildfire starts. The first objective requires aggressive energy savings and ambient energy harvesting, while the second requires continuous operation of a range of sensors. To the best of our knowledge, this paper presents the first self-sustained cyber-physical system that dynamically co-optimizes the wildfire detection accuracy and active time of sensors. The proposed approach employs reinforcement learning to train a policy that controls the sensor operations as a function of the environment (i.e., current sensor readings), harvested energy, and battery level. The proposed cyber-physical system is evaluated extensively using real-life temperature, wind, and solar energy harvesting datasets and an open-source wildfire simulator. In long-term (5 years) evaluations, the proposed framework achieves 89% uptime, which is 46% higher than a carefully tuned heuristic approach. At the same time, it averages a 2-minute initial response time, which is at least 2.5× faster than the same heuristic approach. Furthermore, the policy network consumes 0.6 mJ per day on the TI CC2652R microcontroller using TensorFlow Lite for Micro, which is negligible compared to the daily sensor suite energy consumption.

Citations: 0

DASS: Differentiable Architecture Search for Sparse Neural Networks

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609385

Hamid Mousavi, Mohammad Loni, Mina Alibeigi, Masoud Daneshtalab

Abstract: The deployment of Deep Neural Networks (DNNs) on edge devices is hindered by the substantial gap between performance requirements and available computational power. While recent research has made significant strides in developing pruning methods to build a sparse network for reducing the computing overhead of DNNs, there remains considerable accuracy loss, especially at high pruning ratios. We find that the architectures designed for dense networks by differentiable architecture search methods are ineffective when pruning mechanisms are applied to them. The main reason is that the current methods do not support sparse architectures in their search space and use a search objective that is made for dense networks and does not focus on sparsity. This paper proposes a new method to search for sparsity-friendly neural architectures. It is done by adding two new sparse operations to the search space and modifying the search objective. We propose two novel parametric operations, SparseConv and SparseLinear, in order to expand the search space to include sparse operations. In particular, these operations make a flexible search space due to using sparse parametric versions of linear and convolution operations. The proposed search objective lets us train the architecture based on the sparsity of the search space operations. Quantitative analyses demonstrate that architectures found through DASS outperform those used in the state-of-the-art sparse networks on the CIFAR-10 and ImageNet datasets. In terms of performance and hardware effectiveness, DASS increases the accuracy of the sparse version of MobileNet-v2 from 73.44% to 81.35% (an improvement of 7.91 percentage points) with a 3.87× faster inference time.
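To make the idea of a "sparse version of a linear operation" concrete, the following is a minimal sketch of a linear layer whose weights are gated by a binary mask. It is an illustration only: the abstract does not define SparseLinear, so the magnitude-based mask and the names `magnitude_mask` and `sparse_linear` are assumptions and should not be read as the DASS operations themselves.

```python
def magnitude_mask(weights, sparsity):
    """Hypothetical helper: binary mask that zeroes the smallest-magnitude weights.

    `sparsity` is the fraction of weights to prune (0.0 keeps everything).
    """
    # Collect (|w|, row, col) so the smallest magnitudes sort first.
    cells = [(abs(w), r, c)
             for r, row in enumerate(weights)
             for c, w in enumerate(row)]
    cells.sort()
    n_prune = int(len(cells) * sparsity)
    mask = [[1] * len(row) for row in weights]
    for _, r, c in cells[:n_prune]:
        mask[r][c] = 0
    return mask


def sparse_linear(x, weights, mask, bias):
    """Compute y = (W * M) x + b, where M is an element-wise binary mask."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x)) + b
        for w_row, m_row, b in zip(weights, mask, bias)
    ]


weights = [[0.1, -2.0], [3.0, -0.2]]
mask = magnitude_mask(weights, sparsity=0.5)   # prunes the two smallest weights
output = sparse_linear([1.0, 1.0], weights, mask, bias=[0.0, 0.5])
```

In this sketch the mask is fixed once from the weight magnitudes; a search method such as the one described above would instead treat sparsity as part of what is being optimized.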

Citations: 1

Optimal Synthesis of Robust IDK Classifier Cascades

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609129

Sanjoy Baruah, Alan Burns, Robert Ian Davis

Abstract: An IDK classifier is a computing component that categorizes inputs into one of a number of classes if it is able to do so with the required level of confidence; otherwise it returns “I Don’t Know” (IDK). IDK classifier cascades have been proposed as a way of balancing the needs for fast response and high accuracy in classification-based machine perception. Efficient algorithms for the synthesis of IDK classifier cascades have been derived; however, the responsiveness of these cascades is highly dependent on the accuracy of predictions regarding the run-time behavior of the classifiers from which they are built. Accurate predictions of such run-time behavior are difficult to obtain for many of the classifiers used for perception. By applying the “algorithms using predictions” framework, we propose efficient algorithms for the synthesis of IDK classifier cascades that are robust to inaccurate predictions in the following sense: the IDK classifier cascades synthesized by our algorithms have short expected execution durations when the predictions are accurate, and these expected durations increase only within specified bounds when the predictions are inaccurate.
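The notion of a cascade's expected execution duration can be sketched as follows. This is a simplified illustration, not the authors' synthesis algorithm: it assumes each classifier is described by a predicted execution time and a predicted probability of returning a real class, that these probabilities are independent across stages, and that the cascade ends with a classifier that never returns IDK. The function names and the brute-force search are hypothetical.

```python
from itertools import permutations


def expected_duration(cascade):
    """Expected run time of a cascade given as [(exec_time, success_prob), ...].

    Assumes independent success probabilities (an assumption of this sketch).
    """
    total = 0.0
    reach_prob = 1.0  # probability that every earlier stage returned IDK
    for exec_time, success_prob in cascade:
        total += reach_prob * exec_time
        reach_prob *= 1.0 - success_prob
    return total


def best_cascade(idk_classifiers, deterministic):
    """Brute-force search over every subset and ordering of the IDK classifiers.

    `deterministic` (success probability 1.0) is always placed last.
    Exponential in the number of classifiers; for illustration only.
    """
    best = None
    for size in range(len(idk_classifiers) + 1):
        for order in permutations(idk_classifiers, size):
            candidate = list(order) + [deterministic]
            duration = expected_duration(candidate)
            if best is None or duration < best[0]:
                best = (duration, candidate)
    return best


idk = [(1.0, 0.5), (4.0, 0.9)]   # (time, probability of not answering IDK)
fallback = (10.0, 1.0)           # always classifies, but is slow
duration, cascade = best_cascade(idk, fallback)
```

With these made-up numbers, running the fast classifier first, then the accurate one, then the fallback gives an expected duration of about 3.5 time units, compared with 10 for the fallback alone. If the predicted probabilities are wrong, the chosen order may no longer be the best one, which is the sensitivity the robust synthesis described above is meant to bound.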

Citations: 0

Let Coarse-Grained Resources Be Shared: Mapping Entire Neural Networks on FPGAs

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609109

Tzung-Han Juang, Christof Schlaak, Christophe Dubach

Citations: 0

Energy-efficient Personalized Federated Search with Graph for Edge Computing

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609435

Zhao Yang, Qingshuang Sun

Citations: 0

GHOST: A Graph Neural Network Accelerator using Silicon Photonics

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609097

Salma Afifi, Febin Sunny, Amin Shafiee, Mahdi Nikdast, Sudeep Pasricha

Abstract: Graph neural networks (GNNs) have emerged as a powerful approach for modelling and learning from graph-structured data. Multiple fields have since benefitted enormously from the capabilities of GNNs, such as recommendation systems, social network analysis, drug discovery, and robotics. However, accelerating and efficiently processing GNNs require a unique approach that goes beyond conventional artificial neural network accelerators, due to the substantial computational and memory requirements of GNNs. The slowdown of scaling in CMOS platforms also motivates a search for alternative implementation substrates. In this paper, we present GHOST, the first silicon-photonic hardware accelerator for GNNs. GHOST efficiently alleviates the costs associated with both vertex-centric and edge-centric operations. It implements separately the three main stages involved in running GNNs in the optical domain, allowing it to be used for the inference of various widely used GNN models and architectures, such as graph convolution networks and graph attention networks. Our simulation studies indicate that GHOST exhibits at least 10.2× better throughput and 3.8× better energy efficiency when compared to GPU, TPU, CPU and multiple state-of-the-art GNN hardware accelerators.

Citations: 0

Towards Building Verifiable CPS using Lingua Franca

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609134

Shaokai Lin, Yatin A. Manerkar, Marten Lohstroh, Elizabeth Polgreen, Sheng-Jung Yu, Chadlia Jerad, Edward A. Lee, Sanjit A. Seshia

Abstract: Formal verification of cyber-physical systems (CPS) is challenging because it has to consider real-time and concurrency aspects that are often absent in ordinary software. Moreover, the software in CPS is often complex and low-level, making it hard to assure that a formal model of the system used for verification is a faithful representation of the actual implementation, which can undermine the value of a verification result. To address this problem, we propose a methodology for building verifiable CPS based on the principle that a formal model of the software can be derived automatically from its implementation. Our approach requires that the system implementation is specified in Lingua Franca (LF), a polyglot coordination language tailored for real-time, concurrent CPS, which we made amenable to the specification of safety properties via annotations in the code. The program structure and the deterministic semantics of LF enable automatic construction of formal axiomatic models directly from LF programs. The generated models are automatically checked using Bounded Model Checking (BMC) by the verification engine Uclid5 using the Z3 SMT solver. The proposed technique enables checking a well-defined fragment of Safety Metric Temporal Logic (Safety MTL) formulas. To ensure the completeness of BMC, we present a method to derive an upper bound on the completeness threshold of an axiomatic model based on the semantics of LF. We implement our approach in the LF Verifier and evaluate it using a benchmark suite with 22 programs sampled from real-life applications and benchmarks for Erlang, Lustre, actor-oriented languages, and RTOSes. The LF Verifier correctly checks 21 out of 22 programs automatically.

Citations: 0

Keep in Balance: Runtime-reconfigurable Intermittent Deep Inference

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3607918

Chih-Hsuan Yen, Hashan Roshantha Mendis, Tei-Wei Kuo, Pi-Cheng Hsiu

Abstract: Intermittent deep neural network (DNN) inference is a promising technique to enable intelligent applications on tiny devices powered by ambient energy sources. Nonetheless, intermittent execution presents inherent challenges, primarily involving accumulating progress across power cycles and having to refetch volatile data lost due to power loss in each power cycle. Existing approaches typically optimize the inference configuration to maximize data reuse. However, we observe that such a fixed configuration may be significantly inefficient due to the fluctuating balance point between data reuse and data refetch caused by the dynamic nature of ambient energy. This work proposes DynBal, an approach to dynamically reconfigure the inference engine at runtime. DynBal is realized as a middleware plugin that improves inference performance by exploring the interplay between data reuse and data refetch to maintain their balance with respect to the changing level of intermittency. An indirect metric is developed to easily evaluate an inference configuration considering the variability in intermittency, and a lightweight reconfiguration algorithm is employed to efficiently optimize the configuration at runtime. We evaluate the improvement brought by integrating DynBal into a recent intermittent inference approach that uses a fixed configuration. Evaluations were conducted on a Texas Instruments device with various network models and under varied intermittent power strengths. Our experimental results demonstrate that DynBal can speed up intermittent inference by 3.26 times, achieving a greater improvement for a large network under high intermittency and a large gap between memory and computation performance.

Citations: 0

LaDy: Enabling Locality-aware Deduplication Technology on Shingled Magnetic Recording Drives

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3607921

Jung-Hsiu Chang, Tzu-Yu Chang, Yi-Chao Shih, Tseng-Yi Chen

Abstract: The continuous increase in data volume has led to the adoption of shingled-magnetic recording (SMR) as the primary technology for modern storage drives. This technology offers high storage density and low unit cost but introduces significant performance overheads due to the read-update-write operation and garbage collection (GC) process. To reduce these overheads, data deduplication has been identified as an effective solution as it reduces the amount of written data to an SMR-based storage device. However, deduplication can result in poor data locality, leading to decreased read performance. To tackle this problem, this study proposes a data locality-aware deduplication technology, LaDy, that considers both the overheads of writing duplicate data and the impact on data locality to determine whether the duplicate data should be written. LaDy integrates with DiskSim, an open-source project, and modifies it to simulate an SMR-based drive. The experimental results demonstrate that LaDy can significantly reduce the response time in the best-case scenario by 87.3% compared with CAFTL on the SMR drive. LaDy achieves this by selectively writing duplicate data, which preserves data locality, resulting in improved read performance. The proposed solution provides an effective and efficient method for mitigating the performance overheads associated with data deduplication in SMR-based storage devices.

Citations: 0

ANV-PUF: Machine-Learning-Resilient NVM-Based Arbiter PUF

Zone 3, Computer Science

ACM Transactions on Embedded Computing Systems Pub Date : 2023-09-09 DOI: 10.1145/3609388

Hassan Nassar, Lars Bauer, Jörg Henkel

Abstract: Physical Unclonable Functions (PUFs) have been widely considered an attractive security primitive. They use the deviations in the fabrication process to have unique responses from each device. Due to their nature, they serve as a DNA-like identity of the device. But PUFs have also been targeted for attacks. It has been proven that machine learning (ML) can be used to effectively model a PUF design and predict its behavior, leading to leakage of the internal secrets. To combat such attacks, several designs have been proposed to make it harder to model PUFs. One design direction is to use Non-Volatile Memory (NVM) as the building block of the PUF. NVM cells are typically multi-level, i.e., they have several internal states, which makes it harder to model them. However, the current state of the art of NVM-based PUFs is limited to ‘weak PUFs’, i.e., the number of outputs grows only linearly with the number of inputs, which limits the number of possible secret values that can be stored using the PUF. To overcome this limitation, in this work we design the Arbiter Non-Volatile PUF (ANV-PUF) that is exponential in the number of inputs and that is resilient against ML-based modeling. The concept is based on the famous delay-based Arbiter PUF (which is not resilient against modeling attacks) while using NVM as a building block instead of switches. Hence, we replace the switch delays (which are easy to model via ML) with the multi-level property of NVM (which is hard to model via ML). Consequently, our design has the exponential output characteristics of the Arbiter PUF and the resilience against attacks from the NVM-based PUFs. Our results show that the resilience to ML modeling, uniqueness, and uniformity are all in the ideal range of 50%. Thus, in contrast to the state-of-the-art, ANV-PUF is able to be resilient to attacks, while having an exponential number of outputs.
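The remark that switch delays are easy to model via ML can be illustrated with the widely used additive delay model of the classic Arbiter PUF, in which the response is the sign of a linear function of a parity-transformed challenge. The sketch below simulates that classic model only, not ANV-PUF; the Gaussian stage weights, the seeds, and all function names are assumptions made for illustration. It also includes the usual uniformity and uniqueness metrics, for which 50% is the ideal value.

```python
import random


def parity_features(challenge):
    """Map challenge bits c_1..c_n to features phi_i = prod_{j>=i} (1 - 2*c_j).

    A constant 1 is appended as the bias feature.
    """
    n = len(challenge)
    feats = [1] * (n + 1)
    for i in range(n - 1, -1, -1):
        feats[i] = feats[i + 1] * (1 - 2 * challenge[i])
    return feats


def make_puf(n_stages, seed):
    """Hypothetical device: one random delay-difference weight per stage, plus bias."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_stages + 1)]


def response(weights, challenge):
    """The arbiter outputs 1 if the accumulated delay difference is positive."""
    delay_diff = sum(w * f for w, f in zip(weights, parity_features(challenge)))
    return 1 if delay_diff > 0 else 0


def uniformity(responses):
    """Fraction of 1s in one device's responses (ideal: 0.5)."""
    return sum(responses) / len(responses)


def uniqueness(responses_a, responses_b):
    """Fractional Hamming distance between two devices' responses (ideal: 0.5)."""
    differing = sum(a != b for a, b in zip(responses_a, responses_b))
    return differing / len(responses_a)


# Two simulated 8-stage devices queried with all 256 possible challenges.
challenges = [[(k >> bit) & 1 for bit in range(8)] for k in range(256)]
device_a, device_b = make_puf(8, seed=1), make_puf(8, seed=2)
resp_a = [response(device_a, c) for c in challenges]
resp_b = [response(device_b, c) for c in challenges]
```

Because the response in this model is a linear threshold function of the parity features, a linear classifier trained on enough challenge-response pairs can learn the weights, which is the weakness a harder-to-model building block is intended to remove.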

Citations: 0