Abbas Dehghani;Sadegh Fadaei;Bahman Ravaei;Keyvan RahimiZadeh
{"title":"Deadline-Aware and Energy-Efficient Dynamic Task Mapping and Scheduling for Multicore Systems Based on Wireless Network-on-Chip","authors":"Abbas Dehghani;Sadegh Fadaei;Bahman Ravaei;Keyvan RahimiZadeh","doi":"10.1109/TETC.2023.3315298","DOIUrl":"10.1109/TETC.2023.3315298","url":null,"abstract":"Hybrid Wireless Network-on-Chip (HWNoC) architecture has been introduced as a promising communication infrastructure for multicore systems. HWNoC-based multicore systems encounter extremely dynamic application workloads that are submitted at run-time. Mapping and scheduling of these applications are critical for system performance, especially for real-time applications. The existing resource allocation approaches either ignore the use of wireless links in task allocation on cores or ignore the timing characteristics of tasks. In this paper, we propose a new deadline-aware and energy-efficient dynamic task mapping and scheduling approach for the HWNoC-based multicore system. By using a core utilization threshold and task laxity time, the proposed approach aims to minimize communication energy consumption and satisfy the deadlines of real-time application tasks. Through cycle-accurate simulation, the performance of the proposed approach has been compared with state-of-the-art approaches in terms of communication energy consumption, deadline violation rate, communication latency, and runtime overhead. 
The experimental results confirm that the proposed approach is highly competitive with the alternative approaches.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 4","pages":"1031-1044"},"PeriodicalIF":5.9,"publicationDate":"2023-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135555776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ben Perach;Ronny Ronen;Benny Kimelfeld;Shahar Kvatinsky
{"title":"Understanding Bulk-Bitwise Processing In-Memory Through Database Analytics","authors":"Ben Perach;Ronny Ronen;Benny Kimelfeld;Shahar Kvatinsky","doi":"10.1109/TETC.2023.3315189","DOIUrl":"10.1109/TETC.2023.3315189","url":null,"abstract":"Bulk-bitwise processing-in-memory (PIM), where large bitwise operations are performed in parallel by the memory array itself, is an emerging form of computation with the potential to mitigate the memory wall problem. This article examines the capabilities of bulk-bitwise PIM by constructing PIMDB, a fully-digital system based on memristive stateful logic, utilizing and focusing on in-memory bulk-bitwise operations, designed to accelerate a real-life workload: analytical processing of relational databases. We introduce a host processor programming model to support bulk-bitwise PIM in virtual memory, develop techniques to efficiently perform in-memory filtering and aggregation operations, and adapt the application data set into the memory. To understand bulk-bitwise PIM, we compare it to an equivalent in-memory database on the same host system. We show that bulk-bitwise PIM substantially lowers the number of required memory read operations, thus accelerating TPC-H filter operations by 1.6×–18× and full queries by 56×–608×, while reducing the energy consumption by 1.7×–18.6× and 0.81×–12× for these benchmarks, respectively. Our extensive evaluation uses the gem5 full-system simulation environment. 
The simulations also evaluate cell endurance, showing that the required endurance is within the range of existing endurance of RRAM devices.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 1","pages":"7-22"},"PeriodicalIF":5.9,"publicationDate":"2023-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135551143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
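The in-memory filtering described in the abstract above can be illustrated in software. The following is a minimal sketch, not PIMDB itself: it assumes a bit-sliced column layout and uses Python integers as stand-ins for memory rows. The names `bit_slice` and `filter_equal` are hypothetical and do not come from the paper.

```python
def bit_slice(values, width):
    """Return `width` bit-planes; plane b has bit i set when bit b of values[i] is set."""
    planes = [0] * width
    for i, v in enumerate(values):
        for b in range(width):
            if (v >> b) & 1:
                planes[b] |= 1 << i
    return planes


def filter_equal(planes, constant, n_rows):
    """Return a bitmask of the rows whose value equals `constant`.

    Each loop iteration is one bulk bitwise operation over all rows at once,
    which is the kind of operation a bulk-bitwise PIM array would run in place.
    """
    all_rows = (1 << n_rows) - 1
    match = all_rows
    for b, plane in enumerate(planes):
        if (constant >> b) & 1:
            match &= plane               # rows must have bit b set
        else:
            match &= ~plane & all_rows   # rows must have bit b clear
    return match
```

The filter touches one bit-plane per bit of the constant, independent of the row count, which is one plausible reading of why such a layout reduces the number of values the host has to read.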
Michael Grottke;Alberto Avritzer;Hironori Washizaki;Kishor Trivedi
{"title":"Guest Editorial Special Section on Applied Software Aging and Rejuvenation","authors":"Michael Grottke;Alberto Avritzer;Hironori Washizaki;Kishor Trivedi","doi":"10.1109/TETC.2023.3299150","DOIUrl":"10.1109/TETC.2023.3299150","url":null,"abstract":"Since the publication of the first paper on software aging and rejuvenation by Huang et al. in 1995 [1], considerable research has been devoted to this topic. It deals with the phenomenon that continuously-running software systems may show an increasing failure rate and/or a degrading performance, either because error conditions accumulate inside the running system or because the rate at which faults are activated and errors are propagated is positively correlated with system uptime. Software rejuvenation relates to techniques counteracting aging (for example, by regularly stopping and restarting the software) in order to remove aging effects and to proactively prevent failures from occurring.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 3","pages":"550-552"},"PeriodicalIF":5.9,"publicationDate":"2023-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10241255","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Emerging Topics in Computing Information for Authors","authors":"","doi":"10.1109/TETC.2023.3300132","DOIUrl":"10.1109/TETC.2023.3300132","url":null,"abstract":"","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 3","pages":"C2-C2"},"PeriodicalIF":5.9,"publicationDate":"2023-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10241262","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135804629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multiplier-Free RNS-Based CNN Accelerator Exploiting Bit-Level Sparsity","authors":"Vasilis Sakellariou;Vassilis Paliouras;Ioannis Kouretas;Hani Saleh;Thanos Stouraitis","doi":"10.1109/TETC.2023.3301590","DOIUrl":"10.1109/TETC.2023.3301590","url":null,"abstract":"In this work, a Residue Numbering System (RNS)-based Convolutional Neural Network (CNN) accelerator utilizing a multiplier-free distributed-arithmetic Processing Element (PE) is proposed. A method for maximizing the utilization of the arithmetic hardware resources is presented. It leads to an increase of the system's throughput, by exploiting bit-level sparsity within the weight vectors. The proposed PE design takes advantage of the properties of RNS and Canonical Signed Digit (CSD) encoding to achieve higher energy efficiency and effective processing rate, without requiring any compression mechanism or introducing any approximation. An extensive design space exploration for various parameters (RNS base, PE micro-architecture, encoding) using analytical models as well as experimental results from CNN benchmarks is conducted and the various trade-offs are analyzed. A complete end-to-end RNS accelerator is developed based on the proposed PE. The introduced accelerator is compared to traditional binary and RNS counterparts as well as to other state-of-the-art systems. Implementation results in a 22-nm process show that the proposed PE can lead to 1.85× and 1.54× more energy-efficient processing compared to binary and conventional RNS, respectively, with a 1.88× maximum increase of effective throughput for the employed benchmarks. 
Compared to a state-of-the-art, all-digital, RNS-based system, the proposed accelerator is 8.87× and 1.11× more energy- and area-efficient, respectively.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"667-683"},"PeriodicalIF":5.9,"publicationDate":"2023-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
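For readers unfamiliar with residue arithmetic, the sketch below shows how a dot product can be accumulated independently in each residue channel and then converted back with the Chinese Remainder Theorem. It is a generic illustration under stated assumptions: the moduli (7, 8, 9) and the function names are hypothetical, and the code uses ordinary multiplication, whereas the PE in the abstract above is multiplier-free.

```python
from math import prod

MODULI = (7, 8, 9)  # assumed example base: pairwise coprime, dynamic range 7*8*9 = 504


def to_rns(x, moduli=MODULI):
    """Encode a non-negative integer as its residues."""
    return tuple(x % m for m in moduli)


def rns_mac(acc, a, b, moduli=MODULI):
    """Multiply-accumulate per channel; channels never exchange carries."""
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Decode residues with the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m) is the modular inverse
    return total % M


def rns_dot(xs, ys, moduli=MODULI):
    """Dot product computed entirely in the residue domain."""
    acc = to_rns(0, moduli)
    for x, y in zip(xs, ys):
        acc = rns_mac(acc, to_rns(x, moduli), to_rns(y, moduli), moduli)
    return from_rns(acc, moduli)
```

The result is only correct while the true dot product stays below the dynamic range (504 here), which is why the choice of RNS base is a design parameter.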
{"title":"Noise-Shaping Binary-to-Stochastic Converters for Reduced-Length Bit-Streams","authors":"Kleanthis Papachatzopoulos;Vassilis Paliouras","doi":"10.1109/TETC.2023.3299516","DOIUrl":"10.1109/TETC.2023.3299516","url":null,"abstract":"Stochastic computations have attracted significant attention for applications with moderate fixed-point accuracy requirements, as they offer minimal complexity. In these systems, a stochastic bit-stream encodes a data sample. The derived bit-stream is used for processing. The bit-stream length determines the computation latency for bit-serial implementations and hardware complexity for bit-parallel ones. Noise shaping is a feedback technique that moves the quantization noise outside the bandwidth of interest of a signal. This article proposes a technique that builds on noise shaping and reduces the length of the stochastic bit-stream required to achieve a specific Signal-to-Quantization-Noise Ratio (SQNR). The technique is realized by digital units that encode binary samples into stochastic streams, hereafter called binary-to-stochastic converters. Furthermore, formulas are derived that relate the bit-stream length reduction to the signal bandwidth. First-order and second-order converters that implement the proposed technique are analyzed. Two architectures are introduced, distinguished by placing a stochastic converter either inside or outside of the noise-shaping loop. The proposed bit-stream length reduction is quantitatively compared to conventional binary-to-stochastic converters for the same signal quality level. Departing from conventional approaches, this article employs bit-stream lengths that are not a power of two, and proposes a modified stochastic-to-binary conversion scheme as a part of the proposed binary-to-stochastic converter. 
In particular, SQNR gains of 29.8 dB and 42.1 dB are achieved for the first-order and second-order converters compared to the conventional converters for equal-length bit-streams and low signal bandwidth. The investigated converters are designed and synthesized in a 28-nm FDSOI technology for a range of bit widths.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 4","pages":"1002-1017"},"PeriodicalIF":5.9,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
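To make the role of bit-stream length concrete, the sketch below contrasts a conventional comparator-style encoder with a generic first-order error-feedback quantizer. It is an illustration only and does not reproduce the converters of the abstract above: the function names are hypothetical, Python's `random` module stands in for a hardware random source, and the error-feedback loop is the textbook first-order form.

```python
import random


def encode_conventional(p, length, seed=0):
    """Emit 1 whenever a pseudo-random number falls below p (0 <= p <= 1)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(length)]


def encode_error_feedback(p, length):
    """First-order error feedback: carry the quantization error into the next bit."""
    bits, acc = [], 0.0
    for _ in range(length):
        acc += p
        if acc >= 1.0:
            bits.append(1)
            acc -= 1.0   # the residual error stays in the accumulator
        else:
            bits.append(0)
    return bits


def decode(bits):
    """Stochastic-to-binary conversion: the value is the fraction of ones."""
    return sum(bits) / len(bits)
```

With the random encoder, the decoded value only approaches `p` as the stream grows, while the error-feedback stream keeps the running count of ones within one bit of the ideal count. That difference is the intuition behind trading feedback logic for shorter streams.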
{"title":"An Edge-Cloud Collaboration Framework for Graph Processing in Smart Society","authors":"Jun Zhou;Masaaki Kondo","doi":"10.1109/TETC.2023.3297066","DOIUrl":"10.1109/TETC.2023.3297066","url":null,"abstract":"Due to the limitations of cloud computing on latency, bandwidth and data confidentiality, edge computing has emerged as a novel location-aware way to provide capacity-constrained portable terminals with more processing capacity, improving computing performance and quality of service (QoS) in several typical domains of human activity in smart society, such as social networks, medical diagnosis, telecommunications, recommendation systems, internal threat detection, transportation, Internet of Things (IoT), etc. These application domains often manage a vast collection of entities with various relationships, which can be naturally represented by the graph data structure. Graph processing is a powerful tool to model and optimize complex problems where graph-based data is involved. Given the relatively limited resources of edge devices, in this article, for the first time to our knowledge, we propose a reliable edge-cloud collaboration framework that supports graph primitives based on a lightweight interactive graph processing library (GPL), with shortest path search (SPS) operations as the demonstrative example. Two different types of practical cases are also presented to show the typical application scenarios of our graph processing strategy. Experimental evaluations indicate that the speedup can reach 6.87× via graph reduction, and less than 3% and 20% extra latency is required for much better user experiences for navigation and pandemic control, respectively, while the online security measures consume only about 1% extra time of the overall data transmission. 
Our framework can efficiently execute the applications while accounting for user-friendliness, low-latency response, interactions among edge devices, collaboration between edge and cloud, and privacy protection at an acceptable overhead.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 4","pages":"985-1001"},"PeriodicalIF":5.9,"publicationDate":"2023-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mónica Sánchez de Francisco;Paloma Díaz;Teresa Onorati;Álvaro Monteron;Ignacio Aedo
{"title":"Designing Mobile Technologies to Encourage Civic Engagement: The Role of Situated Motivational Affordances","authors":"Mónica Sánchez de Francisco;Paloma Díaz;Teresa Onorati;Álvaro Monteron;Ignacio Aedo","doi":"10.1109/TETC.2023.3296772","DOIUrl":"10.1109/TETC.2023.3296772","url":null,"abstract":"Social and ubiquitous computing opens up many opportunities to engage citizens in activities that benefit their communities. Technology is ready and available, but there are still open issues concerning how to engage people in activities that are not extrinsically rewarding or whose impact is not immediately perceived. In this paper, we explore the role that situated motivational affordances can play in encouraging citizens in one such activity, early warning. For this purpose, we designed and implemented a gamified app, IWarn, that was iteratively designed following an action-research process to align the needs and capabilities of two types of stakeholders: emergency managers and citizens. The situated motivational affordances framework was used to lead the evaluation, considering the motivational affordances enabled by the app and the situation in which it was used. The IWarn app was evaluated in an in-the-wild deployment where 4 emergency workers and 17 citizens took part in a real exercise for one week. Our results suggest that the gamified elements helped to improve intrinsic and extrinsic motivation and user engagement. 
This work contributes to the social computing domain by illustrating a use case where carefully designed gamification can help in engaging citizens in participatory processes.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 3","pages":"739-751"},"PeriodicalIF":5.1,"publicationDate":"2023-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10192506","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jim Plusquellic;Eirini Eleni Tsiropoulou;Cyrus Minwalla
{"title":"Privacy-Preserving Authentication Protocols for IoT Devices Using the SiRF PUF","authors":"Jim Plusquellic;Eirini Eleni Tsiropoulou;Cyrus Minwalla","doi":"10.1109/TETC.2023.3296016","DOIUrl":"10.1109/TETC.2023.3296016","url":null,"abstract":"Authentication between IoT devices is important for maintaining security, trust and data integrity in an edge device ecosystem. The low-power, reduced computing capacity of the IoT device makes public-private, certificate-based forms of authentication impractical, while other lighter-weight, symmetric cryptography-based approaches, such as message authentication codes, are easy to spoof in unsupervised environments where adversaries have direct physical access to the device. Such environments are better served by security primitives rooted in the hardware with capabilities exceeding those available in cryptography-only frameworks. A key foundational hardware security primitive is the physical unclonable function or PUF. PUFs are well known for removing the need to store secrets in secure non-volatile memories, and for providing very large sets of authentication credentials. In this article, we describe two PUF-based mutual authentication protocols rooted in the entropy provided by a strong PUF. The security properties of the authentication protocols, called COBRA and PARCE, are evaluated in hardware experiments on SoC-based FPGAs, and under extended industrial-standard operating conditions. 
A codesign-based system architecture is presented in which the SiRF PUF and core authentication functions are implemented in the programmable logic as a secure enclave, while network and database operations are implemented in software on an embedded microprocessor.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"11 4","pages":"918-933"},"PeriodicalIF":5.9,"publicationDate":"2023-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62529012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Error in Ulps of the Multiplication or Division by a Correctly-Rounded Function or Constant in Binary Floating-Point Arithmetic","authors":"Nicolas Brisebarre;Jean-Michel Muller;Joris Picot","doi":"10.1109/TETC.2023.3294986","DOIUrl":"10.1109/TETC.2023.3294986","url":null,"abstract":"Assume we use a binary floating-point arithmetic and that $\\operatorname{RN}$ is the round-to-nearest function. Also assume that $c$ is a constant or a real function of one or more variables, and that we have at our disposal a correctly rounded implementation of $c$, say $\\hat{c} = \\operatorname{RN}(c)$. For evaluating $x \\cdot c$ (resp. $x / c$ or $c / x$), the natural way is to replace it by $\\operatorname{RN}(x \\cdot \\hat{c})$ (resp. $\\operatorname{RN}(x / \\hat{c})$ or $\\operatorname{RN}(\\hat{c} / x)$), that is, to call function $\\hat{c}$ and to perform a floating-point multiplication or division. 
This can be generalized to the approximation of $n/d$ by $\\operatorname{RN}(\\hat{n}/\\hat{d})$ and the approximation of $n \\cdot d$ by $\\operatorname{RN}(\\hat{n} \\cdot \\hat{d})$, where $\\hat{n} = \\operatorname{RN}(n)$ and $\\hat{d} = \\operatorname{RN}(d)$, and $n$ and $d$ are functions for which we have at our disposal a correctly rounded implementation. We discuss tight error bounds in ulps of such approximations. From our results, one immediately obtains tight error bounds for calculations such as $\\mathtt{x * pi}$, $\\mathtt{ln(2)/x}$, $\\mathtt{x/(y+z)}$, $\\mathtt{(x+y)*z}$, $\\mathtt{x/sqrt(y)}$, $\\mathtt{sqrt(x)/y}$, $\\mathtt{(x+y)(z+t)}$, $\\mathtt{(x+y)/(z+t)}$, $\\mathtt{(x+y)/(zt)}$, etc. 
in floating-point arithmetic.","PeriodicalId":13156,"journal":{"name":"IEEE Transactions on Emerging Topics in Computing","volume":"12 2","pages":"656-666"},"PeriodicalIF":5.9,"publicationDate":"2023-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"62528977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
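The quantity studied in the abstract above can be measured empirically. The sketch below is an illustration, not the paper's analysis: it evaluates `x * pi` as RN(x · RN(π)) in binary64 and reports the error in ulps against a 50-digit rational stand-in for π. The helper names are hypothetical, the error is expressed in ulps of the computed result (an assumption about the measurement unit), and `math.ulp` requires Python 3.9 or later.

```python
import math
import random
from fractions import Fraction

# 50-digit stand-in for pi; its truncation error is far below one binary64 ulp.
PI_50 = Fraction("3.14159265358979323846264338327950288419716939937510")


def error_in_ulps(x: float) -> float:
    """Error of RN(x * RN(pi)) relative to x * pi, in ulps of the computed result."""
    computed = x * math.pi                 # math.pi is RN(pi) in binary64
    exact = Fraction(x) * PI_50            # exact rational arithmetic
    return float(abs(Fraction(computed) - exact) / Fraction(math.ulp(computed)))


def max_error_in_ulps(samples: int = 10_000, seed: int = 0) -> float:
    """Largest observed error over random inputs drawn from [1, 2]."""
    rng = random.Random(seed)
    return max(error_in_ulps(rng.uniform(1.0, 2.0)) for _ in range(samples))
```

Random sampling only gives a lower estimate of the worst case. The tight bounds mentioned in the abstract come from analysis, not from a search like this one.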