{"title":"Robustness of predictive energy harvesting systems: Analysis and adaptive prediction scaling","authors":"Naomi Stricker, Reto Da Forno, Lothar Thiele","doi":"10.1049/cdt2.12042","DOIUrl":"10.1049/cdt2.12042","url":null,"abstract":"<p>Internet of Things (IoT) systems can rely on energy harvesting to extend battery lifetimes or even render batteries obsolete. Such systems employ an energy scheduler to optimise their behaviour and thus performance by adapting the system's operation. Predictive models of harvesting sources, which are inherently non-deterministic and consequently challenging to predict, are often necessary for the scheduler to optimise performance. Because the scheduler relies on these inherently inaccurate predictions, the predictive model's accuracy inevitably affects scheduler and system performance. This fact has largely been overlooked in the vast amount of available results on energy schedulers and predictors for harvesting-based systems. The authors systematically describe the effect prediction errors have on the scheduler and thus system performance by defining a novel robustness metric. To alleviate the severe impact prediction errors can have on the system performance, the authors propose an adaptive prediction scaling method that learns from the local environment and system behaviour. The authors demonstrate the concept of robustness with datasets from both outdoor and indoor scenarios. In addition, the authors highlight the improvement and overhead of the proposed adaptive prediction scaling method for both scenarios. 
It improves a non-robust system's performance by up to 13.8 times in a real-world setting.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 4","pages":"106-124"},"PeriodicalIF":1.2,"publicationDate":"2022-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12042","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83346208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synchronization in graph analysis algorithms on the Partially Ordered Event-Triggered Systems many-core architecture","authors":"Ashur Rafiev, Alex Yakovlev, Ghaith Tarawneh, Matthew F. Naylor, Simon W. Moore, David B. Thomas, Graeme M. Bragg, Mark L. Vousden, Andrew D. Brown","doi":"10.1049/cdt2.12041","DOIUrl":"10.1049/cdt2.12041","url":null,"abstract":"<p>One of the key problems in designing and implementing graph analysis algorithms for distributed platforms is to find an optimal way of managing communication flows in the massively parallel processing network. Message-passing and global synchronization are powerful abstractions in this regard, especially when used in combination. This paper studies the use of a hardware-implemented refutable global barrier as a design optimization technique aimed at unifying these abstractions at the API level. The paper explores the trade-offs between the related overheads and performance factors on a message-passing prototype machine with 49,152 RISC-V threads distributed over 48 FPGAs (called the Partially Ordered Event-Triggered Systems platform). Our experiments show that some graph applications favour synchronized communication, but the effect is hard to predict in general because of the interplay between multiple hardware and software factors. A classifier model is therefore proposed and implemented to perform such a prediction based on the application graph topology parameters: graph diameter, degree of connectivity, and reconvergence metric. 
The presented experimental results demonstrate that the correct choice of communication mode, granted by the new model-driven approach, helps to achieve 3.22 times faster computation time on average compared to the baseline platform operation.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 2-3","pages":"71-88"},"PeriodicalIF":1.2,"publicationDate":"2022-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12041","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85073025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid multi-level hardware Trojan detection platform for gate-level netlists based on XGBoost","authors":"Ying Zhang, Sen Li, Xin Chen, Jiaqi Yao, Zhiming Mao, Jizhong Yang, Yifeng Hua","doi":"10.1049/cdt2.12040","DOIUrl":"10.1049/cdt2.12040","url":null,"abstract":"<p>To cope with the problem of malicious third-party vendors implanting hardware Trojans (HTs) at the circuit design stage, this paper proposes a hybrid-mode gate-level hardware Trojan detection platform based on the XGBoost algorithm. The platform is composed of multi-level HT localization and circuit-structure-based HT detection. In multi-level HT localization, each wire of the circuit is regarded as a node, and the static characteristics of the nodes are analysed and combined with dynamic detection to locate HTs. In modular HT structure detection, the network structure features of the circuit are extracted to identify HTs accurately and rapidly. The hybrid-mode HT detection platform can efficiently meet various detection requirements, such as HT localization or rapid and accurate HT detection. The experimental results on the Trust-Hub benchmark show that the multi-level localization can achieve 94.0% location accuracy, and the modular HT structure detection accuracy can achieve 100%. The modular HT structure detection is about four times as fast as the multi-level HT localization on feature extraction. 
Therefore, multi-level localization and modular HT structure detection can be applied separately or together to specific HT detection problems, which shows that the proposed hybrid-mode gate-level HT detection scheme is practical and effective.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 2-3","pages":"54-70"},"PeriodicalIF":1.2,"publicationDate":"2022-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12040","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87993598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced overloaded code division multiple access for network on chip","authors":"Behnam Vakili, Morteza Gholipour","doi":"10.1049/cdt2.12039","DOIUrl":"10.1049/cdt2.12039","url":null,"abstract":"<p>The code-division multiple access (CDMA) method is commonly used as the network infrastructure in multi-core chips. One of its advantages is the simultaneous connection of all network components. Another advantage is its constant delay. On the other hand, one drawback is that the number of transmitters is limited to the number of encoding bits. In this study, the authors used the combination of Walsh codes and their inverses, as well as the simultaneous application of the time-division multiple access (TDMA) method, to increase the transmission capacity of this protocol to more than four times that of the standard mode. In the proposed design, although the circuit area does not increase significantly, a fourfold increase in the throughput of the CDMA network is seen. Using the method proposed in this study, it will be possible to increase the capacity further.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 2-3","pages":"45-53"},"PeriodicalIF":1.2,"publicationDate":"2021-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12039","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81311322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Online multi-object tracking based on time and frequency domain features","authors":"Mahbubeh Nazarloo, Meisam Yadollahzadeh-Tabari, Homayun Motameni","doi":"10.1049/cdt2.12037","DOIUrl":"10.1049/cdt2.12037","url":null,"abstract":"<p>Multi-object tracking (MOT) is an interesting field in computer vision research, with applications in video motion analysis, smart interfaces, and visual surveillance. It is challenging because the number of objects varies and the objects interact with one another. In this work, a new method for online MOT based on time and frequency domain features is presented. The features are obtained from the wavelet transform and fractal dimension. The modified cuckoo optimization algorithm is used for feature selection because it converges quickly and can find global optima. The features are fed to learning vector quantization, a supervised artificial neural network (ANN), which is used to classify the dataset. To evaluate the performance of the presented technique, simulations are performed using the ETH Mobile Platform and VS-PETS 2009 datasets. The simulation results show the superiority of the presented technique for MOT compared to earlier studies in terms of accuracy. 
The mostly tracked values for the two datasets are 74.3% and 97.2%, which are at least 4.2% and 2.5% better than those of the other methods, respectively.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 1","pages":"19-28"},"PeriodicalIF":1.2,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12037","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76226451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse convolutional neural network acceleration with lossless input feature map compression for resource-constrained systems","authors":"Jisu Kwon, Joonho Kong, Arslan Munir","doi":"10.1049/cdt2.12038","DOIUrl":"10.1049/cdt2.12038","url":null,"abstract":"<p>Many recent research efforts have exploited data sparsity for the acceleration of convolutional neural network (CNN) inferences. However, the effects of data transfer between main memory and the CNN accelerator have been largely overlooked. In this work, the authors propose a CNN acceleration technique that leverages hardware/software co-design and exploits the sparsity in input feature maps (IFMs). On the software side, the authors' technique employs a novel lossless compression scheme for IFMs, which are sent to the hardware accelerator via direct memory access. On the hardware side, the authors' technique uses a CNN inference accelerator that performs convolutional layer operations with their compressed data format. With several design optimization techniques, the authors have implemented their technique in a field-programmable gate array (FPGA) system-on-chip platform and evaluated their technique for six different convolutional layers in SqueezeNet. Results reveal that the authors' technique improves the performance by 1.1×–22.6× while reducing energy consumption by 47.7%–97.4% as compared to the CPU-based execution. Furthermore, results indicate that the IFM size and transfer latency are reduced by 34.0%–85.2% and 4.4%–75.7%, respectively, compared to the case without data compression. 
In addition, the authors' hardware accelerator shows better performance per hardware resource with less than or comparable power consumption to the state-of-the-art FPGA-based designs.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 1","pages":"29-43"},"PeriodicalIF":1.2,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12038","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89983943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An embedded intelligence engine for driver drowsiness detection","authors":"Shirisha Vadlamudi, Ali Ahmadinia","doi":"10.1049/cdt2.12036","DOIUrl":"10.1049/cdt2.12036","url":null,"abstract":"<p>Motor vehicle crashes involving drowsy driving are numerous worldwide. Many studies have revealed that 10%–30% of crashes are due to drowsy driving. Fatigue has costly effects on safety, health, and quality of life. Driver drowsiness can be detected using various methods, for example, algorithms based on behavioural gestures, physiological signals and vitals. A few methods are vehicle based: they detect drowsiness from steering wheel movement and lane change patterns, deriving a pattern from slow drifting and fast corrective steering movements. A prototype is developed that detects the drowsiness of an automobile driver using artificial intelligence techniques, specifically open-source tools such as TensorFlow Lite on a Raspberry Pi development board. The TensorFlow model is trained on images captured from video with the help of object detection using a cascade classifier. To improve accuracy, an Inception v3 architecture is used to pre-train the model on the image dataset. The final model is created and trained using long short-term memory. It is then converted to a TensorFlow Lite model, which runs on the Raspberry Pi board to detect the drowsiness of drivers. 
The results are comparable with desktop-based results in the literature.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 1","pages":"10-18"},"PeriodicalIF":1.2,"publicationDate":"2021-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12036","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78213695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Who is wearing me? TinyDL-based user recognition in constrained personal devices","authors":"Ramon Sanchez-Iborra, Antonio Skarmeta","doi":"10.1049/cdt2.12035","DOIUrl":"10.1049/cdt2.12035","url":null,"abstract":"<p>Deep learning (DL) techniques have been extensively studied to improve their precision and scalability in a vast range of applications. Recently, a new milestone has been reached driven by the emergence of the TinyDL paradigm, which enables adaptation of complex DL models generated by well-known libraries to the restrictions of constrained microcontroller-based devices. In this work, a comprehensive discussion is provided regarding this novel ecosystem, by identifying the benefits that it will bring to the wearable industry and analysing different TinyDL initiatives promoted by tech giants. The specific use case of automatic user recognition from data captured by a wearable device is also presented. The whole development process by which different DL configurations have been embedded in a real microcontroller unit is described. The attained results in terms of accuracy and resource usage confirm the validity of the proposal, which allows precise predictions in a highly constrained platform with limited input information. 
Therefore, this work provides insights into the viability of the integration of TinyDL models within wearables, which may be valuable for researchers, practitioners, and makers related to this industry.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 1","pages":"1-9"},"PeriodicalIF":1.2,"publicationDate":"2021-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12035","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74789266","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating the SM3 hash algorithm with CPU-FPGA Co-Designed architecture","authors":"Xiaoying Huang, Zhichuan Guo, Mangu Song, Xuewen Zeng","doi":"10.1049/cdt2.12034","DOIUrl":"10.1049/cdt2.12034","url":null,"abstract":"<p>The SM3 hash algorithm, developed by the Chinese Government, is used in various fields of information security and is widely used in commercial security products. However, the performance of software implementations is not sufficient for high-speed applications. This study proposes a CPU-FPGA co-designed architecture which offloads the SM3 function to a field-programmable gate array so that high throughput can be achieved. The architecture can execute the SM3 hash algorithm with 16 concurrent streams or more, which means that multiple data streams can be processed in parallel. This design is implemented on the Xilinx XCKU115-flva1517-2-e device and a Dell commercial server, and the throughput of this design can reach up to 35.5 Gbps when 16 individual SM3 modules run in parallel. The proposed architecture achieves excellent performance in the CPU-FPGA-coupled environment.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"15 6","pages":"427-436"},"PeriodicalIF":1.2,"publicationDate":"2021-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12034","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80352999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EmRep: Energy management relying on state-of-charge extrema prediction","authors":"Lars Hanschke, Christian Renner","doi":"10.1049/cdt2.12033","DOIUrl":"10.1049/cdt2.12033","url":null,"abstract":"<p>The persistent rise of Energy Harvesting Wireless Sensor Networks entails increasing demands on the efficiency and configurability of energy management. New applications often profit from or even require user-defined time-varying utilities; for example, the health assessment of bridges is only possible at rush hour. However, monitoring times do not necessarily overlap with energy harvest periods. This misalignment is often corrected by over-provisioning the energy storage. Favourable small-footprint and cheap energy storage, however, fills up quickly and wastes surplus energy. Hence, EmRep is presented, which decouples the energy management of high-intake from low-intake harvest periods. Based on the State-of-Charge extrema prediction, the authors enhance energy management and reduce saturation of energy storage by design. Considering multiple user-defined utility profiles, the benefits of EmRep in combination with a variety of prediction algorithms, time resolutions, and energy storage sizes are showcased. 
EmRep is tailored to platforms with small energy storage, in which it is found that it doubles effective utility, and also increases performance by 10% with large-sized storage.</p>","PeriodicalId":50383,"journal":{"name":"IET Computers and Digital Techniques","volume":"16 4","pages":"91-105"},"PeriodicalIF":1.2,"publicationDate":"2021-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cdt2.12033","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85793765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}