2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC): Latest Articles

Mini-Batch Training along Convolution Windows for Representation Learning Based on Spike-Time-Dependent-Plasticity Rule
Yohei Shimmyo, Y. Okuyama
{"title":"Mini-Batch Training along Convolution Windows for Representation Learning Based on Spike-Time-Dependent-Plasticity Rule","authors":"Yohei Shimmyo, Y. Okuyama","doi":"10.1109/MCSoC51149.2021.00052","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00052","url":null,"abstract":"This paper presents a mini-batch training methodology along convolutional windows for layer-wised STDP unsupervised training on convolutional layers in order to shorten the training time of spiking neural networks (SNNs). SNN is a third-generation neural network that uses an accurate neuron model compared to rate-coded models used in conventional artificial neural networks (ANNs). The mini-batches of input convolution windows are convoluted at once. Then, the input, output, and current filter generate a batch of weight updates at once. This system reduces overheads of library calls or GPU execution. The batch processing methodology leads more significant and extensive models to be trained in ANNs, while many evaluations of direct SNN training methodologies are limited to smaller models. Currently, training large-scale models is virtually impossible. We evaluated the mini-batch processing effect on training speed and feature extraction power against various mini-batch sizes. The result showed that a larger mini-batch size enables us to utilize GPUs effectively, maintaining comparable feature extraction power. 
This research concludes that mini-batch training along convolution windows reduces training time by STDP training rule.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126180897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
2QoSM: A Q-Learner QoS Manager for Application-Guided Power-Aware Systems
Michael J. Giardino, D. Schwyn, Bonnie H. Ferri, A. Ferri
{"title":"2QoSM: A Q-Learner QoS Manager for Application-Guided Power-Aware Systems","authors":"Michael J. Giardino, D. Schwyn, Bonnie H. Ferri, A. Ferri","doi":"10.1109/MCSoC51149.2021.00040","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00040","url":null,"abstract":"This paper describes the design and performance of Q-learning-based quality-of-service manager (2QoSM) for compute-aware applications (CAAs) as part of platform-agnostic resource management framework. CAAs and hardware are able to share metrics of performance with the 2QoSM and the 2QoSM can attempt to reconfigure CAAs and hardware to meet performance targets. This enables many co-design benefits while allowing for policy and platform portability. The use of Q-Learning allows online generation of the power management policy without requiring details about system state or actions, and can meet different goals including error, power minimization, or a combination of both. 2QoSM, evaluated using an embedded MCSoC controlling a mobile robot, reduces power compared to the Linux on-demand governor by 38.7-42.6% and a situation-aware governor by 4.0-10.2%. An error-minimization policy obtained a reduction in path-following error of 4.6-8.9%.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123587724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
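The 2QoSM abstract above says the power-management policy is generated online with Q-learning. As a rough illustration only, the standard tabular Q-learning update can be sketched as follows. The table layout, the function name `q_update`, and the values of `alpha` and `gamma` are assumptions made for this sketch; the abstract does not specify the paper's state, action, or reward definitions.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated Q-value.

    q is a list of lists indexed as q[state][action] (hypothetical layout).
    alpha is the learning rate, gamma the discount factor (assumed values).
    """
    # Value of the best action available from the next state.
    best_next = max(q[next_state])
    # Move the current estimate toward the bootstrapped target.
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Usage: two states, two actions (e.g. two hypothetical power settings).
q_table = [[0.0, 0.0], [1.0, 2.0]]
updated = q_update(q_table, state=0, action=1, reward=1.0, next_state=1)
```

In a manager like the one described, the reward would presumably combine power and error terms, but that mapping is not given in the abstract and is left out here.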
Scheduling DAGs of Multi-Version Multi-Phase Tasks on Heterogeneous Real-Time Systems
Julius Roeder, Benjamin Rouxel, C. Grelck
{"title":"Scheduling DAGs of Multi-Version Multi-Phase Tasks on Heterogeneous Real-Time Systems","authors":"Julius Roeder, Benjamin Rouxel, C. Grelck","doi":"10.1109/MCSoC51149.2021.00016","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00016","url":null,"abstract":"Heterogeneous high performance embedded systems are increasingly used in industry. Nowadays, these platforms embed accelerator-style components, such as GPUs, alongside different CPU cores. We use multiple alternatives/versions/implementations of tasks to fully benefit from the heterogeneous capacities of such platforms and due to binary incompatibility. Implementations targeting accelerators not only require access to the accelerator but also to a CPU core for, e.g., pre-processing and branching the control flow. Hence, accelerator workloads can naturally be divided into multiple phases (e.g. CPU, GPU, CPU). We propose an asynchronous scheduling approach that utilises multiple phases and thereby enables a fine-grained scheduling of tasks that require two types of hardware. We show that our approach can increase the schedulability rate by up to 24% over two multi-version phase-unaware schedulers. Additionally, we demonstrate that the schedulability rate of our heuristic is close to the optimal schedulability rate.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122806795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Efficient Resource Shared RISC-V Multicore Processor
Md. Ashraful Islam, Kenji Kise
{"title":"Efficient Resource Shared RISC-V Multicore Processor","authors":"Md. Ashraful Islam, Kenji Kise","doi":"10.1109/MCSoC51149.2021.00061","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00061","url":null,"abstract":"Edge computing pushes the computational loads from the cloud to embedded devices, where data would be processed near the data source. Heterogeneous multicore architecture is believed to be a promising solution to fulfill the edge computational requirement. In FPGAs, the heterogeneous multicore is realized as multiple soft processor cores with custom processing elements. Since FPGA is a resource-constrained device, sharing the hardware resources among the soft processor cores can be advantageous. Some research has focused on the sharing resources among soft processors, but they do not study how much FPGA logic is minimized for a five-stage pipeline processor. This paper proposes the microarchitecture of a five-stage pipeline scalar processor that enables the sharing of functional units for execution among the multiple cores. We then investigate the performance and hardware resource utilization for a four-core processor. We find that sharing different functional units can save the LUT usage to 23.5% and DSP usage to 75%. We analyze the performance impact of sharing from the Embench benchmark program by simulating the same program in all four cores. 
Our simulation results indicate that based on the sharing configuration, the average performance drops from 2.9% to 22.3%.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122129971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Task Scheduling Strategies for Batched Basic Linear Algebra Subprograms on Many-core CPUs
Daichi Mukunoki, Yusuke Hirota, Toshiyuki Imamura
{"title":"Task Scheduling Strategies for Batched Basic Linear Algebra Subprograms on Many-core CPUs","authors":"Daichi Mukunoki, Yusuke Hirota, Toshiyuki Imamura","doi":"10.1109/MCSoC51149.2021.00042","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00042","url":null,"abstract":"Batched Basic Linear Algebra Subprograms (BLAS) provides an interface that allows multiple problems for a given BLAS routine (operation) - with different parameters and sizes independent of each other - to be computed in a single routine. The efficient use of cores on many-core processors has been introduced for computing multiple minor problems for which sufficient parallelism cannot be extracted from a single problem. The major goal of this study is to automatically generate high-performance batched routines for all BLAS routines using nonbatched BLAS implementation and OpenMP on CPUs. Furthermore, the primary challenge is the task scheduling method for allocating batches to cores. In this study, we propose a scheduling method based on a greedy algorithm, which allocates batches based on their costs in advance to eliminate load imbalance when the costs of batches vary. Then, we investigate the performance of five scheduling methods, including ones implemented in OpenMP and our proposed method, on matrix multiplication (GEMM) and matrix-vector multiplication (GEMV) under several conditions and environments. As a result, we found that the optimal scheduling strategy differs depending on the problem setting and environment. Based on this result, we propose an automatic generation scheme of batched BLAS from nonbatched BLAS that can introduce arbitrary task scheduling. 
This scheme facilitates the development of batched routines for a full set of BLAS routines and special BLAS implementations such as high-precision versions.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121392524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
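The batched BLAS abstract above describes a greedy scheduler that allocates batches to cores in advance, based on their costs, to reduce load imbalance. One common greedy heuristic of this kind is longest-processing-time-first assignment, sketched below. This is an illustration under assumptions, not the paper's algorithm: the function name `greedy_assign`, the cost values, and the choice of this particular heuristic are all hypothetical.

```python
import heapq


def greedy_assign(costs, num_cores):
    """Assign batch indices to cores, most expensive batch first.

    Each batch goes to the core with the smallest accumulated cost so far.
    Returns (assignment, loads): the batch indices per core and the total
    estimated cost per core.
    """
    # Min-heap of (accumulated cost, core id); the least-loaded core pops first.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_cores)]

    # Visit batches in order of decreasing estimated cost.
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    for idx in order:
        load, core = heapq.heappop(heap)
        assignment[core].append(idx)
        heapq.heappush(heap, (load + costs[idx], core))

    loads = [sum(costs[i] for i in batch) for batch in assignment]
    return assignment, loads


# Usage: five batches with made-up cost estimates, scheduled on two cores.
assignment, loads = greedy_assign([8, 7, 6, 5, 4], num_cores=2)
```

In practice the cost of a GEMM or GEMV batch would have to be estimated from its dimensions; how the paper does this is not stated in the abstract.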
Detection of Cache Side Channel Attacks Using Thread Level Monitoring of Hardware Performance Counters
Pavitra Prakash Bhade, Sharad Sinha
{"title":"Detection of Cache Side Channel Attacks Using Thread Level Monitoring of Hardware Performance Counters","authors":"Pavitra Prakash Bhade, Sharad Sinha","doi":"10.1109/MCSoC51149.2021.00039","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00039","url":null,"abstract":"Modern multiprocessor systems adopt optimization techniques to boost the speed of execution. These optimizations create vulnerabilities that can be exploited by attackers, thus causing security breaches. The hierarchical structure of cache memory where the Last Level Cache is a super set of previous levels and is shared between multiple cores of the processors creates an attack vector for cache side-channel attacks (SCA). In such attacks, the attacker is able to trace the pattern of victim process execution and correspondingly retrieve secret information by monitoring the shared cache. Mitigation techniques against such attacks trade off security against overall system performance. Hence, mitigation only when an attack is detected is needed. We propose an architecture-agnostic approach that uses hardware performance counters at run time and at thread level instead of current state of the art which use counters at system level to detect cache SCA. The proposed approach reduces the false positives by 48% when compared with system level approaches. 
Thus, the trade off with performance is also reduced and hence, the proposed approach is especially significant for embedded systems where processor cycle time is a limited resource.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133325450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
A Framework and Its User Interface to Learn Machine Learning Models
Atsushi Takamiya, Md. Mostafizer Rahman, Y. Watanobe
{"title":"A Framework and Its User Interface to Learn Machine Learning Models","authors":"Atsushi Takamiya, Md. Mostafizer Rahman, Y. Watanobe","doi":"10.1109/MCSoC51149.2021.00059","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00059","url":null,"abstract":"In order to develop a system related to machine learning (ML), it is necessary to understand various contents such as prerequisite knowledge, implementation procedures, verification methods, and improvement methods. However, although general learning sites on the Web provide extensive learning contents such as videos and textbooks, they are insufficient for acquiring practical skills. In this paper, we propose a framework for learning ML and its user interface. The framework manages the ML learning phases, which includes learning the theory and practical knowledge, implementation, validation, improvement, and completion. In the model validation phase, checks are automatically applied according to the target ML model. Similarly, in the model improvement phase, improvement methods are automatically applied according to the target ML model. As a case study, we have developed contents on linear regression, classification, clustering, and dimensionality reduction.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124658795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Trends and Challenges in Ensuring Security for Low-Power and High-Performance Embedded SoCs*
Parisa Rahimi, Ashutosh Kumar Singh, Xiaohang Wang, Alok Prakash
{"title":"Trends and Challenges in Ensuring Security for Low-Power and High-Performance Embedded SoCs*","authors":"Parisa Rahimi, Ashutosh Kumar Singh, Xiaohang Wang, Alok Prakash","doi":"10.1109/MCSoC51149.2021.00041","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00041","url":null,"abstract":"In recent years, security, power consumption, and performance have become the important issues in embedded SoCs’ design. With the growing number of embedded devices for automotive electronic and electric vehicles, real-time systems, robotics, artificial intelligence, smart technologies, or telecommunication, it is highly likely that these systems will be exposed to attacks or threats. Therefore, it is not easy to implement the security measure of such devices, and it becomes challenging while considering the performance and power issues due to limited available computing resources and often operating on batteries. In this paper, we survey the weaknesses of the embedded SoCs and examine the attacks, power consumption, and performance more closely with the main focus on Physical and Side-Channel attacks, which have not been surveyed previously. Along with the current trends and challenges, upcoming trends and challenges are also elaborated. 
This paper is intended to help the researchers and system designers in gaining deep insight into designing secure, power-efficient, and high-performance embedded SoCs in the future.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"907 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123267368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Light-weight Enhanced Semantics-Guided Neural Networks for Skeleton-Based Human Action Recognition
Hongbo Chen, Lei Jing
{"title":"Light-weight Enhanced Semantics-Guided Neural Networks for Skeleton-Based Human Action Recognition","authors":"Hongbo Chen, Lei Jing","doi":"10.1109/MCSoC51149.2021.00036","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00036","url":null,"abstract":"In the skeleton-based human action recognition domain, the methods based on graph convolutional networks have had great success recently. However, most graph neural networks rely on large parameters, which is not easy to train and take up a large computational cost. In the above, a simple yet effective semantics-guided neural network (SGN) obtains with a few parameters and has achieved good results. However, the simple use of semantics is limited to the improvement of recognition rate. Moreover, using only one fixed temporal convolution kernel, which is not enough to extract the temporal details comprehensively. To this end, we propose an enhanced semantics-guided neural network (ESGN) in this paper. Some simple but effective strategies are applied to ESGN, such as semantic expansion, graph pooling methods, and regularization loss function, which do not significantly increase the parameter size but improve the accuracy on two large datasets than SGN. 
The proposed method with an order of magnitude smaller size than most previous papers is evaluated on the NTU60 and NTU120, the experimental results show that our method achieves the state-of-the-art performance.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124975091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-objective Reinforcement Learning for Energy Harvesting Wireless Sensor Nodes
Shaswot Shresthamali, Masaaki Kondo, Hiroshi Nakamura
{"title":"Multi-objective Reinforcement Learning for Energy Harvesting Wireless Sensor Nodes","authors":"Shaswot Shresthamali, Masaaki Kondo, Hiroshi Nakamura","doi":"10.1109/MCSoC51149.2021.00022","DOIUrl":"https://doi.org/10.1109/MCSoC51149.2021.00022","url":null,"abstract":"Modern Energy Harvesting Wireless Sensor Nodes (EHWSNs) need to intelligently allocate their limited and unreliable energy budget among multiple tasks to ensure long-term uninterrupted operation. Traditional solutions are ill-equipped to deal with multiple objectives and execute a posteriori tradeoffs. We propose a general Multi-objective Reinforcement Learning (MORL) framework for Energy Neutral Operation (ENO) of EHWSNs. Our proposed framework consists of a novel Multi-objective Markov Decision Process (MOMDP) formulation and two novel MORL algorithms. Using our framework, EHWSNs can learn policies to maximize multiple task-objectives and perform dynamic runtime tradeoffs. The high computation and learning costs, usually associated with powerful MORL algorithms, can be avoided by using our comparatively less resource-intensive MORL algorithms. We evaluate our framework on a general single-task and dual-task EHWSN system model through simulations and show that our MORL algorithms can successfully tradeoff between multiple objectives at runtime.","PeriodicalId":166811,"journal":{"name":"2021 IEEE 14th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC)","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122410280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1